# NemoMix-12B-DellaV1a
NemoMix-12B-DellaV1a is an experimental merge of the following models using the DELLA method with mergekit:
- BeaverAI/mistral-doryV2-12b
- NeverSleep/Lumimaid-v0.2-12B
- intervitens/mini-magnum-12b-v1.1
- grimjim/mistralai-Mistral-Nemo-Instruct-2407
EDIT: There seem to be tokenizer issues. I'm guessing I would have to merge onto the base model instead of the Instruct model. Don't bother with this one.
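If you want to experiment anyway, one possible (untested) workaround sketch is to pair the merged weights with the tokenizer from the upstream Instruct model rather than the one bundled with this merge. The local path and repo ID below are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Untested workaround sketch: keep the merged weights but take the tokenizer
# from the upstream Instruct model instead of the one shipped with this merge.
# The local path below is an assumption; point it at wherever the merge lives.
model = AutoModelForCausalLM.from_pretrained(
    "./NemoMix-12B-DellaV1a",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
```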
## 🧩 Configuration
```yaml
models:
  - model: BeaverAI/mistral-doryV2-12b
    parameters:
      weight: 0.20
      density: 0.42
  - model: NeverSleep/Lumimaid-v0.2-12B
    parameters:
      weight: 0.22
      density: 0.54
  - model: intervitens/mini-magnum-12b-v1.1
    parameters:
      weight: 0.24
      density: 0.66
  - model: grimjim/mistralai-Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.34
      density: 0.78
merge_method: della
base_model: grimjim/mistralai-Mistral-Nemo-Instruct-2407
parameters:
  int8_mask: true
  epsilon: 0.1
  lambda: 1.0
  density: 0.7
dtype: bfloat16
```
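For reference, a configuration like the one above is normally saved to a YAML file and run through mergekit, either with the `mergekit-yaml` CLI (e.g. `mergekit-yaml della.yaml ./NemoMix-12B-DellaV1a`) or via its Python API. The sketch below assumes the config is stored in `della.yaml` and that mergekit is installed; the imports and options follow mergekit's documented Python example and may differ between versions.

```python
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the DELLA config shown above (the file name "della.yaml" is an assumption).
with open("della.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Write the merged model to ./NemoMix-12B-DellaV1a.
run_merge(
    merge_config,
    out_path="./NemoMix-12B-DellaV1a",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU for the merge if one is present
        copy_tokenizer=True,             # copy the tokenizer from the base model
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```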