---
library_name: transformers
tags:
- merge
- mergekit
license: cc-by-4.0
language:
- da
- sv
- 'no'
- nb
- nn
base_model:
- mistralai/Mistral-7B-v0.1
- danish-foundation-models/munin-7b-alpha
- norallm/normistral-7b-warm
- timpal0l/Mistral-7B-v0.1-flashback-v2
---

# ScandiMerge

This is a DARE-TIES merge of the following models, all based on `mistralai/Mistral-7B-v0.1`:

1. `danish-foundation-models/munin-7b-alpha`, continually pretrained on Danish data;
2. `norallm/normistral-7b-warm`, continually pretrained on Norwegian data;
3. `timpal0l/Mistral-7B-v0.1-flashback-v2`, continually pretrained on Swedish data.
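
DARE-TIES operates on task vectors: the difference between each fine-tuned model's weights and the base model's weights. DARE randomly drops a fraction `1 - density` of each task vector's entries and rescales the survivors by `1 / density`, and TIES then elects a per-parameter sign by majority vote, keeping only the entries that agree with it before adding the averaged result back onto the base weights. The sketch below is purely illustrative (it is not mergekit's implementation) and shows the procedure for a single weight tensor:

```python
import torch


def dare(delta: torch.Tensor, density: float) -> torch.Tensor:
    """Randomly drop a fraction (1 - density) of the task vector's entries
    and rescale the survivors by 1 / density, preserving its expectation."""
    mask = torch.rand_like(delta) < density
    return delta * mask / density


def dare_ties_merge(
    base: torch.Tensor,
    finetuned: list[torch.Tensor],
    weights: list[float],
    density: float,
) -> torch.Tensor:
    """Merge one tensor from several fine-tuned models back into `base`."""
    # Task vectors: what each fine-tune changed relative to the base,
    # sparsified with DARE and scaled by its per-model weight.
    deltas = torch.stack(
        [w * dare(ft - base, density) for ft, w in zip(finetuned, weights)]
    )
    # TIES sign election: keep only the entries whose sign agrees with
    # the summed (majority) direction across models.
    elected_sign = deltas.sum(dim=0).sign()
    agreement = deltas.sign() == elected_sign
    kept = deltas * agreement
    # Average over the models that actually contributed to each entry,
    # mirroring `normalize=True` in the merge configuration below.
    contributors = agreement.sum(dim=0).clamp(min=1)
    return base + kept.sum(dim=0) / contributors
```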

## Model Details

- **Merged by:** Dan Saattrup Nielsen
- **Model type:** Decoder model, based on Mistral-7B-v0.1
- **Language(s) (NLP):** Danish, Swedish and Norwegian
- **License:** CC-BY-4.0
- **Merge configuration:**

```python
dict(
    models=[
        dict(
            model="danish-foundation-models/munin-7b-alpha",
            parameters=dict(
                density=0.9,
                weight=1.0,
            ),
        ),
        dict(
            model="norallm/normistral-7b-warm",
            parameters=dict(
                density=0.9,
                weight=1.0,
            ),
        ),
        dict(
            model="timpal0l/Mistral-7B-v0.1-flashback-v2",
            parameters=dict(
                density=0.9,
                weight=1.0,
            ),
        ),
    ],
    merge_method="dare_ties",
    random_seed=4242,
    base_model="mistralai/Mistral-7B-v0.1",
    parameters=dict(
        normalize=True,
        int8_mask=True,
    ),
    dtype="float16",
)
```
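
The configuration above can be run with mergekit, either by converting it to YAML for the `mergekit-yaml` CLI or via the Python API. The following is a minimal sketch assuming mergekit's documented Python entry points (`MergeConfiguration`, `MergeOptions` and `run_merge`); the output path and options are placeholders, and the exact API may differ between mergekit versions:

```python
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# The merge configuration shown above, written compactly.
config_dict = dict(
    models=[
        dict(model=name, parameters=dict(density=0.9, weight=1.0))
        for name in [
            "danish-foundation-models/munin-7b-alpha",
            "norallm/normistral-7b-warm",
            "timpal0l/Mistral-7B-v0.1-flashback-v2",
        ]
    ],
    merge_method="dare_ties",
    random_seed=4242,
    base_model="mistralai/Mistral-7B-v0.1",
    parameters=dict(normalize=True, int8_mask=True),
    dtype="float16",
)

merge_config = MergeConfiguration.model_validate(config_dict)
run_merge(
    merge_config,
    out_path="./scandi-merge",  # placeholder output directory
    options=MergeOptions(cuda=False, copy_tokenizer=True),
)
```

The resulting directory is a regular `transformers` checkpoint and loads as usual:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./scandi-merge")
model = AutoModelForCausalLM.from_pretrained(
    "./scandi-merge", torch_dtype=torch.float16
)
```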