|
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
|
|
|
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/yCmy0NUWEu8_g3Fe_FklS.jpeg) |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/ceugXFkm46gpEKPphl9R_.png) |
|
|
|
## Information |
|
### Description |
|
|
|
My main goal with this one was to merge the smartness of the base Instruct Nemo with the better prose from the different roleplaying fine-tunes. This is version v0.1, still to be tested. Weights shamelessly stolen from @ParasiticRogue (thank you, friend). All credits and thanks go to Intervitens, Mistral AI, NeverSleep, and ShuttleAI for providing the amazing models used in the merge.
|
|
|
### Instruct |
|
|
|
Both Mistral Instruct and ChatML should work. |
|
|
|
```
<s>[INST] {system} [/INST]{assistant}</s>[INST] {user} [/INST]
```
|
|
|
Or... |
|
|
|
```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
```
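
If you drive the model programmatically, here is a minimal sketch that lets the tokenizer's bundled chat template build the Mistral Instruct prompt shown above (the repo ID and the assumption that the merged tokenizer keeps Mistral Nemo's built-in template are mine; for ChatML you would assemble the prompt by hand as above):

```python
# Minimal sketch: build a prompt with the tokenizer's bundled chat template.
# The repo ID below is an assumption; adjust it to wherever you load the model from.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MarinaraSpaghetti/Nemomix-v0.1-12B")

messages = [
    {"role": "user", "content": "Describe the tavern we just walked into."},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # should roughly match the [INST] ... [/INST] format above
```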
|
|
|
### Other Versions |
|
|
|
V1: https://huggingface.co/MarinaraSpaghetti/Nemomix-v1.0-12B |
|
|
|
V2: https://huggingface.co/MarinaraSpaghetti/Nemomix-v2.0-12B |
|
|
|
V3: https://huggingface.co/MarinaraSpaghetti/Nemomix-v3.0-12B |
|
|
|
### Settings |
|
|
|
A lower Temperature of 0.35 is recommended, although I've also had luck with higher Temperatures (1.0-1.2) if you crank up the Min P (0.01-0.1). Run with the base DRY settings of 0.8/1.75/2/0 and you're good to go.
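
For reference, here is a rough sketch of those presets as Hugging Face `GenerationConfig` objects. Min P needs a recent transformers release, and DRY has no transformers equivalent, so the 0.8/1.75/2/0 values only apply to backends (SillyTavern, koboldcpp, text-generation-webui, etc.) that implement it:

```python
# Sketch of the recommended sampler presets as transformers generation configs.
# min_p requires a recent transformers version; the DRY values (0.8/1.75/2/0)
# are backend-specific and are not represented here.
from transformers import GenerationConfig

low_temp = GenerationConfig(
    do_sample=True, temperature=0.35, min_p=0.05, max_new_tokens=256
)
high_temp = GenerationConfig(
    do_sample=True, temperature=1.1, min_p=0.05, max_new_tokens=256
)
# Usage: model.generate(**inputs, generation_config=low_temp)
```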
|
|
|
# Nemomix-v0.1-12B |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with F:\mergekit\mistralaiMistral-Nemo-Base-2407 as the base.
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* F:\mergekit\intervitens_mini-magnum-12b-v1.1 |
|
* F:\mergekit\mistralaiMistral-Nemo-Instruct-2407 |
|
* F:\mergekit\NeverSleep_Lumimaid-v0.2-12B |
|
* F:\mergekit\shuttleai_shuttle-2.5-mini |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
models:
  - model: F:\mergekit\shuttleai_shuttle-2.5-mini
    parameters:
      weight: 0.16
      density: 0.42
  - model: F:\mergekit\NeverSleep_Lumimaid-v0.2-12B
    parameters:
      weight: 0.22
      density: 0.54
  - model: F:\mergekit\intervitens_mini-magnum-12b-v1.1
    parameters:
      weight: 0.28
      density: 0.66
  - model: F:\mergekit\mistralaiMistral-Nemo-Instruct-2407
    parameters:
      weight: 0.34
      density: 0.78
merge_method: dare_ties
base_model: F:\mergekit\mistralaiMistral-Nemo-Base-2407
parameters:
  int8_mask: true
dtype: bfloat16
```
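
If you want to reproduce the merge, the same YAML (with the local F:\mergekit paths swapped for your own local copies or the corresponding Hugging Face repo IDs) can be fed to mergekit's CLI; the file and output names below are just examples:

```
mergekit-yaml nemomix-v0.1.yaml ./Nemomix-v0.1-12B --cuda
```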
|
|
|
## Ko-fi |
|
### Enjoying what I do? Consider donating here, thank you! |
|
https://ko-fi.com/spicy_marinara |