---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/yCmy0NUWEu8_g3Fe_FklS.jpeg)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550b16f7490049d6237f200/ceugXFkm46gpEKPphl9R_.png)
## Information
### Description
My main goal with this one was to merge the smartness of the base Instruct Nemo with the better prose from the different roleplaying fine-tunes. This is v0.1 and still needs testing. Merge weights shamelessly stolen from @ParasiticRogue (thank you, friend). All credits and thanks go to Intervitens, Mistralai, NeverSleep and ShuttleAI for providing the amazing models used in the merge.
### Instruct
Both Mistral Instruct and ChatML should work.
```
<s>[INST] {system} [/INST]{assistant}</s>[INST] {user} [/INST]
```
Or...
```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
```
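If you build prompts by hand instead of relying on a frontend, the two templates above can be filled in roughly like this. This is only a sketch: the function names are made up for illustration, and ending the ChatML prompt on an open `<|im_start|>assistant` turn (so the model writes the reply) is an assumption based on how ChatML is commonly used, not something specific to this model.

```python
# Templates copied from the card. The helper names below are hypothetical.
MISTRAL_TEMPLATE = "<s>[INST] {system} [/INST]{assistant}</s>[INST] {user} [/INST]"

# Assumption: the prompt stops at an open assistant turn so the model
# generates "{assistant}<|im_end|>" itself.
CHATML_TEMPLATE = (
    "<|im_start|>system\n{system}<|im_end|>\n"
    "<|im_start|>user\n{user}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def format_mistral(system: str, assistant: str, user: str) -> str:
    """Fill the Mistral Instruct layout: system turn, one earlier reply, new user turn."""
    return MISTRAL_TEMPLATE.format(system=system, assistant=assistant, user=user)


def format_chatml(system: str, user: str) -> str:
    """Fill the ChatML layout and leave the assistant turn open for generation."""
    return CHATML_TEMPLATE.format(system=system, user=user)


if __name__ == "__main__":
    print(format_mistral("You are a narrator.", "The door creaks open.", "I step inside."))
    print(format_chatml("You are a narrator.", "I step inside."))
```

If your tokenizer or frontend already prepends `<s>`, drop it from the Mistral template so it does not end up in the prompt twice.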
### Other Versions
V1: https://huggingface.co/MarinaraSpaghetti/Nemomix-v1.0-12B
V2: https://huggingface.co/MarinaraSpaghetti/Nemomix-v2.0-12B
V3: https://huggingface.co/MarinaraSpaghetti/Nemomix-v3.0-12B
### Settings
A lower Temperature of 0.35 is recommended, although I had luck with Temperatures of 1.0-1.2 if you crank up the Min P (0.01-0.1). Run with base DRY of 0.8/1.75/2/0 (multiplier/base/allowed length/penalty range) and you're good to go.
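To show why a higher Temperature pairs well with a higher Min P, here is a rough pure-Python sketch of the two samplers. The preset names and dictionary keys are made up for illustration, reading the DRY values as multiplier/base/allowed length/penalty range is an assumption about the slash order, and real backends may apply the samplers in a different order.

```python
import math

# Hypothetical presets built from the numbers in this card; key names are illustrative.
LOW_TEMP_PRESET = {"temperature": 0.35}
HIGH_TEMP_PRESET = {"temperature": 1.1, "min_p": 0.05}  # picked from the 1.0-1.2 / 0.01-0.1 ranges
# Assumption: 0.8/1.75/2/0 read as multiplier/base/allowed length/penalty range.
DRY_PRESET = {"multiplier": 0.8, "base": 1.75, "allowed_length": 2, "penalty_range": 0}


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (lower = sharper distribution)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


if __name__ == "__main__":
    logits = [4.0, 3.0, 1.0, -2.0]
    hot = apply_temperature(logits, HIGH_TEMP_PRESET["temperature"])
    print(min_p_filter(hot, HIGH_TEMP_PRESET["min_p"]))
```

A higher Temperature flattens the distribution and lets unlikely tokens through; Min P trims that tail relative to the top token, which is why the two get raised together.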
# Nemomix-v0.1-12B
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with F:\mergekit\mistralaiMistral-Nemo-Base-2407 as the base.
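For intuition, the idea behind the two papers can be sketched on plain Python lists. This is a toy illustration and not mergekit's implementation: DARE randomly drops entries of each fine-tune's delta from the base and rescales the survivors by `1 / density`, and a TIES-style sign election then keeps only the deltas that agree with the dominant direction for each parameter. The function names are hypothetical, and the final step is a plain sum with no normalization.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each delta entry with probability `density`; rescale kept entries by 1/density."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy DARE + TIES-style merge over flat parameter lists (simplified sketch)."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tune changed relative to the base model.
    deltas = [
        dare_sparsify([f - b for f, b in zip(model, base)], density, rng)
        for model, density in zip(finetuned, densities)
    ]
    merged = []
    for i, b in enumerate(base):
        weighted = [w * d[i] for w, d in zip(weights, deltas)]
        # Sign election: the direction with the larger total weighted mass wins.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        agreeing = [x for x in weighted if x * sign > 0]
        merged.append(b + sum(agreeing))
    return merged


if __name__ == "__main__":
    base = [0.0, 1.0, -1.0]
    models = [[0.5, 1.5, -1.0], [-0.1, 2.0, -0.5]]
    print(dare_ties_merge(base, models, weights=[0.6, 0.4], densities=[0.78, 0.66]))
```

With a density of 1.0 nothing is dropped, so the DARE step becomes a no-op and only the sign election matters.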
### Models Merged
The following models were included in the merge:
* F:\mergekit\intervitens_mini-magnum-12b-v1.1
* F:\mergekit\mistralaiMistral-Nemo-Instruct-2407
* F:\mergekit\NeverSleep_Lumimaid-v0.2-12B
* F:\mergekit\shuttleai_shuttle-2.5-mini
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: F:\mergekit\shuttleai_shuttle-2.5-mini
    parameters:
      weight: 0.16
      density: 0.42
  - model: F:\mergekit\NeverSleep_Lumimaid-v0.2-12B
    parameters:
      weight: 0.22
      density: 0.54
  - model: F:\mergekit\intervitens_mini-magnum-12b-v1.1
    parameters:
      weight: 0.28
      density: 0.66
  - model: F:\mergekit\mistralaiMistral-Nemo-Instruct-2407
    parameters:
      weight: 0.34
      density: 0.78
merge_method: dare_ties
base_model: F:\mergekit\mistralaiMistral-Nemo-Base-2407
parameters:
  int8_mask: true
dtype: bfloat16
```
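A quick way to sanity-check the numbers in this config: the four weights add up to 1.0, and each density is the share of a model's delta parameters that survives the DARE drop step. The snippet below hard-codes the values from the YAML above; the helper names are made up for illustration.

```python
import math

# (model, weight, density) copied from the YAML config above.
CONFIG = [
    ("shuttleai_shuttle-2.5-mini", 0.16, 0.42),
    ("NeverSleep_Lumimaid-v0.2-12B", 0.22, 0.54),
    ("intervitens_mini-magnum-12b-v1.1", 0.28, 0.66),
    ("mistralaiMistral-Nemo-Instruct-2407", 0.34, 0.78),
]


def total_weight(config):
    """Sum of the merge weights across all models."""
    return sum(weight for _, weight, _ in config)


def drop_rates(config):
    """Map each model to the fraction of its delta parameters that DARE drops."""
    return {name: 1.0 - density for name, _, density in config}


if __name__ == "__main__":
    # Compare with a tolerance because of floating-point rounding.
    print("weights sum to 1.0:", math.isclose(total_weight(CONFIG), 1.0))
    for name, rate in drop_rates(CONFIG).items():
        print(f"{name}: drops about {rate:.0%} of its delta")
```

The Instruct model gets both the highest weight and the highest density, so more of it survives the merge than of any roleplay fine-tune.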
## Ko-fi
### Enjoying what I do? Consider donating here, thank you!
https://ko-fi.com/spicy_marinara