MarinaraSpaghetti committed on
Commit 2cd8142
1 Parent(s): 7a46ce4

Upload README.md

Files changed (1)
  1. README.md +82 -3
README.md CHANGED

---
base_model: []
library_name: transformers
tags:
- mergekit
- merge

---
# Description

My main goal with this one was to merge the smartness of the base Instruct Nemo with the better prose from the different roleplaying fine-tunes. This is version v0.1, still to be tested. Weights shamelessly stolen from @ParasiticRogue (thank you, friend).

# Instruct

Both Mistral Instruct and ChatML should work.

```
<s>[INST] {system} [/INST]{response}</s>[INST] {prompt} [/INST]
```

```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
{response}<|im_end|>
```
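
If you'd rather assemble these strings by hand, below is a minimal Python sketch of both formats, purely for illustration: the helper names are invented for this example (they come from no library), and the ChatML version leaves the assistant turn open so the model can complete it.

```python
def mistral_instruct_prompt(system: str, response: str, prompt: str) -> str:
    # The system prompt rides inside the first [INST] block; the previous
    # reply is closed with </s> before the next instruction begins.
    return f"<s>[INST] {system} [/INST]{response}</s>[INST] {prompt} [/INST]"


def chatml_prompt(system: str, prompt: str) -> str:
    # Each ChatML turn is wrapped in <|im_start|>{role} ... <|im_end|> markers;
    # the final assistant header is left open for the model to continue.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(chatml_prompt("You are a helpful roleplaying partner.", "Hello there!"))
```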

# GGUF

https://huggingface.co/MarinaraSpaghetti/Nemomix-v0.1-12B-GGUF
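
If you prefer to grab a quantized file programmatically, a sketch with huggingface_hub could look like this; the filename is a placeholder, so check the repository's file list for the exact quant you want.

```python
# Illustrative download sketch; replace the filename with a real one from the GGUF repo.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="MarinaraSpaghetti/Nemomix-v0.1-12B-GGUF",
    filename="Nemomix-v0.1-12B.Q6_K.gguf",  # placeholder quant name
)
print(gguf_path)
```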

# Settings

Lower Temperature is recommended, although I had luck with Temperatures above 1.0 (1.0-1.2) if you crank up the Min P (0.01-0.1). Run with the base DRY of 0.8/1.75/2/0 and you're good to go.
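
Purely as an illustration, here is one way those samplers might be set with llama-cpp-python on the GGUF build; the path is a placeholder and the numbers are just one pick from the ranges above. DRY is not exposed in this high-level API, so the 0.8/1.75/2/0 values (assuming the usual multiplier/base/allowed length/penalty range order) would be set in your frontend instead.

```python
# Illustrative only: path, context size, and sampler values are assumptions, not shipped defaults.
from llama_cpp import Llama

llm = Llama(
    model_path="./Nemomix-v0.1-12B.Q6_K.gguf",  # placeholder; use whichever quant you downloaded
    n_ctx=8192,
)

output = llm(
    "<|im_start|>system\nYou are a helpful roleplaying partner.<|im_end|>\n"
    "<|im_start|>user\nHello there!<|im_end|>\n"
    "<|im_start|>assistant\n",
    max_tokens=256,
    temperature=1.0,  # 1.0-1.2 reportedly works when Min P is raised
    min_p=0.05,       # somewhere in the 0.01-0.1 range
    # DRY (0.8/1.75/2/0) is typically configured in the frontend, e.g. SillyTavern.
)
print(output["choices"][0]["text"])
```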

# Nemomix-v0.1-12B

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with F:\mergekit\mistralaiMistral-Nemo-Base-2407 as the base.

### Models Merged

The following models were included in the merge:
* F:\mergekit\intervitens_mini-magnum-12b-v1.1
* F:\mergekit\mistralaiMistral-Nemo-Instruct-2407
* F:\mergekit\NeverSleep_Lumimaid-v0.2-12B
* F:\mergekit\shuttleai_shuttle-2.5-mini

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: F:\mergekit\shuttleai_shuttle-2.5-mini
    parameters:
      weight: 0.16
      density: 0.42
  - model: F:\mergekit\NeverSleep_Lumimaid-v0.2-12B
    parameters:
      weight: 0.22
      density: 0.54
  - model: F:\mergekit\intervitens_mini-magnum-12b-v1.1
    parameters:
      weight: 0.28
      density: 0.66
  - model: F:\mergekit\mistralaiMistral-Nemo-Instruct-2407
    parameters:
      weight: 0.34
      density: 0.78
merge_method: dare_ties
base_model: F:\mergekit\mistralaiMistral-Nemo-Base-2407
parameters:
  int8_mask: true
dtype: bfloat16
```
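
To reproduce a similar merge, a configuration like the one above is normally passed to mergekit's `mergekit-yaml` entry point. The sketch below simply shells out to it with placeholder paths; check the optional flags against your installed mergekit version.

```python
# Illustrative only: assumes mergekit is installed and the YAML above is saved to disk.
import subprocess

subprocess.run(
    [
        "mergekit-yaml",
        "nemomix-v0.1.yaml",   # the configuration shown above, saved locally
        "./Nemomix-v0.1-12B",  # output directory for the merged model
        "--cuda",              # optional: run the merge on GPU if available
    ],
    check=True,
)
```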