kromeurus committed
Commit fa1b96d · verified · 1 parent: 9178f29

Update README.md

Files changed (1):
  1. README.md +282 -53
README.md CHANGED
@@ -1,53 +1,282 @@
- ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # output
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * merge\17-42aav2.sl
- * parts/anterosv2.c1
- * merge\2-17aav2.sl
- * merge\42-52aav2.sl
- * parts/anterosv2.b
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- dtype: float32
- merge_method: passthrough
- out_dtype: bfloat16
- parameters:
-   int8_mask: 1.0
- slices:
- - sources:
-   - layer_range: [0, 2]
-     model: parts/anterosv2.b
- - sources:
-   - layer_range: [0, 15]
-     model: merge\2-17aav2.sl
- - sources:
-   - layer_range: [0, 25]
-     model: merge\17-42aav2.sl
- - sources:
-   - layer_range: [0, 10]
-     model: merge\42-52aav2.sl
- - sources:
-   - layer_range: [52, 56]
-     model: parts/anterosv2.c1
- ```
+ ---
+ base_model:
+ - Sao10K/L3-8B-Niitama-v1
+ - OEvortex/Emotional-llama-8B
+ - ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
+ - nothingiisreal/L3-8B-Celeste-V1.2
+ - ResplendentAI/Nymph_8B
+ - TheDrummer/Llama-3SOME-8B-v2
+ - nothingiisreal/L3-8B-Instruct-Abliterated-DWP
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ - not-for-all-audiences
+ ---
+ God, this sucked to make. I'm burning out on this fucker, so I'm releasing what I have as a partial update; I'll be focusing on L3.1 for the time being and taking a break from
+ this project for a bit. What's done is... fine. I'm not super happy with it, but it's objectively better than v0.1.
+
+ ### Quants
+
+ None ATM.
+
+ ### Details & Recommended Settings
+
+ (Still testing; subject to change)
+
+ Very experimental; expect bugs. Thrives in story-heavy, narrative RP yet still excels at the basics as usual. Way more horny this time, no idea why. Slightly worse instruct following compared to
+ v0.1. Really needs example messages to function, otherwise it'll easily spit out garbled text. I also tried to curb L3's tendency to double-line (leaving a blank line between paragraphs), with mild
+ success. Calmer writing style.
+
+ Has a certain tendency to speak for the {user}, but that's easily negated with a few instructions.
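+
+ For instance, one system-prompt line along these lines usually curbs it (the wording is only a suggestion, not a tested prompt):
+ ```
+ Never speak, act, or make decisions for {user}; write only as {char} and the narration.
+ ```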
+
+ Rec. Settings:
+ ```
+ Template: Plain Text or L3
+ Temperature: 1.3
+ Min P: 0.1
+ Repeat Penalty: 1.05
+ Repeat Penalty Tokens: 256
+ ```
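+
+ If you run the model through `transformers`, the settings above map roughly onto `generate()` as in the minimal sketch below. Assumptions: the model path is a placeholder (no repo published yet), `min_p` needs a recent transformers version, and the 256-token penalty window has no direct transformers equivalent, so the library's full-context repetition penalty stands in for it.
+ ```python
+ # Sketch of the recommended samplers via transformers; not the only way to run this.
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "path/to/anteros-merge"  # placeholder: point at your local copy of the merge
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs accelerate
+ )
+
+ prompt = "You narrate an interactive story.\n\n..."  # plain-text template, per above
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ out = model.generate(
+     **inputs,
+     do_sample=True,
+     temperature=1.3,          # Temperature: 1.3
+     min_p=0.1,                # Min P: 0.1 (transformers >= 4.42)
+     repetition_penalty=1.05,  # Repeat Penalty: 1.05, applied over the whole context
+     max_new_tokens=512,
+ )
+ # Decode only the newly generated tokens.
+ print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```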
+
+ ### Models Merged & Merge Theory
+
+ The following models were included in the merge:
+ * [ResplendentAI/Nymph_8B](https://huggingface.co/ResplendentAI/Nymph_8B)
+ * [TheDrummer/Llama-3SOME-8B-v2](https://huggingface.co/TheDrummer/Llama-3SOME-8B-v2)
+ * [nothingiisreal/L3-8B-Instruct-Abliterated-DWP](https://huggingface.co/nothingiisreal/L3-8B-Instruct-Abliterated-DWP)
+ * [Sao10K/L3-8B-Niitama-v1](https://huggingface.co/Sao10K/L3-8B-Niitama-v1)
+ * [OEvortex/Emotional-llama-8B](https://huggingface.co/OEvortex/Emotional-llama-8B)
+ * [ArliAI/ArliAI-Llama-3-8B-Formax-v1.0](https://huggingface.co/ArliAI/ArliAI-Llama-3-8B-Formax-v1.0)
+ * [nothingiisreal/L3-8B-Celeste-V1.2](https://huggingface.co/nothingiisreal/L3-8B-Celeste-V1.2)
+
+ ### Config
+
+ ```yaml
+ slices:
+ - sources:
+   - layer_range: [0, 6] #6
+     model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
+ - sources:
+   - layer_range: [4, 8] #10
+     model: nothingiisreal/L3-8B-Celeste-V1.2
+     parameters:
+       scale:
+       - filter: q_proj
+         value: [1, 0.89]
+ - sources:
+   - layer_range: [8, 12] #14
+     model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
+     parameters:
+       scale:
+       - filter: k_proj
+         value: [0.89, 1]
+ - sources:
+   - layer_range: [6, 12] #20
+     model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
+ - sources:
+   - layer_range: [10, 14] #24
+     model: nothingiisreal/L3-8B-Celeste-V1.2
+ - sources:
+   - layer_range: [12, 16] #28
+     model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
+ - sources:
+   - layer_range: [14, 28] #42
+     model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
+ - sources:
+   - layer_range: [21, 31] #52
+     model: nothingiisreal/L3-8B-Celeste-V1.2
+ - sources:
+   - layer_range: [28, 32] #56
+     model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: anterosv2.b
+ ---
+ slices:
+ - sources:
+   - layer_range: [0, 4] #4
+     model: ResplendentAI/Nymph_8B
+ - sources:
+   - layer_range: [2, 10] #12
+     model: Sao10K/L3-8B-Niitama-v1
+ - sources:
+   - layer_range: [8, 12] #16
+     model: TheDrummer/Llama-3SOME-8B-v2
+ - sources:
+   - layer_range: [10, 16] #22
+     model: Sao10K/L3-8B-Niitama-v1
+ - sources:
+   - layer_range: [12, 20] #30
+     model: ResplendentAI/Nymph_8B
+ - sources:
+   - layer_range: [14, 18] #34
+     model: OEvortex/Emotional-llama-8B
+ - sources:
+   - layer_range: [16, 20] #38
+     model: ResplendentAI/Nymph_8B
+ - sources:
+   - layer_range: [18, 22] #42
+     model: OEvortex/Emotional-llama-8B
+ - sources:
+   - layer_range: [20, 24] #46
+     model: TheDrummer/Llama-3SOME-8B-v2
+ - sources:
+   - layer_range: [23, 31] #54
+     model: ResplendentAI/Nymph_8B
+ - sources:
+   - layer_range: [30, 32] #56
+     model: Sao10K/L3-8B-Niitama-v1
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: anterosv2.c
+ ---
+ slices:
+ - sources:
+   - layer_range: [2, 17]
+     model: anterosv2.b
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: 2-17av2b.sl
+ ---
+ slices:
+ - sources:
+   - layer_range: [2, 17]
+     model: anterosv2.c
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: 2-17av2c.sl
+ ---
+ models:
+ - model: 2-17av2c.sl
+   parameters:
+     weight: [0.1, 0.5, 0.3]
+ - model: 2-17av2b.sl
+   parameters:
+     weight: [0.9, 0.5, 0.7]
+ merge_method: dare_linear
+ base_model: 2-17av2b.sl
+ parameters:
+   normalize: false
+   int8_mask: true
+ dtype: float32
+ out_dtype: bfloat16
+ name: 2-17aav2.sl
+ ---
+ slices:
+ - sources:
+   - layer_range: [17, 42]
+     model: anterosv2.b
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: 17-42av2b.sl
+ ---
+ slices:
+ - sources:
+   - layer_range: [17, 42]
+     model: anterosv2.c
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: 17-42av2c.sl
+ ---
+ models:
+ - model: 17-42av2b.sl
+   parameters:
+     weight: [0.8, 0.5, 0.4, 0.3, 0.15]
+     density: 0.65
+     epsilon: 0.07
+     lambda: 0.12
+ - model: 17-42av2c.sl
+   parameters:
+     weight: [0.2, 0.5, 0.6, 0.7, 0.85]
+     density: 0.7
+     epsilon: 0.05
+     lambda: 0.1
+ merge_method: della
+ base_model: 17-42av2c.sl
+ parameters:
+   normalize: false
+   int8_mask: true
+ dtype: float32
+ out_dtype: bfloat16
+ name: 17-42aav2.sl
+ ---
+ slices:
+ - sources:
+   - layer_range: [42, 52]
+     model: anterosv2.b
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: 42-52av2b.sl
+ ---
+ slices:
+ - sources:
+   - layer_range: [42, 52]
+     model: anterosv2.c
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: 42-52av2c.sl
+ ---
+ models:
+ - model: 42-52av2c.sl
+   parameters:
+     weight: [0.9, 0.65, 0.9]
+ - model: 42-52av2b.sl
+   parameters:
+     weight: [0.1, 0.35, 0.1]
+ merge_method: dare_linear
+ base_model: 42-52av2c.sl
+ parameters:
+   normalize: false
+   int8_mask: true
+ dtype: float32
+ out_dtype: bfloat16
+ name: 42-52aav2.sl
+ ---
+ slices:
+ - sources:
+   - layer_range: [0, 2]
+     model: anterosv2.b
+ - sources:
+   - layer_range: [0, 15]
+     model: 2-17aav2.sl
+ - sources:
+   - layer_range: [0, 25]
+     model: 17-42aav2.sl
+ - sources:
+   - layer_range: [0, 10]
+     model: 42-52aav2.sl
+ - sources:
+   - layer_range: [52, 56]
+     model: anterosv2.c
+ parameters:
+   int8_mask: true
+ merge_method: passthrough
+ dtype: float32
+ out_dtype: bfloat16
+ name: anterosv0.1.5
+ ```
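+
+ The `---`-separated documents above are one staged merge: each stage saves a named intermediate (`name:`) that later stages pull back in as a `model:`. In recent mergekit this composite format can be run in one go with the mega-merge script, e.g. `mergekit-mega config.yaml ./out-dir` after `pip install mergekit` (the exact entry point and flags may vary by version). Alternatively, split each document into its own file, run them in order with `mergekit-yaml`, and point later stages at the saved intermediate paths.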