---
base_model:
- Delta-Vector/Baldur-8B
- kromcomp/L3.1-Spark-r64-LoRA
- NarrativAI/Cakrawala-Llama-3.1-8B
- maximalists/BRAG-Llama-3.1-8b-v0.1
- NeverSleep/Lumimaid-v0.2-8B
- kromcomp/L3.1-Aura-r32-LoRA
- grimjim/BadApple-o1-Llama-3.1-8B
- crestf411/L3.1-8B-Slush-v1.1
- SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA
- ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.3
- kromcomp/L3-T900-r64-LoRA
- invisietch/L3.1-EtherealRainbow-v1.0-rc1-8B
library_name: transformers
tags:
- mergekit
- merge
- roleplay
- RP
- storytelling
license: llama3.1
---
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/667eea5cdebd46a5ec4dcc3d/Ztk0b0LwLnf51kCnSGylf.jpeg)

It's technically 10.6B parameters, but for the sake of a simple naming convention, just truncate the .6. 

I took a break from model merging for a bit, then came back to find the AI community had launched itself another year into the future, and now I need to learn everything again. It's great. 

This project went through several iterations, and though this is the final one, some previous versions had potential but didn't quite work out. I might revisit those and try to make them their own models. Another flavor, perhaps? 
### Quants

[My quants](https://huggingface.co/kromquant/L3.1-Tivir-10B-GGUFs)

Check the Model Tree for additional quants. 
### Details 

General roleplay/storytelling model. The best way I can explain it: the model is weirdly direct until it decides to be creative, at which point it'll spit out some serious prose out of nowhere. Minimal slop, though if you want to kill it entirely you can use DRY and/or XTC. It's surprisingly picky about instructions, so I recommend running the model without an instruct prompt at first, then slowly introducing directions to taste. The fewer, the better, it seems. 

I'd also opt for a higher Min P even at lower temperatures; for some reason, low Min P outputs come out very dry and sharp in their writing. Otherwise, it's a solid model that can run hot and negative if prompted, with good recall and character adhesion, and it can interweave those details throughout the story. 

Recommended Settings Range:
```
Template: Llama 3
Temperature: 1.1-1.3
Min P: 0.08-0.12  
Repeat Penalty: 1.05-1.1
Repeat Penalty Tokens: 256
```
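
If you're running this through `transformers` instead of a GGUF backend, the settings above map roughly onto the sampling arguments below. Treat this as a sketch only: the repo id is a placeholder, `min_p` needs a reasonably recent `transformers` version, and the repeat-penalty token window is a llama.cpp-style setting with no direct equivalent here.

```python
# Rough sketch of the recommended settings via transformers (assumptions noted above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kromcomp/L3.1-Tivir-10B"  # placeholder; point this at the actual repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write the opening scene of a heist gone sideways."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.2,          # middle of the 1.1-1.3 range
    min_p=0.1,                # middle of the 0.08-0.12 range
    repetition_penalty=1.05,  # low end of the 1.05-1.1 range
)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```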

### Merge Theory

Where to fucking begin. 

To start: the majority of this model's creation process was experimentation and fooling around with LoRAs and new merge methods. Learned a lot at the cost of a few brain cells. Worth it. 

As per usual, the idea was to make stable models and creative models, then mush them together into a better model. After trial and error, I made two stable models: one (Soda) that was generally COA competent, and the other (Cider) more adept at recall. Those got merged via SCE to retain context length and intellect. 

The creative model was the next challenge. I knew I wanted to use [SicariusSicariiStuff](https://huggingface.co/SicariusSicariiStuff)'s Unaligned Llama project for its appropriately named unhinged creativity, but it's Llama 3, not 3.1. Trying to pull a LoRA directly from the model didn't work due to different layer names, and the merging tricks I tried to fix that resulted in a LoRA that made any model spam cusses like a 2012 COD lobby. So, the only feasible way to integrate it was ye ol' faithful Model Stock. Usual rules apply: a higher L3.1-to-L3 model ratio keeps the jank at bay. Though, some jank is inevitable. 

If I had to place bets, I'd say 50% of my time making this model went into attempting to master DELLA. The theory is as straightforward as AI merging methods go; it's finding default values that actually work that has made me want to chuck my keyboard against a wall on multiple occasions. What I've gleaned is the following:

You don't need to set values for `epsilon` and `lambda`, but setting them gives you more control over the resulting merge, so it doesn't hurt to test. All of this is my opinion and flawed testing; ymmv. 

`epsilon` dictates the range of parameters that will be 'nulled', so to speak, which is useful for avoiding interference and slop. This is a double-edged sword, though: the bigger that range is, the more of the model's parameters get 'nulled' when merging into the base. Keep in mind that `epsilon` is *half* of that range, since the drop probabilities are assigned between `density - epsilon` and `density + epsilon`. In my experimenting, anything above a total of 0.05 per model runs the risk of creating a stylistically duller model, and higher than a 0.1 total becomes a noticeably dumber model. I've made `epsilon: 0.0175` my personal default starting value. 

`lambda` is less complicated, as it's just the multiplication factor applied to the final parameters after the drop probabilities are assigned from the range above. Setting `lambda: 1` (I think this is the default too) keeps things simple, and it's usually the best value to stick with. But there is a tiny amount of wiggle room. If `lambda` > 1, you get a more expressive merge that lacks creativity, with exponentially diminishing returns. If `lambda` < 1, the merge gets repetitive yet somehow retains more sanity. There's a time and place for either option. For me: `lambda: 1` for the base model, and `lambda: 1-1.1` or `lambda: 0.9-1` for the additional models, depending on the intended purpose. 
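
To make that concrete, here's a toy sketch of how I read the `density`/`epsilon`/`lambda` interplay. This is not mergekit's actual DELLA code, just the mechanics described above applied to a single flat tensor: probabilities spread across `density ± epsilon` by magnitude rank, survivors rescaled, then `lambda` applied as a final multiplier.

```python
# Toy illustration of DELLA's density/epsilon/lambda knobs as I understand them.
# NOT mergekit's implementation -- just the mechanics described above.
import torch

def toy_della_step(delta: torch.Tensor, density: float, epsilon: float, lam: float) -> torch.Tensor:
    """delta: task vector (finetuned - base) for one tensor."""
    flat = delta.flatten()
    n = flat.numel()

    # Rank parameters by magnitude: larger deltas get a higher keep probability.
    ranks = flat.abs().argsort().argsort().float()  # 0 = smallest magnitude
    keep_prob = density - epsilon + (2 * epsilon) * ranks / max(n - 1, 1)
    keep_prob = keep_prob.clamp(0.0, 1.0)  # total spread is 2*epsilon, centered on density

    # Stochastically drop, then rescale survivors to preserve the expected value.
    mask = torch.bernoulli(keep_prob)
    rescaled = torch.where(mask.bool(), flat / keep_prob, torch.zeros_like(flat))

    # lambda is just a final multiplier on whatever survives.
    return (lam * rescaled).reshape(delta.shape)

# Example: epsilon widens the probability spread; lambda scales the surviving deltas.
delta = torch.randn(4096)
merged_delta = toy_della_step(delta, density=0.7, epsilon=0.0175, lam=1.0)
```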

As for why I expanded each model the way I did, there are two main reasons: 

1) I wasn't going to finetune on top of the resulting merge, so the usual DUS stack would cause more problems than intended. The strengths of a DUS stack, where you tack on an additional number of layers in the middle of the model, only come out after some 'healing' to 'repair' the empty added layers via finetuning. I attempted a makeshift version of this strategy using pulled LoRAs in mergekit and it didn't work nearly as well. Having a handful of voided layers packed together makes the resulting merge less chatty and sometimes less coherent.
2) It gave me more control over where I wanted the extra 'brainpower'. While the duplicated layers are 'empty' due to being zeroed out, that's only for two modules (`o_proj` and `down_proj`). The others still hold their values, so they still affect the final merge, though to a lesser extent. By being able to split and place where these layers go, I can keep similar layers closer to each other and limit problems down the line. (A rough parameter-count sketch for this layer expansion follows below.)
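
For the curious, here's a quick back-of-the-envelope sketch of where the 10.6B figure comes from, assuming standard Llama 3.1 8B dimensions. The three duplicated 4-layer blocks in the configs below add 12 decoder layers on top of the original 32.

```python
# Rough parameter count for the 32 -> 44 layer expansion (assumed Llama 3.1 8B dims).
hidden, mlp, vocab = 4096, 14336, 128256
heads, kv_heads, head_dim = 32, 8, 128

attn = hidden * heads * head_dim            # q_proj
attn += 2 * hidden * kv_heads * head_dim    # k_proj + v_proj (GQA)
attn += heads * head_dim * hidden           # o_proj
ffn = 3 * hidden * mlp                      # gate_proj + up_proj + down_proj
per_layer = attn + ffn                      # ignoring the small norm weights

embeddings = 2 * vocab * hidden             # embed_tokens + lm_head (untied in the 8B)
print(f"original: {(embeddings + 32 * per_layer) / 1e9:.2f}B")  # ~8.03B
print(f"expanded: {(embeddings + 44 * per_layer) / 1e9:.2f}B")  # ~10.65B -> '10.6B'
```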

### Config

```yaml
models:
    - model: Delta-Vector/Baldur-8B+kromcomp/L3.1-Spark-r64-LoRA
    - model: NarrativAI/Cakrawala-Llama-3.1-8B
    - model: maximalists/BRAG-Llama-3.1-8b-v0.1
base_model: Delta-Vector/Baldur-8B+kromcomp/L3.1-Spark-r64-LoRA
parameters:
  normalize: false
merge_method: model_stock
chat_template: llama3
tokenizer:
  source: union
dtype: float32
name: soda
---
slices:
- sources:
  - layer_range: [0, 12]
    model: soda
- sources:
  - layer_range: [8, 12]
    model: soda
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [12, 20]
    model: soda
- sources:
  - layer_range: [16, 20]
    model: soda
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [20, 28]
    model: soda
- sources:
  - layer_range: [24, 28]
    model: soda
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [28, 32]
    model: soda
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
name: pop
---
models:
    - model: NeverSleep/Lumimaid-v0.2-8B+kromcomp/L3.1-Aura-r32-LoRA
    - model: grimjim/BadApple-o1-Llama-3.1-8B
    - model: crestf411/L3.1-8B-Slush-v1.1
base_model: crestf411/L3.1-8B-Slush-v1.1
parameters:
  normalize: false
merge_method: model_stock
chat_template: llama3
tokenizer:
  source: union
dtype: float32
name: cider
---
slices:
- sources:
  - layer_range: [0, 12]
    model: cider
- sources:
  - layer_range: [8, 12]
    model: cider
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [12, 20]
    model: cider
- sources:
  - layer_range: [16, 20]
    model: cider
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [20, 28]
    model: cider
- sources:
  - layer_range: [24, 28]
    model: cider
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [28, 32]
    model: cider
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
name: float
---
models:
  - model: float
    parameters:
      select_topk: 0.6
  - model: pop
    parameters:
      select_topk: 0.6
base_model: float
merge_method: sce
chat_template: llama3
tokenizer:
  source: union
parameters:
  int8_mask: true
dtype: float32
name: syrup
---
models:
    - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA
    - model: ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.3+kromcomp/L3-T900-r64-LoRA
    - model: invisietch/L3.1-EtherealRainbow-v1.0-rc1-8B
base_model: invisietch/L3.1-EtherealRainbow-v1.0-rc1-8B
parameters:
  normalize: false
merge_method: model_stock
chat_template: llama3
tokenizer:
  source: union
dtype: float32
name: semialign
---
slices:
- sources:
  - layer_range: [0, 12]
    model: semialign
- sources:
  - layer_range: [8, 12]
    model: semialign
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [12, 20]
    model: semialign
- sources:
  - layer_range: [16, 20]
    model: semialign
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [20, 28]
    model: semialign
- sources:
  - layer_range: [24, 28]
    model: semialign
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - layer_range: [28, 32]
    model: semialign
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
name: midal
---
models:
  - model: midal
    parameters:
      weight: [0.2, 0.8]
      density: 0.7
      epsilon: 0.0125
      lambda: 1.05
  - model: syrup
    parameters:
      weight: [0.8, 0.2]
      density: 0.7
      epsilon: 0.0175
      lambda: 1
base_model: syrup
merge_method: della
chat_template: llama3
tokenizer:
  source: midal
parameters:
  normalize: false
  int8_mask: true
dtype: float32
name: ir
```