What is the 39B upscale?
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 41]
    model: unsloth/Mistral-Small-Instruct-2409
- sources:
  - layer_range: [19, 41]
    model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [19, 41]
    model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [41, 55]
    model: unsloth/Mistral-Small-Instruct-2409
- Layers 0 to 18 are original
- Layers 19 to 41 are duplicated, zeroed out, and inserted in the middle twice
- Layers 42 to 54 are original
- The down_proj and o_proj weights of the duplicated layers have been zeroed, so the model ignores the added layers at first; healing is required to 'unignore' them
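To illustrate why zeroing these two projections makes a duplicated layer a no-op, here is a minimal sketch in plain Python. It is a toy stand-in, not the actual Mistral layer: a single residual sublayer of the form `x + W_out @ relu(W_in @ x)`, where `W_out` plays the role of `o_proj` (attention output) or `down_proj` (MLP output). The function names, shapes and numbers are all made up for illustration.

```python
# Toy residual sublayer: output = x + w_out @ relu(w_in @ x).
# w_out stands in for o_proj / down_proj; everything here is illustrative.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def residual_block(x, w_in, w_out):
    """Add the sublayer's update to the residual stream x."""
    hidden = [max(0.0, h) for h in matvec(w_in, x)]
    update = matvec(w_out, hidden)
    return [xi + ui for xi, ui in zip(x, update)]

x = [1.0, -2.0, 0.5]
w_in = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]]      # 2x3 input projection
w_out = [[0.5, -0.3], [0.1, 0.9], [-0.6, 0.2]]   # 3x2 output projection

# Equivalent of `scale: value: 0.0` applied to the output projection only.
w_out_zeroed = [[0.0 * w for w in row] for row in w_out]

changed = residual_block(x, w_in, w_out)           # normal layer alters x
unchanged = residual_block(x, w_in, w_out_zeroed)  # zeroed layer passes x through

print(changed != x, unchanged == x)
```

Because the only path from the sublayer back into the residual stream runs through the output projection, a zeroed projection leaves the stream untouched. The other weights (scaled by `1.0` in the config) are kept intact, presumably so that healing starts from a trained initialization rather than from scratch.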
[      Unique      ][      Duplicated      ][      Unique      ]
 0 ----------- 18    19 ------------ 41      42 ---------- 54
       34.5%                 41.8%                 23.7%
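The shares in the diagram can be cross-checked with a few lines of Python. This sketch treats the spans as inclusive, as drawn, and separately tallies the stacked layer count implied by the config. The second part rests on an assumption: that each `layer_range` is half-open (`[start, end)`), which is how mergekit ranges are usually read. The variable names are illustrative only.

```python
# Inclusive layer spans exactly as drawn in the diagram.
diagram = {
    "unique_head": (0, 18),
    "duplicated": (19, 41),
    "unique_tail": (42, 54),
}
counts = {name: hi - lo + 1 for name, (lo, hi) in diagram.items()}
total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# layer_range values copied from the config.
# ASSUMPTION: ranges are half-open, i.e. [start, end).
config_slices = [(0, 41), (19, 41), (19, 41), (41, 55)]
stacked_layers = sum(end - start for start, end in config_slices)

print(counts)          # layers per diagram segment
print(shares)          # percentage of the 55 source layers per segment
print(stacked_layers)  # layers in the upscaled model under the assumption
```

Under these assumptions the segments hold 19, 23 and 13 layers. The last share computes to 23.6%, so the 23.7% in the diagram is presumably rounded up to make the three shares sum to 100%. With half-open ranges the config stacks 99 layers in total, and each duplicated slice covers layers 19 to 40 rather than 19 to 41.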
Control Sample A (Nemo & Rocinante, similar training)
Also note the layer sequence and other labels here, since they will be unreadable for the 39B