# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).
## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333 as the base model.
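For intuition: DARE TIES forms each source model's task vector (its delta from the base), randomly drops a fraction `1 - density` of each delta's entries and rescales the survivors by `1/density` (DARE), then resolves sign conflicts between models by electing a per-parameter sign and summing only the agreeing contributions (TIES). Below is a minimal NumPy sketch of that procedure; the function and the two-model setup are illustrative stand-ins, not mergekit's actual implementation.

```python
import numpy as np

def dare_ties(base, finetuned, densities, weights, seed=0):
    """Illustrative DARE-TIES merge over flat parameter vectors."""
    rng = np.random.default_rng(seed)
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base                             # task vector
        keep = rng.random(delta.shape) < density      # DARE: drop with prob 1 - density
        delta = np.where(keep, delta / density, 0.0)  # rescale survivors (keeps expectation)
        deltas.append(weight * delta)
    stacked = np.stack(deltas)
    elected = np.sign(stacked.sum(axis=0))            # TIES: per-parameter sign election
    agree = np.sign(stacked) == elected
    return base + np.where(agree, stacked, 0.0).sum(axis=0)

base = np.zeros(8)
ft_a = base + np.linspace(-1.0, 1.0, 8)  # stand-ins for the two fine-tuned models
ft_b = base + 0.5
merged = dare_ties(base, [ft_a, ft_b], densities=[0.99, 1.0], weights=[0.94, 0.03])
```

In the configuration below, each slice's `density` and `weight` feed exactly these two knobs, set independently per 4-layer block.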
### Models Merged
The following models were included in the merge:
- ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
- ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
## Configuration

The following YAML configuration was used to produce this model:
```yaml
base_model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.9938394142861436
      weight: 0.9423657003811549
  - layer_range: [0, 4]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 1.0
      weight: 0.028757039653447516
  - layer_range: [0, 4]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [4, 8]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.05673164623727889
      weight: 0.7797412499727034
  - layer_range: [4, 8]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.6914043863726181
      weight: -0.15297779543381299
  - layer_range: [4, 8]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [8, 12]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.9809469677871971
  - layer_range: [8, 12]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.7543507024870173
      weight: 0.3071004438847045
  - layer_range: [8, 12]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [12, 16]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 1.4015101870290536
  - layer_range: [12, 16]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.46763882063048323
      weight: -0.1196117853194659
  - layer_range: [12, 16]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [16, 20]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.21406332897580937
      weight: 1.0480493790359993
  - layer_range: [16, 20]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.34641930682013805
      weight: -0.47947217927833796
  - layer_range: [16, 20]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [20, 24]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.2994136103516895
  - layer_range: [20, 24]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.5203144588951365
      weight: 0.499493856558078
  - layer_range: [20, 24]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [24, 28]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.4787814900698036
      weight: 1.2570648350519475
  - layer_range: [24, 28]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.9634193213633135
      weight: 0.1895211433923194
  - layer_range: [24, 28]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [28, 32]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.4451581671531947
  - layer_range: [28, 32]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 1.0
      weight: -0.4842425881895236
  - layer_range: [28, 32]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [32, 36]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.3404347162926922
  - layer_range: [32, 36]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 1.0
      weight: 0.6029140105117367
  - layer_range: [32, 36]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [36, 40]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.8991607837541874
  - layer_range: [36, 40]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.37358877892316833
      weight: 0.9451608965972428
  - layer_range: [36, 40]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
```
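The per-slice `density` and `weight` values (including the negative weights, which subtract that slice's task vector rather than adding it) look machine-generated; the `evol_merge_storage` paths suggest they came from an automated, likely evolutionary, parameter search. To reproduce the merge, the configuration above can be passed to the CLI (`mergekit-yaml config.yaml ./final_merge`) or driven from Python. A minimal sketch using mergekit's Python API, assuming the YAML is saved as `config.yaml` and the `./workspace/...` input paths exist locally:

```python
# Reproduction sketch via mergekit's Python API; assumes the YAML above is
# saved as config.yaml and the listed ./workspace/... model paths exist.
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    out_path="./final_merge",            # hypothetical output directory
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # copy the base model's tokenizer
        lazy_unpickle=False,
    ),
)
```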
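Once merged, the output directory loads like any transformers causal LM. A minimal usage sketch, where `./final_merge` is the hypothetical output directory from the reproduction step above (this card does not name a hub repository):

```python
# Usage sketch; "./final_merge" is a hypothetical local path to the merged model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./final_merge"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs accelerate
)

prompt = "自然言語処理とは何ですか?"  # "What is natural language processing?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```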