# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method

This model was merged using the DARE TIES merge method, with ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333 as the base model.
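
For intuition, the sketch below shows a simplified, single-tensor version of what DARE TIES does: each source model's task vector (its difference from the base) is randomly sparsified and rescaled (DARE), a majority sign is elected per parameter (TIES), and only contributions agreeing with that sign are summed back onto the base weights. The `weight` and `density` values correspond to the per-layer parameters in the configuration below; negative weights subtract a model's task vector. This is an illustrative approximation, not mergekit's actual implementation.

```python
import torch

def dare_ties_merge(base: torch.Tensor,
                    finetuned: list[torch.Tensor],
                    weights: list[float],
                    densities: list[float]) -> torch.Tensor:
    """Illustrative single-tensor DARE TIES merge (simplified, not mergekit's exact code)."""
    deltas = []
    for ft, w, d in zip(finetuned, weights, densities):
        delta = ft - base                                  # task vector of one fine-tuned model
        keep = torch.bernoulli(torch.full_like(delta, d))  # DARE: randomly drop parameters...
        delta = delta * keep / d                           # ...and rescale the survivors by 1/density
        deltas.append(w * delta)                           # apply the per-layer `weight` from the YAML
    stacked = torch.stack(deltas)
    # TIES: elect a majority sign per parameter and keep only agreeing contributions
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    merged_delta = torch.where(agree, stacked, torch.zeros_like(stacked)).sum(dim=0)
    return base + merged_delta
```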

### Models Merged

The following models were included in the merge:

* ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
* ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.9938394142861436
      weight: 0.9423657003811549
  - layer_range: [0, 4]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 1.0
      weight: 0.028757039653447516
  - layer_range: [0, 4]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [4, 8]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.05673164623727889
      weight: 0.7797412499727034
  - layer_range: [4, 8]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.6914043863726181
      weight: -0.15297779543381299
  - layer_range: [4, 8]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [8, 12]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.9809469677871971
  - layer_range: [8, 12]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.7543507024870173
      weight: 0.3071004438847045
  - layer_range: [8, 12]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [12, 16]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 1.4015101870290536
  - layer_range: [12, 16]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.46763882063048323
      weight: -0.1196117853194659
  - layer_range: [12, 16]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [16, 20]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.21406332897580937
      weight: 1.0480493790359993
  - layer_range: [16, 20]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.34641930682013805
      weight: -0.47947217927833796
  - layer_range: [16, 20]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [20, 24]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.2994136103516895
  - layer_range: [20, 24]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.5203144588951365
      weight: 0.499493856558078
  - layer_range: [20, 24]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [24, 28]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 0.4787814900698036
      weight: 1.2570648350519475
  - layer_range: [24, 28]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.9634193213633135
      weight: 0.1895211433923194
  - layer_range: [24, 28]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [28, 32]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.4451581671531947
  - layer_range: [28, 32]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 1.0
      weight: -0.4842425881895236
  - layer_range: [28, 32]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [32, 36]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.3404347162926922
  - layer_range: [32, 36]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 1.0
      weight: 0.6029140105117367
  - layer_range: [32, 36]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
- sources:
  - layer_range: [36, 40]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct3_3817207417
    parameters:
      density: 1.0
      weight: 0.8991607837541874
  - layer_range: [36, 40]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_sft_all_1epoch_3352830596
    parameters:
      density: 0.37358877892316833
      weight: 0.9451608965972428
  - layer_range: [36, 40]
    model: ./workspace/evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
```
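
Once mergekit has produced the merged checkpoint, it can be loaded like any other Hugging Face causal language model. The sketch below assumes the merged weights are available at a local path named `./final_merge` (a placeholder, not part of the original configuration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path to the merged checkpoint written by mergekit.
model_path = "./final_merge"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

prompt = "Briefly explain what model merging is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```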