license: llama3

Compute for this merge was provided by KoboldAI.

Important: Because this model is based on Cat-8B-Instruct-V1, it inherits that model's stop sequence issue. Make sure to add </s> as a stop sequence in whatever backend or UI you are using.
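If your backend or UI has no stop sequence setting, the same effect can be approximated on the client side by trimming the output yourself. The snippet below is a minimal sketch, not part of this model or any particular backend: `trim_at_stop` is a hypothetical helper name, and it assumes you already have the generated text as a plain string.

```python
def trim_at_stop(text, stop_sequences=("</s>",)):
    """Cut `text` at the earliest occurrence of any stop sequence.

    Hypothetical helper for illustration; `</s>` is the stop sequence
    recommended above, everything else here is an assumption.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            # An empty string matches at index 0 and would erase everything.
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


if __name__ == "__main__":
    raw = "Sure, here is the answer.</s>user\nunrelated continuation"
    print(trim_at_stop(raw))
```

Trimming after the fact still spends tokens on the unwanted continuation, so setting the stop sequence in the backend remains the better option when it is available.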

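The following models were used in this recipe:

- TheSkullery/llama-3-cat-8b-instruct-v1
- elinas/Llama-3-15B-Instruct-zeroed
- elinas/Llama-3-15B-Instruct-zeroed-ft
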
Recipe used:

merge_method: passthrough
dtype: bfloat16
vocab_type: bpe
slices:
- sources:
  - layer_range: [0, 24]
    model: TheSkullery/llama-3-cat-8b-instruct-v1
- sources:
  - layer_range: [8, 24]
    model: TheSkullery/llama-3-cat-8b-instruct-v1
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [8, 24]
    model: TheSkullery/llama-3-cat-8b-instruct-v1
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [24, 32]
    model: TheSkullery/llama-3-cat-8b-instruct-v1
name: LLaMa-3-Cat-Instruct-Unhealed-15B
---

merge_method: task_arithmetic
dtype: bfloat16
vocab_type: bpe
base_model: elinas/Llama-3-15B-Instruct-zeroed
models:
  - model: elinas/Llama-3-15B-Instruct-zeroed-ft
    parameters:
      weight: 1.0
  - model: LLaMa-3-Cat-Instruct-Unhealed-15B
    parameters:
      weight: 1.0
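
To make the arithmetic in the recipe easier to follow, here is a small sketch that reproduces it on toy values. It is not mergekit code and does not load any weights. It assumes that `layer_range` is half-open (`[start, end)`) and that `task_arithmetic` computes `base + sum(weight * (model - base))` without weight normalization. The names `SLICES`, `stacked_layer_count`, and `task_arithmetic` are illustrative only.

```python
# Slices from the passthrough step. "zeroed" marks the slices whose
# o_proj and down_proj are scaled to 0.0 in the recipe.
SLICES = [
    {"layer_range": (0, 24), "zeroed": False},
    {"layer_range": (8, 24), "zeroed": True},
    {"layer_range": (8, 24), "zeroed": True},
    {"layer_range": (24, 32), "zeroed": False},
]


def stacked_layer_count(slices, only_zeroed=False):
    """Count layers in the stacked model, assuming half-open layer ranges."""
    total = 0
    for s in slices:
        if only_zeroed and not s["zeroed"]:
            continue
        start, end = s["layer_range"]
        total += end - start
    return total


def task_arithmetic(base, models):
    """Toy scalar version: base + sum(weight * (model - base)).

    `models` is a list of (value, weight) pairs. Assumes no normalization.
    """
    return base + sum(weight * (value - base) for value, weight in models)


if __name__ == "__main__":
    print("total layers:", stacked_layer_count(SLICES))
    print("zeroed layers:", stacked_layer_count(SLICES, only_zeroed=True))
    # One scalar "weight" standing in for a whole tensor.
    print("merged value:", task_arithmetic(1.0, [(1.5, 1.0), (0.8, 1.0)]))
```

Under these assumptions the first step stacks 24 + 16 + 16 + 8 = 64 layers, 32 of which are duplicated layers with `o_proj` and `down_proj` set to zero. Because those two projections produce the outputs that each Llama block adds to the residual stream, the duplicated layers start out contributing nothing. The second step then adds the difference between the fine-tuned and the zeroed 15B model on top of the unhealed Cat stack.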