tags:
  - merge
  - mergekit
  - lazymergekit
  - abideen/AlphaMonarch-dora
base_model:
  - abideen/AlphaMonarch-dora
license: cc-by-nc-4.0
language:
  - de
  - en

Spaetzle-v60-7b

This is a progressive merge (mostly dare-ties, but also slerp, among other methods) intended as a suitable compromise for English and German local tasks.

Spaetzle-v60-7b is a merge of the following models using LazyMergekit:

  • abideen/AlphaMonarch-dora

Benchmarks

Performance looks reasonable so far. For example, on EQ-Bench (v2_de) the model scores 65.08 (parseable: 171.0).

From the Occiglot Euro LLM Leaderboard:

| Model | DE | EN | ARC EN | TruthfulQA EN | Belebele EN | HellaSwag EN | MMLU EN | ARC DE | TruthfulQA DE | Belebele DE | HellaSwag DE | MMLU DE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mistral-community/Mixtral-8x22B-v0.1 | 66.81 | 72.87 | 70.56 | 52.29 | 93.89 | 70.41 | 77.17 | 63.9 | 29.31 | 92.44 | 77.9 | 70.49 |
| cstr/Spaetzle-v60-7b | 60.95 | 71.65 | 69.88 | 66.24 | 90.11 | 68.43 | 63.59 | 58 | 37.31 | 84.22 | 70.09 | 55.11 |
| VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct | 60.07 | 74.71 | 74.49 | 66.19 | 91.67 | 74.55 | 66.65 | 59.37 | 29.57 | 88.56 | 66.43 | 56.44 |
| occiglot/occiglot-7b-de-en-instruct | 56.65 | 61.7 | 60.41 | 49.38 | 81.22 | 60.43 | 57.06 | 54.49 | 31.09 | 77.22 | 68.84 | 51.59 |
| occiglot/occiglot-7b-de-en | 54.01 | 58.78 | 55.63 | 42.33 | 79.11 | 59.99 | 56.84 | 50.56 | 26.27 | 74.33 | 67.42 | 51.46 |
| meta-llama/Meta-Llama-3-8B | 53.89 | 63.08 | 58.02 | 43.87 | 86.44 | 61.75 | 65.3 | 46.45 | 24.24 | 81.11 | 62.48 | 55.18 |
| mistralai/Mistral-7B-Instruct-v0.2 | 53.52 | 67.63 | 63.74 | 66.81 | 82.44 | 65.96 | 59.2 | 48.59 | 37.69 | 68.89 | 62.24 | 50.2 |
| occiglot/occiglot-7b-eu5-instruct | 53.15 | 57.78 | 55.89 | 44.9 | 74.67 | 59.92 | 53.51 | 52.95 | 28.68 | 66.78 | 68.52 | 48.82 |
| clibrain/lince-mistral-7b-it-es | 52.98 | 62.43 | 62.46 | 43.32 | 82.44 | 63.86 | 60.06 | 49.44 | 28.17 | 75 | 61.64 | 50.64 |
| mistralai/Mistral-7B-v0.1 | 52.8 | 62.73 | 61.26 | 42.62 | 84.44 | 62.89 | 62.46 | 47.65 | 28.43 | 73.89 | 61.06 | 52.96 |
| LeoLM/leo-mistral-hessianai-7b | 51.78 | 56.11 | 52.22 | 42.92 | 73.67 | 57.86 | 53.88 | 47.48 | 25.25 | 69.11 | 68.21 | 48.83 |
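
The DE and EN columns appear to be the unweighted mean of the five per-task scores for each language. This is inferred from the numbers in the table above, not stated by the leaderboard here. A minimal sketch that reproduces both values for cstr/Spaetzle-v60-7b (the helper name `language_average` is made up for illustration):

```python
# Per-task scores for cstr/Spaetzle-v60-7b, copied from the table above.
scores = {
    "EN": {"ARC": 69.88, "TruthfulQA": 66.24, "Belebele": 90.11,
           "HellaSwag": 68.43, "MMLU": 63.59},
    "DE": {"ARC": 58.0, "TruthfulQA": 37.31, "Belebele": 84.22,
           "HellaSwag": 70.09, "MMLU": 55.11},
}


def language_average(task_scores):
    """Unweighted mean of the per-task scores, rounded to two decimals.

    Hypothetical helper: the leaderboard's own aggregation code is not
    shown in this README, so this is only an assumption that fits the data.
    """
    return round(sum(task_scores.values()) / len(task_scores), 2)


averages = {lang: language_average(tasks) for lang, tasks in scores.items()}
print(averages)
```

For this row the result matches the reported DE (60.95) and EN (71.65) values. Other rows can differ in the last digit, presumably because the leaderboard averages unrounded scores.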

For the int4-inc quantized version, from the Low-bit Quantized Open LLM Leaderboard:

| Type | Model | Average ⬆️ | ARC-c | ARC-e | Boolq | HellaSwag | Lambada | MMLU | Openbookqa | Piqa | Truthfulqa | Winogrande | #Params (B) | #Size (G) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 🍒 | Intel/SOLAR-10.7B-Instruct-v1.0-int4-inc | 68.49 | 60.49 | 82.66 | 88.29 | 68.29 | 73.36 | 62.43 | 35.6 | 80.74 | 56.06 | 76.95 | 10.57 | 5.98 |
| 🍒 | cstr/Spaetzle-v60-7b-int4-inc | 68.01 | 62.12 | 85.27 | 87.34 | 66.43 | 70.58 | 61.39 | 37 | 82.26 | 50.18 | 77.51 | 7.04 | 4.16 |
| 🔷 | TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF | 66.6 | 60.41 | 83.38 | 88.29 | 67.73 | 52.42 | 62.04 | 37.2 | 82.32 | 56.3 | 75.93 | 10.73 | 6.07 |
| 🔷 | cstr/Spaetzle-v60-7b-Q4_0-GGUF | 66.44 | 61.35 | 85.19 | 87.98 | 66.54 | 52.78 | 62.05 | 40.6 | 81.72 | 47 | 79.16 | 7.24 | 4.11 |
| 🍒 | Intel/Mistral-7B-Instruct-v0.2-int4-inc | 65.73 | 55.38 | 81.44 | 85.26 | 65.67 | 70.89 | 58.66 | 34.2 | 80.74 | 51.16 | 73.95 | 7.04 | 4.16 |
| 🍒 | Intel/Phi-3-mini-4k-instruct-int4-inc | 65.09 | 57.08 | 83.33 | 86.18 | 59.45 | 68.14 | 66.62 | 38.6 | 79.33 | 38.68 | 73.48 | 3.66 | 2.28 |
| 🔷 | TheBloke/Mistral-7B-Instruct-v0.2-GGUF | 63.52 | 53.5 | 77.9 | 85.44 | 66.9 | 50.11 | 58.45 | 38.8 | 77.58 | 53.12 | 73.4 | 7.24 | 4.11 |
| 🍒 | Intel/Meta-Llama-3-8B-Instruct-int4-inc | 62.93 | 51.88 | 81.1 | 83.21 | 57.09 | 71.32 | 62.41 | 35.2 | 78.62 | 36.35 | 72.14 | 7.2 | 5.4 |

Contamination check results (reference model: Mistral instruct 7b v0.1):

  • MMLU: result < 0.1, %: 0.19
  • TruthfulQA: result < 0.1, %: 0.34

🧩 Configuration

models:
  - model: cstr/Spaetzle-v58-7b
    # no parameters necessary for base model
  - model: abideen/AlphaMonarch-dora
    parameters:
      density: 0.60
      weight: 0.30
merge_method: dare_ties
base_model: cstr/Spaetzle-v58-7b
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base
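
To give a rough feel for what `density` and `weight` do in a dare_ties merge, here is a simplified sketch of the DARE "drop and rescale" step on flat lists of floats. It is not mergekit's implementation: the function name `dare_merge` is hypothetical, real merges operate on weight tensors, and details such as TIES sign election (trivial with a single non-base model) and any weight normalization are left out.

```python
import random


def dare_merge(base, finetuned, density, weight, seed=0):
    """Simplified DARE-style merge of one fine-tuned tensor into a base tensor.

    Illustrative only (assumption, not mergekit code): tensors are flat
    lists of floats, and TIES sign election is omitted.
    """
    rng = random.Random(seed)
    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b                  # task-vector entry (fine-tuned minus base)
        keep = rng.random() < density  # keep each entry with probability `density`
        # Rescale survivors by 1/density so the expected delta is unchanged.
        sparse_delta = delta / density if keep else 0.0
        merged.append(b + weight * sparse_delta)
    return merged


# Toy tensors with the density and weight from the configuration above.
base = [0.0] * 8
finetuned = [1.0] * 8
print(dare_merge(base, finetuned, density=0.60, weight=0.30))
```

Each entry either stays at the base value or moves toward the fine-tuned value by `weight / density` times the delta. With `density: 0.60`, roughly 40% of the deltas from abideen/AlphaMonarch-dora are dropped on average.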

💻 Usage

# Notebook cell syntax; in a regular shell, run the command without the leading "!".
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "cstr/Spaetzle-v60-7b"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and let accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])