
# llama3-8b-spaetzle-v33

This is a dare_ties merge (see the 🧩 Configuration section below) of the following models:

- cstr/llama3-8b-spaetzle-v31
- cstr/llama3-8b-spaetzle-v28
- cstr/llama3-8b-spaetzle-v26

on the base model cstr/llama3-8b-spaetzle-v20. It aims for a compromise between usefulness on German and English tasks.

For GGUF quants, see cstr/llama3-8b-spaetzle-v33-GGUF.
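The GGUF quants can be run, for example, with llama-cpp-python. A minimal sketch, assuming the quant filenames follow the usual `Q4_K_M`-style naming (check the GGUF repo for the actual file names):

```python
# Minimal sketch: load a GGUF quant via llama-cpp-python
# (pip install llama-cpp-python huggingface-hub).
# The filename glob is an assumption; check the files in
# cstr/llama3-8b-spaetzle-v33-GGUF for the actual quant names.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="cstr/llama3-8b-spaetzle-v33-GGUF",
    filename="*Q4_K_M.gguf",  # glob matched against the repo's files
    n_ctx=8192,               # Llama 3's native context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Was ist ein großes Sprachmodell?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```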

## Benchmarks

As q4_k_m quants (from an older build, without the pre-tokenizer fix), it scores 66.59 on EQ-Bench v2_de (171/171 parseable) and 73.17 on EQ-Bench v2 (English; 171/171 parseable).

For the int4-inc quants:

| Benchmark | Score |
|---|---|
| Average | 66.13 |
| ARC-c | 59.81 |
| ARC-e | 85.27 |
| Boolq | 84.10 |
| HellaSwag | 62.47 |
| Lambada | 73.28 |
| MMLU | 64.11 |
| OpenbookQA | 37.20 |
| Piqa | 80.30 |
| TruthfulQA | 50.21 |
| Winogrande | 73.72 |

### Nous

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| mlabonne/Daredevil-8B | 55.87 | 44.13 | 73.52 | 59.05 | 46.77 |
| cstr/llama3-8b-spaetzle-v33 | 55.26 | 42.61 | 73.9 | 59.28 | 45.25 |
| mlabonne/Daredevil-8B-abliterated | 55.06 | 43.29 | 73.33 | 57.47 | 46.17 |
| NousResearch/Hermes-2-Theta-Llama-3-8B | 54.28 | 43.9 | 72.62 | 56.36 | 44.23 |
| openchat/openchat-3.6-8b-20240522 | 53.49 | 44.03 | 73.67 | 49.78 | 46.48 |
| mlabonne/Llama-3-8B-Instruct-abliterated-dpomix | 52.26 | 41.6 | 69.95 | 54.22 | 43.26 |
| meta-llama/Meta-Llama-3-8B-Instruct | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
| failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 | 51.21 | 40.23 | 69.5 | 52.44 | 42.69 |
| mlabonne/OrpoLlama-3-8B | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
| meta-llama/Meta-Llama-3-8B | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |

## 🧩 Configuration

```yaml
models:
  - model: cstr/llama3-8b-spaetzle-v20
    # no parameters necessary for base model
  - model: cstr/llama3-8b-spaetzle-v31
    parameters:
      density: 0.65
      weight: 0.25
  - model: cstr/llama3-8b-spaetzle-v28
    parameters:
      density: 0.65
      weight: 0.25
  - model: cstr/llama3-8b-spaetzle-v26
    parameters:
      density: 0.65
      weight: 0.15
merge_method: dare_ties
base_model: cstr/llama3-8b-spaetzle-v20
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base
```
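
In dare_ties, `density` is the fraction of each fine-tune's delta parameters that is kept (the rest are randomly dropped and the survivors rescaled, per DARE) before TIES-style sign election merges them into the base model; `weight` scales each model's contribution. The merge can be reproduced with mergekit. A minimal sketch using mergekit's Python API, assuming the YAML above is saved as `config.yaml` (API details can vary between mergekit versions):

```python
# Minimal sketch: run the merge with mergekit (pip install mergekit).
# Assumes the YAML above is saved as config.yaml; the output path is
# arbitrary, and MergeOptions fields may differ across versions.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./llama3-8b-spaetzle-v33",
    options=MergeOptions(
        cuda=False,           # set True to merge on GPU
        copy_tokenizer=True,  # write the base tokenizer into the output
        lazy_unpickle=True,   # lower peak memory while reading shards
    ),
)
```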

πŸ’» Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "cstr/llama3-8b-spaetzle-v33"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build a Llama 3 chat prompt from the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in fp16, spread across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
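
Depending on the tokenizer configuration the merge ships with, generation may run past Llama 3's end-of-turn token. A possible workaround, assuming the tokenizer defines the standard `<|eot_id|>` token (worth verifying for a merge), is to pass both terminators explicitly:

```python
# Optional: also stop at Llama 3's end-of-turn token <|eot_id|>.
# Assumes the merged tokenizer defines this token, which is standard
# for Llama 3 chat templates.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```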