Abliterated version of mlabonne/Daredevil-8B using failspy's notebook.
It is based on the technique described in the blog post "Refusal in LLMs is mediated by a single direction".
Thanks to Andy Arditi, Oscar Balcells Obeso, Aaquib111, Wes Gurnee, Neel Nanda, and failspy.
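As a rough illustration of the idea behind that technique, the sketch below estimates a "refusal direction" as the difference between mean activations on harmful and harmless prompts, then orthogonalizes a weight matrix against it. This is a toy NumPy example on random data, not failspy's notebook: the function names, the shapes, and the synthetic activations are all assumptions made for illustration.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations (shape: [d_model])."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def orthogonalize(weight, direction):
    """Remove the component along `direction` from everything `weight` writes.

    `weight` has shape [d_model, d_in] and writes into the residual stream.
    W' = W - r r^T W, so r^T (W' x) = 0 for every input x.
    """
    return weight - np.outer(direction, direction @ weight)


# Toy stand-ins for residual-stream activations (assumption: real activations
# would be collected from the model on harmful and harmless prompts).
rng = np.random.default_rng(0)
d_model, d_in = 8, 4
harmless = rng.normal(size=(32, d_model))
harmful = rng.normal(size=(32, d_model)) + 2.0 * np.eye(d_model)[0]

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_abl = orthogonalize(W, r)

# After orthogonalization, the matrix can no longer write along r.
print(np.abs(r @ W_abl).max())
```

In a real model the same projection would presumably be applied to every matrix that writes into the residual stream, which is why the change can be baked into the weights rather than applied at inference time.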
This is an uncensored model. You can use it for any application that doesn't require alignment, like role-playing.
Tested on LM Studio using the "Llama 3" preset.
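For reference, the "Llama 3" preset corresponds to the Llama 3 Instruct prompt format. The helper below is a minimal sketch of that format, assuming this model keeps the standard Llama 3 special tokens; `format_llama3_prompt` is a hypothetical name, and in practice the tokenizer's own chat template should be preferred.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the standard Llama 3 Instruct layout (assumed here)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


example = format_llama3_prompt(
    [{"role": "user", "content": "What is a large language model?"}]
)
print(example)
```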
Daredevil-8B-abliterated is the second best-performing 8B model on the Open LLM Leaderboard in terms of MMLU score (as of 27 May 2024).
Evaluation performed using LLM AutoEval. See the entire leaderboard here.
Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
---|---|---|---|---|---|
mlabonne/Daredevil-8B | 55.87 | 44.13 | 73.52 | 59.05 | 46.77 |
mlabonne/Daredevil-8B-abliterated | 55.06 | 43.29 | 73.33 | 57.47 | 46.17 |
mlabonne/Llama-3-8B-Instruct-abliterated-dpomix | 52.26 | 41.6 | 69.95 | 54.22 | 43.26 |
meta-llama/Meta-Llama-3-8B-Instruct | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 | 51.21 | 40.23 | 69.5 | 52.44 | 42.69 |
mlabonne/OrpoLlama-3-8B | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
meta-llama/Meta-Llama-3-8B | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |
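The Average column appears to be the arithmetic mean of the four benchmark scores. The short check below recomputes it for the two Daredevil rows, using only figures copied from the table; small differences in the last digit are expected because the published scores are already rounded.

```python
# Scores copied from the table: AGIEval, GPT4All, TruthfulQA, Bigbench.
scores = {
    "mlabonne/Daredevil-8B": [44.13, 73.52, 59.05, 46.77],
    "mlabonne/Daredevil-8B-abliterated": [43.29, 73.33, 57.47, 46.17],
}

# Arithmetic mean of the four benchmarks for each model.
averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.4f}")

# Gap between the base model and its abliterated version.
drop = averages["mlabonne/Daredevil-8B"] - averages["mlabonne/Daredevil-8B-abliterated"]
print(f"difference: {drop:.4f}")
```

By this measure, abliteration costs the model less than one point on average across these four benchmarks.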
# Notebook cell: install or upgrade the dependencies
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/Daredevil-8B-abliterated"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the conversation as a prompt string using the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and place it on the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion; by default generated_text holds the prompt followed by the reply
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])