yleo committed on
Commit
ec297b1
1 Parent(s): 6cd87f6

Create README.md

Files changed (1)
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ---
+ license: cc-by-nc-4.0
+ base_model: mlabonne/Monarch-7B
+ datasets:
+ - yleo/emerton_dpo_pairs_judge
+ tags:
+ - dpo
+ ---
+ ---
+
+ # 🦜 EmertonMonarch-7B
+
+ EmertonMonarch-7B is a DPO fine-tune of [mlabonne/Monarch-7B](https://huggingface.co/mlabonne/Monarch-7B) using the [yleo/emerton_dpo_pairs_judge](https://huggingface.co/datasets/yleo/emerton_dpo_pairs_judge) preference dataset. The dataset was created from [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs) by replacing the GPT-3.5 answers with GPT-4 Turbo answers. In each pair, the GPT-4 Turbo answer is used as the chosen response and the GPT-4 answer as the rejected one.
+
+ ## 🔍 Applications
+
+ This model uses an 8k-token context window. It is compatible with several chat templates, such as ChatML and Llama's chat template.
+
+ ## 🏆 Evaluation
+
+ ### Open LLM Leaderboard
+
+ To come...
+
+ ## 💻 Usage
+
+ ```python
+ # In a notebook, install the dependencies first: !pip install -qU transformers accelerate
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+ model = "yleo/EmertonMonarch-7B"
+ messages = [{"role": "user", "content": "How to improve LLM fine-tuning?"}]
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)  # format the chat as a prompt string
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,  # half precision to reduce memory use
+     device_map="auto",  # let accelerate place the weights on the available devices
+ )
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```
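
The Applications section of the README mentions ChatML compatibility. As a rough illustration of what a ChatML-formatted prompt looks like, here is a minimal sketch. The helper `to_chatml` is hypothetical and not part of this commit or of `transformers`. Whether the tokenizer's built-in chat template produces exactly this layout is not confirmed by the README, so `tokenizer.apply_chat_template` remains the safer choice in practice.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render chat messages in the ChatML layout (hypothetical helper for illustration)."""
    # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "How to improve LLM fine-tuning?"}]
prompt = to_chatml(messages)
print(prompt)
```

The resulting string could be passed to the text-generation pipeline in place of the output of `apply_chat_template`, assuming ChatML is the template one wants to use.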