RDson committed 41951bb (parent bb3d2f4): Create README.md

---
tags:
- moe
- llama
- '3'
- llama 3
- 4x8b
---
# GGUF files of [Llama-3-Peach-Instruct-4x8B-MoE](https://huggingface.co/RDson/Llama-3-Peach-Instruct-4x8B-MoE)
<img src="https://i.imgur.com/MlnauLb.jpeg" width="640"/>

# Llama-3-Peach-Instruct-4x8B-MoE

GGUF files are available here: [RDson/Llama-3-Peach-Instruct-4x8B-MoE-GGUF](https://huggingface.co/RDson/Llama-3-Peach-Instruct-4x8B-MoE-GGUF).

This is an experimental MoE created using Mergekit from
* [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
* [Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R](https://huggingface.co/Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R)
* [NousResearch/Hermes-2-Theta-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B)
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder)

Mergekit YAML file:
```yaml
base_model: Meta-Llama-3-8B-Instruct
experts:
  - source_model: Meta-Llama-3-8B-Instruct
    positive_prompts:
      - "explain"
      - "chat"
      - "assistant"
      - "think"
      - "roleplay"
      - "versatile"
      - "helpful"
      - "factual"
      - "integrated"
      - "adaptive"
      - "comprehensive"
      - "balanced"
    negative_prompts:
      - "specialized"
      - "narrow"
      - "focused"
      - "limited"
      - "specific"
  - source_model: Llama-3-8B-Instruct-Coder
    positive_prompts:
      - "python"
      - "math"
      - "solve"
      - "code"
      - "programming"
      - "javascript"
      - "algorithm"
      - "factual"
    negative_prompts:
      - "sorry"
      - "cannot"
      - "concise"
      - "imaginative"
      - "creative"
  - source_model: SFR-Iterative-DPO-LLaMA-3-8B-R
    positive_prompts:
      - "AI"
      - "instructive"
      - "chat"
      - "assistant"
      - "clear"
      - "directive"
      - "helpful"
      - "informative"
  - source_model: Hermes-2-Theta-Llama-3-8B
    positive_prompts:
      - "chat"
      - "assistant"
      - "analytical"
      - "accurate"
      - "code"
      - "logical"
      - "knowledgeable"
      - "precise"
      - "calculate"
      - "compute"
      - "solve"
      - "work"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
      - "tell me"
      - "assistant"
      - "factual"
    negative_prompts:
      - "abstract"
      - "artistic"
      - "emotional"
      - "mistake"
      - "inaccurate"
gate_mode: hidden
dtype: float16
```
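With `gate_mode: hidden`, Mergekit initializes each expert's router weights from hidden-state representations of the positive/negative prompts above; at inference, the router scores every token and blends the output of the best-scoring experts. A minimal sketch of that top-k softmax mixing idea (illustrative only — the function names are invented and this is not Mergekit's or llama.cpp's actual code):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(logits, top_k=2):
    """Keep the top_k highest-scoring experts and renormalize
    their weights so they sum to 1 (standard sparse-MoE routing)."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = order[:top_k]
    weights = softmax([logits[i] for i in keep])
    return dict(zip(keep, weights))

# Four router logits, one per expert in the config above (values invented).
mix = route([1.2, -0.3, 0.8, 0.1], top_k=2)
print(mix)  # experts 0 and 2 carry the mixed output for this token
```

The token's final hidden state is then the weighted sum of the selected experts' outputs; the other experts are skipped entirely, which is why a 4x8B MoE runs closer to a ~2x8B model per token.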
Some inspiration for the Mergekit YAML file came from [LoneStriker/Umbra-MoE-4x10.7-2.4bpw-h6-exl2](https://huggingface.co/LoneStriker/Umbra-MoE-4x10.7-2.4bpw-h6-exl2).