Azazelle committed
Commit c57ee5f · verified · 1 Parent(s): 5268ffb

Update README.md

Files changed (1)
  1. README.md +65 -20
README.md CHANGED
@@ -5,41 +5,86 @@ library_name: transformers
  tags:
  - mergekit
  - merge
-
  ---
- # GP26jJg

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method

- This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method using [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) as a base.

- ### Models Merged

- The following models were included in the merge:
- * output/hq_rp

  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
  base_model: NousResearch/Meta-Llama-3-8B-Instruct
  dtype: float32
- merge_method: task_arithmetic
- parameters:
-   normalize: 0.0
- slices:
- - sources:
-   - layer_range: [0, 32]
-     model: output/hq_rp
      parameters:
        weight:
-       - filter: mlp
-         value: 1.25
-       - value: 1.1
-   - layer_range: [0, 32]
-     model: NousResearch/Meta-Llama-3-8B-Instruct
  ```

  tags:
  - mergekit
  - merge
+ - llama
+ - conversational
+ license: llama3
  ---
+ # L3-Hecate-8B-v1.0
+
+ ![Hecate](https://huggingface.co/Azazelle/L3-Hecate-8B-v1.0/resolve/main/IhBchsAoR4ao0D2C2AEKuw.jpg)
+
+ ## About:

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

+ **Recommended Samplers:**

+ ```
+ Temperature - 1.0
+ TFS - 0.75
+ Smoothing Factor - 0.3
+ Smoothing Curve - 1.1
+ Repetition Penalty - 1.08
+ ```
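TFS and the smoothing factor/curve above are frontend samplers (e.g. text-generation-webui or SillyTavern) rather than standard `transformers` arguments. A minimal sketch, assuming plain 🤗 Transformers inference and applying only the settings that map directly onto `generate()`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Azazelle/L3-Hecate-8B-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Introduce yourself in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.0,          # recommended temperature above
    repetition_penalty=1.08,  # recommended repetition penalty above
    max_new_tokens=256,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```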

+ ### Merge Method

+ This model was merged using a series of model stock and LoRA merges, followed by ExPO. It uses a mix of smart and roleplay-centered models to improve performance.
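A rough sketch of the ExPO step, assuming the usual model-extrapolation formulation, which the final `task_arithmetic` stage below reproduces by using weights greater than 1:

$$
\theta_{\text{ExPO}} = \theta_{\text{base}} + w\,\bigl(\theta_{\text{merged}} - \theta_{\text{base}}\bigr), \qquad w > 1
$$

Here θ_merged is the hq_rp model-stock merge, with w = 1.25 on MLP tensors and w = 1.1 on everything else, matching the weights in the configuration below.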
 
  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
+ ---
+ # Mopey RP Mix
+ models:
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/Llama-3-Sunfall-8b-lora
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/Llama-3-8B-Abomination-LORA
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/llama3-8b-hikikomori-v0.4
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/Llama-3-Instruct-LiPPA-LoRA-8B
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/BlueMoon_Llama3
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/Llama3_RP_ORPO_LoRA
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule+Azazelle/Llama-3-LongStory-LORA
+ merge_method: model_stock
+ base_model: failspy/Llama-3-8B-Instruct-MopeyMule
+ dtype: float32
+ vocab_type: bpe
+ name: mopey_rp
+
+ ---
+ models:
+   - model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
+   - model: Sao10K/L3-8B-Tamamo-v1
+   - model: Sao10K/L3-8B-Niitama-v1
+   - model: cycy233/L3-base-v2-e3.0
+   - model: Azazelle/L3-Hecate-8B-v1.0
+   - model: R136a1/Bungo-L3-8B
+   - model: Jellywibble/meseca-20062024-c1
+   - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
+   - model: Jellywibble/lora_120k_pref_data_ep2
+   - model: Nitral-AI/Hathor_Stable-v0.2-L3-8B
+   - model: mopey_rp
+ merge_method: model_stock
  base_model: NousResearch/Meta-Llama-3-8B-Instruct
  dtype: float32
+ vocab_type: bpe
+ name: hq_rp
+
+ ---
+ # ExPO
+ models:
+   - model: hq_rp
      parameters:
        weight:
+         - filter: mlp
+           value: 1.25
+         - value: 1.1
+ merge_method: task_arithmetic
+ base_model: NousResearch/Meta-Llama-3-8B-Instruct
+ parameters:
+   normalize: false
+ dtype: float32
+ vocab_type: bpe
  ```
+
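The three YAML documents above are separate mergekit stages (mopey_rp, then hq_rp, then the ExPO extrapolation), with each named output feeding the next. A sketch of running a single stage through mergekit's Python entry point, assuming the API as shown in mergekit's README; the file and output paths here are placeholders:

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "./expo_stage.yml"       # one document from the config above
OUTPUT_PATH = "./L3-Hecate-8B-v1.0"   # where the merged weights are written

# Parse one YAML document into a mergekit merge configuration.
with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Execute the merge; GPU is used if available.
run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```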