Nohobby committed
Commit df698b7 · verified · 1 Parent(s): 3015eda

Update README.md

Files changed (1):
  1. README.md +99 -11
README.md CHANGED
@@ -1,29 +1,117 @@
 ---
 base_model:
 - unsloth/Mistral-Small-Instruct-2409
 library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---
- # SchisandraA

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

 ## Merge Details
- ### Merge Method

- This model was merged using the della_linear merge method using Schisandra as a base.

- ### Models Merged

- The following models were included in the merge:
- * [unsloth/Mistral-Small-Instruct-2409](https://huggingface.co/unsloth/Mistral-Small-Instruct-2409)

- ### Configuration

- The following YAML configuration was used to produce this model:

 ```yaml
 dtype: bfloat16
@@ -62,4 +150,4 @@ models:
 value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
 - value: 1

- ```
 
 ---
 base_model:
 - unsloth/Mistral-Small-Instruct-2409
+ - TheDrummer/Cydonia-22B-v1.2
+ - Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
+ - anthracite-org/magnum-v4-22b
+ - ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
+ - spow12/ChatWaifu_v2.0_22B
+ - rAIfle/Acolyte-22B
+ - Envoid/Mistral-Small-NovusKyver
+ - InferenceIllusionist/SorcererLM-22B
 library_name: transformers
 tags:
 - mergekit
 - merge
+ license: other
+ language:
+ - en
 ---
+ ***
+ ## Schisandra
+
+ Many thanks to the authors of the models used!
+
+ [RPMax v1.1](https://huggingface.co/ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1) | [Pantheon-RP](https://huggingface.co/Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small) | [Cydonia v1.2](https://huggingface.co/TheDrummer/Cydonia-22B-v1.2) | [Magnum V4](https://huggingface.co/anthracite-org/magnum-v4-22b) | [ChatWaifu v2.0](https://huggingface.co/spow12/ChatWaifu_v2.0_22B) | [SorcererLM](https://huggingface.co/InferenceIllusionist/SorcererLM-22B) | [Acolyte](https://huggingface.co/rAIfle/Acolyte-22B) | [NovusKyver](https://huggingface.co/Envoid/Mistral-Small-NovusKyver)
+ ***
+
+ ### Overview
+
+ Main uses: RP, Storywriting
+
+ Merge of 8 Mistral Small finetunes in total, which were then merged back into the original model to make it less stupid. Worked somehow? Definitely smarter than my previous MS merge and maybe some finetunes. It seems to adhere closely to the writing style of the previous output, so you'll need either a good character card or an existing chat to get better replies.
+
+ ***
+
+ ### Quants
+
+ [Static](https://huggingface.co/mradermacher/MS-Schisandra-22B-vB-GGUF)
+
+ [Imatrix](https://huggingface.co/mradermacher/MS-Schisandra-22B-vB-i1-GGUF)
+
+ ***
+
+ ### Settings
+
+ Prompt format: Mistral-V3 Tekken
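For reference, the V3-Tekken template wraps each turn in `[INST]` tags with no spaces around them, roughly like the sketch below (from memory; where the system prompt goes depends on the frontend preset, and the model's bundled chat template is the authoritative source):

```
<s>[INST]First user message[/INST]Model reply</s>[INST]Next user message[/INST]
```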
+
+ Samplers: [These](https://qu.ax/OusTx.json) or [These](https://huggingface.co/ToastyPigeon/ST-Presets-Mistral-Small/resolve/main/ST-sampling-preset-Mistral-Small.json?download=true)
+
+ ***

 ## Merge Details
+ ### Merging steps

+ ## QCmix

+ ```yaml
+ base_model: InferenceIllusionist/SorcererLM-22B
+ parameters:
+   int8_mask: true
+   rescale: true
+   normalize: false
+ dtype: bfloat16
+ tokenizer_source: base
+ merge_method: della
+ models:
+   - model: Envoid/Mistral-Small-NovusKyver
+     parameters:
+       density: [0.35, 0.65, 0.5, 0.65, 0.35]
+       epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
+       lambda: 0.85
+       weight: [-0.01891, 0.01554, -0.01325, 0.01791, -0.01458]
+   - model: rAIfle/Acolyte-22B
+     parameters:
+       density: [0.6, 0.4, 0.5, 0.4, 0.6]
+       epsilon: [0.15, 0.15, 0.25, 0.15, 0.15]
+       lambda: 0.85
+       weight: [0.01768, -0.01675, 0.01285, -0.01696, 0.01421]
+ ```
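A config like the one above is normally run through mergekit's CLI; a minimal sketch, assuming the YAML is saved as `qcmix.yaml` (placeholder filename) with the merged weights written to a local `./QCmix` directory:

```
pip install mergekit
# run the merge described by the config; --cuda does the tensor math on GPU
mergekit-yaml qcmix.yaml ./QCmix --cuda
```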

+ ## Schisandra-vA

+ ```yaml
+ merge_method: della_linear
+ dtype: bfloat16
+ parameters:
+   normalize: true
+   int8_mask: true
+ tokenizer_source: union
+ base_model: TheDrummer/Cydonia-22B-v1.2
+ models:
+   - model: ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: spow12/ChatWaifu_v2.0_22B
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: anthracite-org/magnum-v4-22b
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: QCmix
+     parameters:
+       density: 0.55
+       weight: 1
+ ```
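Note that `model: QCmix` in this step presumably points at the local output directory of the QCmix merge above rather than a Hub repo, so that intermediate merge feeds into Schisandra-vA, which is then used in the final Schisandra step below.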

+ ## Schisandra

 ```yaml
 dtype: bfloat16
@@ -62,4 +150,4 @@ models:
 value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
 - value: 1

+ ```