ToastyPigeon committed 875b525 (parent: 540517d)
Update README.md

README.md CHANGED

(Removed in this commit: the earlier version of this card, which carried the same boilerplate sections but whose config pointed at local merge outputs such as `E:\ModelMerge\merges\Psycet-V2\Psycet-Reverse`.)

tags:
- merge
---

# Psyonic-Cetacean-20B-V2

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method on two stack-merged models.

The first is [jebcarter/psyonic-cetacean-20B](https://huggingface.co/jebcarter/psyonic-cetacean-20B) (Orca first, reproduced here so I didn't have to download that model on top of its components). The second is the same recipe with the models reversed.

Since [jebcarter](https://huggingface.co/jebcarter) suggested this recipe, credit goes to him.
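
As a rough illustration of how the four slices in each stack interleave the two donor models (this sketch is not part of the recipe itself, and it assumes mergekit's `layer_range` bounds are end-exclusive):

```python
# Illustration only: how the "Psycet" stack in the config below interleaves its donors.
# Assumes layer_range is end-exclusive, i.e. [0, 16] means layers 0-15.
psycet = [
    ("FlatOrca2",                       (0, 16)),   # Orca-derived intermediate
    ("KoboldAI/LLaMA2-13B-Psyfighter2", (8, 24)),
    ("FlatOrca2",                       (17, 32)),
    ("KoboldAI/LLaMA2-13B-Psyfighter2", (25, 40)),
]
# "Psycet-Reverse" uses the same ranges with the two donors swapped in each slice.
depth = sum(end - start for _, (start, end) in psycet)
print(depth)  # 62, assuming end-exclusive ranges: the usual depth of these Llama-2 "20B" stacks
```

The final linear step then averages the two 62-layer stacks, tensor by tensor, at weight 0.5 each.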

### Models Merged

The following models were included in the merge:
* microsoft/Orca-2-13b
* KoboldAI/LLaMA2-13B-Psyfighter2

### Configuration

The following YAML configuration was used to produce this model:

```yaml
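# Stage 1: single-model task_arithmetic pass over Orca-2-13b -> intermediate "FlatOrca2"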
models:
  - model: microsoft/Orca-2-13b
    parameters:
      weight: 1.0
merge_method: task_arithmetic
base_model: TheBloke/Llama-2-13B-fp16
dtype: float16
name: FlatOrca2
---
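# Stage 2: passthrough stack with the Orca-derived layers first -> "Psycet"
# (this reproduces jebcarter/psyonic-cetacean-20B)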
slices:
- sources:
  - model: FlatOrca2
    layer_range: [0, 16]
- sources:
  - model: KoboldAI/LLaMA2-13B-Psyfighter2
    layer_range: [8, 24]
- sources:
  - model: FlatOrca2
    layer_range: [17, 32]
- sources:
  - model: KoboldAI/LLaMA2-13B-Psyfighter2
    layer_range: [25, 40]
merge_method: passthrough
dtype: float16
name: Psycet
---
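# Stage 3: the same stack with the two donors swapped -> "Psycet-Reverse"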
slices:
- sources:
  - model: KoboldAI/LLaMA2-13B-Psyfighter2
    layer_range: [0, 16]
- sources:
  - model: FlatOrca2
    layer_range: [8, 24]
- sources:
  - model: KoboldAI/LLaMA2-13B-Psyfighter2
    layer_range: [17, 32]
- sources:
  - model: FlatOrca2
    layer_range: [25, 40]
merge_method: passthrough
dtype: float16
name: Psycet-Reverse
---
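# Stage 4: linear merge of the two stacks at equal weight -> the final model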
models:
  - model: Psycet
    parameters:
      weight: 0.5
  - model: Psycet-Reverse
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```