Update README.md
README.md CHANGED
@@ -1,41 +1,26 @@
Removed:

---
base_model:
library_name: transformers
tags:
- mergekit
- merge
---

#

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Arcee Fusion](https://arcee.ai) merge method using sometimesanotion/Slerp-Lamarckvevergence as a base.

### Models Merged

The following models were included in the merge:
* sometimesanotion/Chocolatine-Fusion-Qwenvergence

### Configuration

The following YAML configuration was used to produce this model:

```yaml
name: Lamarck-14B-v0.7-Fusion
merge_method: arcee_fusion
base_model: sometimesanotion/Slerp-Lamarckvevergence
tokenizer_source: base
parameters:
  int8_mask: true
  normalize: true
  rescale: false
dtype: bfloat16
out_dtype: bfloat16
models:
  - model: sometimesanotion/Chocolatine-Fusion-Qwenvergence
```
Added:

---
base_model:
- sometimesanotion/Lamarck-14B-v0.7
- sometimesanotion/Qwenvergence-14B-v12-Prose-DS
- jpacifico/Chocolatine-2-14B-Instruct-v2.0.3
- suayptalha/Lamarckvergence-14B
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
---

# EXPERIMENTAL:
So what's this new arcee_fusion merge method, and what can we do with it? This model aims to find out: it is a multi-stage merge in which 3 of the 4 steps are fusions (example configs for a fusion stage and the SLERP stage are sketched after the list):
* A fusion of [Lamarck-14B-v0.7](http://huggingface.co/sometimesanotion/Lamarck-14B-v0.7) and @suayptalha's [Lamarckvergence SLERP merge](http://huggingface.co/suayptalha/Lamarckvergence-14B) of Lamarck-14B-v0.7 and [Qwenvergence-14B-v12-Prose-DS](http://huggingface.co/sometimesanotion/Qwenvergence-14B-v12-Prose-DS).
* A SLERP of Lamarck-14B-v0.7-Fusionvergence with Qwenvergence-14B-v12-Prose-DS, the latter emphasized in later layers.
* A fusion of @jpacifico's [Chocolatine-2-14B-Instruct-v2.0.3](http://huggingface.co/jpacifico/Chocolatine-2-14B-Instruct-v2.0.3), itself a finetune of Arcee's [Virtuoso-Small-v2](https://huggingface.co/arcee-ai/Virtuoso-Small-v2) with Lamarck-14B-v0.7 and Qwenvergence-14B-v12-Prose-DS, fusion-merged with - you guessed it - Qwenvergence-14B-v12-Prose-DS.
* A fusion of the previous two.
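
To make the fusion steps concrete, here is a minimal sketch of what the first stage might look like as a mergekit config. It mirrors the shape of the arcee_fusion configuration shown in the removed card above; the output name and parameter values are assumptions for illustration, not the released recipe.

```yaml
# Hypothetical sketch of the first stage: Lamarck-14B-v0.7 fused with
# suayptalha's Lamarckvergence-14B. Output name and parameters are assumptions.
name: Lamarck-14B-v0.7-Fusionvergence
merge_method: arcee_fusion
base_model: sometimesanotion/Lamarck-14B-v0.7
tokenizer_source: base
parameters:
  int8_mask: true
  normalize: true
  rescale: false
dtype: bfloat16
out_dtype: bfloat16
models:
  - model: suayptalha/Lamarckvergence-14B
```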
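The SLERP stage would use mergekit's slerp method instead. Another hedged sketch: the output name is a guess, and the rising t gradient is only one way to express "the latter emphasized in later layers".

```yaml
# Hypothetical sketch of the SLERP stage. t interpolates from the base model
# (t=0) toward Qwenvergence-14B-v12-Prose-DS (t=1), so a rising gradient
# emphasizes Prose-DS in later layers. Names and values are assumptions.
name: Slerp-Lamarckvevergence
merge_method: slerp
base_model: sometimesanotion/Lamarck-14B-v0.7-Fusionvergence
tokenizer_source: base
parameters:
  t:
    - value: [0.2, 0.3, 0.5, 0.7, 0.8]
dtype: bfloat16
models:
  - model: sometimesanotion/Lamarck-14B-v0.7-Fusionvergence
  - model: sometimesanotion/Qwenvergence-14B-v12-Prose-DS
```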

I've seen strong prose from this model, which is natural considering its re-emphasis of Qwenvergence-14B-v12-Prose-DS. A full evaluation will be queued shortly.
This is actually a lot simpler than a mainline Lamarck release, and where it fits in the effort towards a Lamarck v0.8 depends greatly on evaluation and feedback.