sometimesanotion
commited on
Update README.md
Browse files
README.md
CHANGED
@@ -22,9 +22,9 @@ pipeline_tag: text-generation
|
|
22 |
|
23 |
Lamarck 14B v0.6: A generalist merge focused on multi-step reasoning, prose, multi-language ability, and code. It is based on components that have punched above their weight in the 14 billion parameter class.
|
24 |
|
25 |
-
Previous releases were based on a SLERP merge of model_stock->della branches focused on reasoning and prose. The prose branch got surprisingly good at reasoning, and the reasoning branch being the base for IFEVAL became an all-around generalist. Some of you have already downloaded the reasoning branch, released as [sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3).
|
26 |
|
27 |
-
Lamarck 0.6 aims to build upon Vimarckoso v3's all-around strength with strong buffs to prose and translation quality, and strong reasoning for its class. Updates to come as leaderboards become available to evaluate it in-depth.
|
28 |
|
29 |
## Merge Details
|
30 |
|
@@ -32,14 +32,14 @@ This model was made in two branches: a della_linear merge, and a sequence of mo
|
|
32 |
|
33 |
### Models Merged
|
34 |
|
35 |
-
The model_stock, breadcrumbs, and della_linear all use the following models:
|
36 |
|
37 |
-
[sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3)
|
38 |
-
[sometimesanotion/Lamarck-14B-v0.3](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.3)
|
39 |
-
[sometimesanotion/Qwenvergence-14B-v3-Prose](https://huggingface.co/sometimesanotion/Qwenvergence-14B-v3-Prose) - a model_stock merge of multiple prose-oriented models which posts surprisingly high MATH, GPQA, and MUSR scores.
|
40 |
-
[Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B) - A particularly interesting model which applies extra reasoning to language translation. Check out their fascinating research paper at [arxiv.org/abs/2412.17498](https://arxiv.org/abs/2412.17498).
|
41 |
-
[underwoods/medius-erebus-magnum-14b](https://huggingface.co/underwoods/medius-erebus-magnum-14b)
|
42 |
-
[sometimesanotion/Abliterate-Qwenvergence](https://huggingface.co/sometimesanotion/Abliterate-Qwenvergence) - A custom version of [huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2)
|
43 |
|
44 |
### Configuration
|
45 |
|
|
|
22 |
|
23 |
Lamarck 14B v0.6: A generalist merge focused on multi-step reasoning, prose, multi-language ability, and code. It is based on components that have punched above their weight in the 14 billion parameter class.
|
24 |
|
25 |
+
The tempo of Lamarck releases slowed because improving IFEVAL while maintaining other scores is no small task. Previous releases were based on a SLERP merge of model_stock->della branches focused on reasoning and prose. The prose branch got surprisingly good at reasoning, and the reasoning branch being the base for IFEVAL became an all-around generalist. Some of you have already downloaded the reasoning branch, released as [sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3).
|
26 |
|
27 |
+
Lamarck 0.6 aims to build upon Vimarckoso v3's all-around strength with strong buffs to prose and translation quality, and strong reasoning for its class. Updates to come as leaderboards become available to evaluate it in-depth. Even now, initial testing is showing solid translation, problem-solving, and prose capability.
|
28 |
|
29 |
## Merge Details
|
30 |
|
|
|
32 |
|
33 |
### Models Merged
|
34 |
|
35 |
+
**Top influences:** The model_stock, breadcrumbs, and della_linear all use the following models:
|
36 |
|
37 |
+
- **[sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3)** - As of this writing, Vimarckoso v3 has the #1 average score on [open-llm-leaderboard/open_llm_leaderboard](https://shorturl.at/m225j) for any model under 32 billion parameters. This appears to be because of synergy between its component odels.
|
38 |
+
- **[sometimesanotion/Lamarck-14B-v0.3](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.3)** - With heavy influence from [VAGOsolutions/SauerkrautLM-v2-14b-DPO](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-DPO), this is a leader in technical answers.
|
39 |
+
- **[sometimesanotion/Qwenvergence-14B-v3-Prose](https://huggingface.co/sometimesanotion/Qwenvergence-14B-v3-Prose)** - a model_stock merge of multiple prose-oriented models which posts surprisingly high MATH, GPQA, and MUSR scores, with contributions from [EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2](https://huggingface.co/EVA-UNIT1/EVA-Qwen2.5-14B-v0.2) apparent.
|
40 |
+
- **[Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B)** - A particularly interesting model which applies extra reasoning to language translation. Check out their fascinating research paper at [arxiv.org/abs/2412.17498](https://arxiv.org/abs/2412.17498).
|
41 |
+
- **[underwoods/medius-erebus-magnum-14b](https://huggingface.co/underwoods/medius-erebus-magnum-14b)** - The leading contributor to prose quality, as it's finetuned on datasets behind the well-recognized Magnum series.
|
42 |
+
- **[sometimesanotion/Abliterate-Qwenvergence](https://huggingface.co/sometimesanotion/Abliterate-Qwenvergence)** - A custom version of [huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2)
|
43 |
|
44 |
### Configuration
|
45 |
|