Exotic Frankenmerges 🥨
A collection of merges of models of different architectures and sizes that end up working surprisingly well.
An experiment in merging models of different architectures and sizes. The full merge recipe is given below, followed by a conceptual sketch of the method.

This model was merged using the linear merge method.

The following models were included in the merge:
- jeonsworld/CarbonVillain-en-10.7B-v4
- vicgalle/NeuralBeagle-11B

The following YAML configuration was used to produce this model:
```yaml
models:
  - model: jeonsworld/CarbonVillain-en-10.7B-v4
    parameters:
      weight: 1.0
  - model: vicgalle/NeuralBeagle-11B
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```
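For intuition, here is a minimal sketch of what the linear merge method computes: each merged parameter is a weighted average of the corresponding source parameters. With the weights above (1.0 and 0.5), normalization gives effective coefficients of 2/3 and 1/3. This is an illustration under the assumption that the checkpoints share parameter names and shapes; `linear_merge` is a hypothetical helper, not mergekit's actual implementation.

```python
import torch

def linear_merge(state_dicts, weights):
    """Weighted average of matching parameter tensors (conceptual sketch)."""
    total = sum(weights)  # mergekit's linear method normalizes weights by default
    merged = {}
    for name, ref in state_dicts[0].items():
        merged[name] = sum(
            (w / total) * sd[name].float()
            for sd, w in zip(state_dicts, weights)
        ).to(ref.dtype)
    return merged

# Toy check with two tiny "models" sharing parameter names and shapes.
a = {"layer.weight": torch.ones(2, 2)}
b = {"layer.weight": torch.zeros(2, 2)}
print(linear_merge([a, b], weights=[1.0, 0.5])["layer.weight"])  # ~0.6667 everywhere
```

The actual merge can be produced by pointing mergekit's `mergekit-yaml` command at the configuration above.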
At the time of its creation (21-01-2024), it is the best model on the Open LLM Leaderboard for its size class (10.7B-11B), and it also outranks 13B models:
Detailed results for the Open LLM Leaderboard can be found here.

| Metric | Value |
|---|---|
| Avg. | 74.64 |
| AI2 Reasoning Challenge (25-Shot) | 71.84 |
| HellaSwag (10-Shot) | 88.93 |
| MMLU (5-Shot) | 66.62 |
| TruthfulQA (0-Shot) | 69.43 |
| Winogrande (5-Shot) | 84.06 |
| GSM8k (5-Shot) | 66.94 |
Detailed results for the Open LLM Leaderboard 2 can be found here.

| Metric | Value |
|---|---|
| Avg. | 22.36 |
| IFEval (0-Shot) | 54.15 |
| BBH (3-Shot) | 33.06 |
| MATH Lvl 5 (4-Shot) | 5.51 |
| GPQA (0-Shot) | 6.94 |
| MuSR (0-Shot) | 9.19 |
| MMLU-PRO (5-Shot) | 25.29 |
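Once published, the merged checkpoint loads like any other transformers causal LM. The snippet below is a usage sketch; `your-username/merged-model` is a placeholder repository id, not the model's actual name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; substitute the actual merged checkpoint.
model_id = "your-username/merged-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the dtype in the merge config
    device_map="auto",
)

inputs = tokenizer("Model merging works because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```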