ZYH-LLM-Qwen2.5-14B-V3
This is the third-generation model of the ZYH-LLM series.
It employs a large amount of model merging techniques, aiming to provide a powerful and unified 14-billion-parameter model, laying a solid foundation for further model merging and model fine-tuning.
As of February 25, 2025, the 14B model with the highest IFEval score
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 41.63 |
IFEval (0-Shot) | 85.78 |
BBH (3-Shot) | 48.18 |
MATH Lvl 5 (4-Shot) | 52.72 |
GPQA (0-shot) | 10.96 |
MuSR (0-shot) | 9.00 |
MMLU-PRO (5-shot) | 43.12 |
The following are the specific details of model merging, hoping to inspire you:
First stage:
Step 1:
models:
- model: Qwen/Qwen2.5-14B-Instruct
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-1010
models:
- model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-1010-1M
models:
- model: Qwen/Qwen2.5-14B-Instruct
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: EVA-Qwen2.5-14B-YOYO-1010
models:
- model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: EVA-Qwen2.5-14B-YOYO-1010-1M
Step 2:
models:
- model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: EVA-Qwen2.5-14B-base
merge_method: sce
models:
- model: EVA-Qwen2.5-14B-base
base_model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
name: Qwen2.5-14B-pro
Step 3:
models:
- model: Qwen2.5-14B-YOYO-1010-1M
- model: Qwen2.5-14B-YOYO-1010
- model: EVA-Qwen2.5-14B-YOYO-1010-1M
- model: EVA-Qwen2.5-14B-YOYO-1010
merge_method: sce
base_model: Qwen2.5-14B-pro
parameters:
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: ZYH-LLM-Qwen2.5-14B-V3-preview
Second stage:
models:
- model: Qwen/Qwen2.5-14B-Instruct
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: arcee-ai/Virtuoso-Small-v2
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della1
models:
- model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: arcee-ai/Virtuoso-Small-v2
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della2
models:
- model: Qwen/Qwen2.5-14B-Instruct
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: Azure99/Blossom-V6-14B
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della3
models:
- model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
density: 1
weight: 1
lambda: 0.9
merge_method: della
base_model: Azure99/Blossom-V6-14B
parameters:
density: 1
weight: 1
lambda: 0.9
normalize: true
int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della4
Final stage:
merge_method: model_stock
base_model: ZYH-LLM-Qwen2.5-14B-V3-preview
models:
- model: Qwen2.5-14B-YOYO-della1
- model: Qwen2.5-14B-YOYO-della2
- model: Qwen2.5-14B-YOYO-della3
- model: Qwen2.5-14B-YOYO-della4
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: ZYH-LLM-Qwen2.5-14B-V3
- Downloads last month
- 15
Inference Providers
NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API:
The model has no library tag.
Model tree for YOYO-AI/ZYH-LLM-Qwen2.5-14B-V3
Merge model
this model
Evaluation results
- strict accuracy on IFEval (0-Shot)Open LLM Leaderboard85.780
- normalized accuracy on BBH (3-Shot)Open LLM Leaderboard48.180
- exact match on MATH Lvl 5 (4-Shot)Open LLM Leaderboard52.720
- acc_norm on GPQA (0-shot)Open LLM Leaderboard10.960
- acc_norm on MuSR (0-shot)Open LLM Leaderboard9.000
- accuracy on MMLU-PRO (5-shot)test set Open LLM Leaderboard43.120