---
base_model:
- CultriX/Qwen2.5-14B-Wernickev5
- CultriX/Qwen2.5-14B-Wernickev3
library_name: transformers
tags:
- mergekit
- merge
---
|
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
|
### Merge Method

This model was merged using the SLERP merge method.
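SLERP (spherical linear interpolation) blends each pair of weight tensors along the arc between them on the unit hypersphere rather than along a straight line, which preserves the magnitude/direction structure of the weights better than plain averaging. A minimal NumPy sketch of the idea (an illustration only, not mergekit's actual implementation; the `slerp` function and its arguments are hypothetical names):

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns a, t=1 returns b, intermediate t blends along the arc.
    """
    # Normalize copies to find the angle between the two directions.
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_n, b_n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the tensors

    # Nearly parallel tensors: fall back to ordinary linear interpolation.
    if theta < eps:
        return (1 - t) * a + t * b

    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * a + (np.sin(t * theta) / s) * b
```

In a full merge, this interpolation is applied tensor-by-tensor across the two checkpoints, with the blend factor `t` varying by layer as configured below.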
|
### Models Merged

The following models were included in the merge:

* [CultriX/Qwen2.5-14B-Wernickev5](https://huggingface.co/CultriX/Qwen2.5-14B-Wernickev5)
* [CultriX/Qwen2.5-14B-Wernickev3](https://huggingface.co/CultriX/Qwen2.5-14B-Wernickev3)
|
### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
  - model: CultriX/Qwen2.5-14B-Wernickev5
merge_method: slerp
base_model: CultriX/Qwen2.5-14B-Wernickev3
dtype: bfloat16
parameters:
  t: [0, 0.5, 1, 0.5, 0]
adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.2             # Emphasize Arc's logical and reasoning tasks
    tinyHellaswag: 1.1       # Maintain strong performance in contextual predictions
    tinyMMLU: 1.2            # Ensure domain knowledge is well-preserved
    tinyTruthfulQA: 1.3      # Leverage v3's top TruthfulQA score
    tinyTruthfulQA_mc1: 1.1  # Balance both models' performance
    tinyWinogrande: 1.2      # Enhance contextual understanding
  smoothing_factor: 0.2      # Moderate blending for stable integration
gradient_clipping: 1.0       # Prevent over-contribution from any single model
```
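The `t` list in the configuration is a schedule over layer depth rather than a single blend factor: the endpoints (`0`) keep the base model (Wernickev3) at the first and last layers, while the midpoint (`1`) takes Wernickev5 fully in the middle of the stack. A hedged sketch of how such an anchor list can be expanded to one factor per layer (`expand_t_schedule` is a hypothetical helper; 48 layers is an assumption for a Qwen2.5-14B-scale model):

```python
import numpy as np

def expand_t_schedule(anchors, num_layers):
    """Linearly interpolate a short anchor list (e.g. [0, 0.5, 1, 0.5, 0])
    into one blend factor per transformer layer."""
    anchor_pos = np.linspace(0.0, 1.0, len(anchors))  # anchors spread over depth
    layer_pos = np.linspace(0.0, 1.0, num_layers)     # relative depth of each layer
    return np.interp(layer_pos, anchor_pos, anchors)

# One blend factor per layer: ~0 near the ends (base model dominates),
# close to 1 near the middle (Wernickev5 dominates).
t_per_layer = expand_t_schedule([0, 0.5, 1, 0.5, 0], num_layers=48)
```

Schedules of this shape keep the embedding-adjacent and output-adjacent layers close to the base model while letting the middle layers absorb more of the other checkpoint.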