Been a while since you've seen a merge of my own, eh? This is released under the Qwen License, which is included in the files of both this quant and the main model.
Anyway, with the release of DeepSeek R1 Distill, I gave it a poke and found what I expected: not great for creative writing tasks, but otherwise intelligent. So I decided to take a stab at another merge, much like I did with LemonKunoichiWizard. What you're seeing here is the result of, give or take, four separate merges, and I feel it's the best of the four. I think it increased the base intelligence of Kunou marginally, which was nice to see. Then again, it's a 14B and I could be chugging a placebo, so take that with a grain of salt. Thanks again to Sao10K for the finetune and to the DeepSeek team for releasing R1 Distill under an open license. Hopefully you all enjoy.
I tested using SLERP as well, but the SLERP version was noticeably stupider. The only other thing of note is that it still has that Qwen tendency to occasionally spam the EOS token in its output.
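If you hit the EOS spam when running the full-weight model through transformers, one blunt workaround is to suppress the EOS token id during generation and rely on stop strings in your frontend instead. A minimal sketch, assuming a local copy of the merge (the path and prompt are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/path/to/merged-model"  # placeholder: point this at the full-weight merge
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype="auto", device_map="auto")

inputs = tokenizer("Write a short scene set in a rainy city.", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    # Block the EOS token id entirely; generation then runs until max_new_tokens
    # or an external stopping criterion, so pair this with stop strings.
    suppress_tokens=[tokenizer.eos_token_id],
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```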
This is the EXL2 4bpw version of this model. For the original model, go here.
For the 8bpw version, go here.
For the 6bpw version, go here.
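If you'd rather script against this quant directly instead of going through a frontend, here's a minimal loading sketch using exllamav2's dynamic generator, based on the library's own examples; the model directory is a placeholder for wherever you downloaded this 4bpw quant:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

config = ExLlamaV2Config("/path/to/this-4bpw-quant")  # placeholder path
model = ExLlamaV2(config)

# Lazy cache + autosplit spreads the weights across available GPUs as they load.
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Once upon a midnight dreary,", max_new_tokens=128))
```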
This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the SLERP merge method.
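For reference, SLERP interpolates each pair of weight tensors along the arc between them rather than along a straight line. With $p$ and $q$ the corresponding tensors from the two models (treated as flattened vectors), $\theta$ the angle between them, and $t$ the per-layer interpolation factor from the config below:

$$\operatorname{slerp}(p, q; t) = \frac{\sin\bigl((1-t)\theta\bigr)}{\sin\theta}\,p + \frac{\sin(t\theta)}{\sin\theta}\,q, \qquad \cos\theta = \frac{p \cdot q}{\lVert p \rVert\,\lVert q \rVert}$$

At $t = 0$ this returns the base model's tensor unchanged; at $t = 1$, the other model's.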
Models Merged
The following models were included in the merge:
- Sao10K/14B-Qwen2.5-Kunou-v1
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
Configuration
The following YAML configuration was used to produce this model:
models:
  - model: Sao10K/14B-Qwen2.5-Kunou-v1
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
merge_method: slerp
base_model: Sao10K/14B-Qwen2.5-Kunou-v1
dtype: bfloat16
parameters:
  t: [0, 0.5, 1, 0.5, 0] # V-shaped curve: Kunou for input & output, DeepSeek in the middle layers
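Assuming a stock mergekit install, saving the block above as, say, config.yaml and running `mergekit-yaml config.yaml ./output-model` should reproduce the merge (the file and output directory names are arbitrary).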