---
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- Sao10K/14B-Qwen2.5-Kunou-v1
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
---
![DeepseekerKunou](https://files.catbox.moe/n5ejwr.png)

Been a while since you've seen me with a merge of my own, eh? This is released under the Qwen License, which is included in the files for this quant and the main model.
Anyway, with the release of the DeepSeek R1 Distill models, I gave them a poke and found what I expected: not great for creative writing tasks, but otherwise intelligent. So I decided to take a stab at another merge, much like I did with LemonKunoichiWizard. What you're seeing here is the result of roughly four separate merge attempts, and I feel this one is the best of the four. I think it increased the base intelligence of Kunou marginally, which was nice to see, but it's a 14B and I could be chugging a placebo, so take that with a grain of salt.

Thanks again to Sao10K for the finetune and to the DeepSeek team for releasing their model under an open license. Hopefully you all enjoy.
I tested a SLERP merge as well, but the SLERP version was noticeably stupider. The only other thing of note is that this merge still has the Qwen tendency to occasionally spam the EoS tag in its output.
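If the literal tag does show up in responses, one common workaround is to treat it as a stop string at generation time. Below is a minimal sketch using the `transformers` library; it assumes the Qwen2.5-style `<|im_end|>` tag, so check `tokenizer.eos_token` for this merge before relying on it.

```python
# Hedged mitigation sketch: halt generation when the literal chat EoS tag appears
# in the decoded text. Assumes the tag is "<|im_end|>" (Qwen2.5-style) -- verify
# against this merge's tokenizer before using it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Statuo/Deepseeker-Kunou-Qwen2.5-14b"  # original (unquantized) weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

inputs = tokenizer("Write a short scene set in a rainy city.", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=300,
    stop_strings=["<|im_end|>"],  # stop if the tag is emitted as literal text
    tokenizer=tokenizer,          # generate() needs the tokenizer when stop_strings is set
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Frontends that support custom stopping strings can do the same thing without any code.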
[This is the EXL2 8bpw version of this model. For the original model, go here](https://huggingface.co/Statuo/Deepseeker-Kunou-Qwen2.5-14b)
[For the 6bpw version, go here](https://huggingface.co/Statuo/Deepseeker-Kunou-Qwen2.5-14b-EXL2-6bpw)
[For the 4bpw version, go here](https://huggingface.co/Statuo/Deepseeker-Kunou-Qwen2.5-14b-EXL2-4bpw)
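If you want to run one of the EXL2 quants outside a frontend, here's a minimal loading sketch with the `exllamav2` library (dynamic generator API, roughly 0.1.x-era; treat it as a starting point rather than a guaranteed recipe). It pulls the 6bpw quant; swap the repo id for whichever version you want.

```python
# Sketch: download an EXL2 quant from the Hub and generate with exllamav2's
# dynamic generator. API details may differ across exllamav2 versions.
from huggingface_hub import snapshot_download
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

model_dir = snapshot_download("Statuo/Deepseeker-Kunou-Qwen2.5-14b-EXL2-6bpw")

config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # cache is allocated while the model autosplits
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Write a short scene set in a rainy city.", max_new_tokens=200))
```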
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.

### Models Merged

The following models were included in the merge:
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
* [Sao10K/14B-Qwen2.5-Kunou-v1](https://huggingface.co/Sao10K/14B-Qwen2.5-Kunou-v1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Sao10K/14B-Qwen2.5-Kunou-v1
    parameters:
      weight: 1.0
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
    parameters:
      weight: 0.2
  - model: Sao10K/14B-Qwen2.5-Kunou-v1
    parameters:
      weight: 0.6
merge_method: linear
dtype: float16
```
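For anyone curious what the linear method actually does: it's a weighted average of matching parameter tensors, and mergekit normalizes the weights to sum to 1 by default, so the two Kunou entries above (1.0 + 0.6) work out to roughly 89% Kunou and 11% DeepSeek in every tensor. A rough Python sketch of the idea, illustrative only and not mergekit's actual implementation:

```python
# Rough sketch of a linear (weighted-average) merge over state dicts --
# illustrative only, not mergekit's actual code.
import torch

def linear_merge(state_dicts: list[dict], weights: list[float]) -> dict:
    """Weighted average of matching tensors, with weights normalized to sum to 1."""
    total = sum(weights)
    merged = {}
    for name, ref in state_dicts[0].items():
        acc = torch.zeros_like(ref, dtype=torch.float32)
        for sd, w in zip(state_dicts, weights):
            acc += (w / total) * sd[name].float()
        merged[name] = acc.to(ref.dtype)
    return merged
```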