---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
- Triangle104/DSR1-Distill-Qwen-7B-RP
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
language:
- en
- zh
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
- Triangle104/DSR1-Distill-Qwen-7B-RP
pipeline_tag: text-generation
library_name: transformers
---

# ZeroXClem/Qwen2.5-7B-DistilPrism

**Qwen2.5-7B-DistilPrism** is a **distillation- and reasoning-focused model merge** that combines several variants of the DeepSeek-R1 distillations into a single **refined language model**. It uses the **Model Stock** merge method to draw on the strengths of **DeepSeek-R1-Distill-Qwen-7B** and its derivatives.

## 🚀 Merged Models

This model is a weighted merge of the following:

- [**huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2**](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2): An uncensored distillation of DeepSeek-R1, optimized to remove refusals and improve usability.
- [**mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1**](https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1): A refined distillation that improves accuracy and robustness across various benchmarks.
- [**Triangle104/DSR1-Distill-Qwen-7B-RP**](https://huggingface.co/Triangle104/DSR1-Distill-Qwen-7B-RP): A composite merge of various distilled DeepSeek variants, included as an ingredient for performance tuning.
- [**deepseek-ai/DeepSeek-R1-Distill-Qwen-7B**](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B): The foundation of this merge, a distilled form of DeepSeek-R1 optimized for efficiency and strong reasoning capabilities.
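
To make the **Model Stock** idea mentioned above more concrete, here is a minimal Python sketch of the per-layer interpolation rule as described in the Model Stock paper. It is an illustration under stated assumptions, not mergekit's actual implementation: the function names (`cosine`, `model_stock_layer`) and the toy two-dimensional vectors are hypothetical, each layer is treated as a flat list of floats, and no per-model weighting is modeled.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def model_stock_layer(base, finetuned):
    """Merge one flattened layer; needs at least two fine-tuned models."""
    n = len(finetuned)
    # Task vectors: how far each fine-tuned layer moved away from the base.
    deltas = [[w - b for w, b in zip(model, base)] for model in finetuned]
    # Average pairwise cosine similarity between the task vectors.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    # Interpolation ratio: t = N * cos / (1 + (N - 1) * cos).
    t = n * cos_theta / (1 + (n - 1) * cos_theta)
    # Plain average of the fine-tuned layers.
    avg = [sum(column) / n for column in zip(*finetuned)]
    # Interpolate between that average and the base layer.
    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t


# Hypothetical toy data: one base layer and three fine-tuned variants.
base = [0.0, 0.0]
finetuned = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
merged, t = model_stock_layer(base, finetuned)
print(f"t = {t:.3f}, merged = {[round(x, 3) for x in merged]}")
```

In this sketch, the more the fine-tuned models agree on the direction in which they moved away from the base (cosine near 1), the closer `t` gets to 1 and the more the result leans on their average. With near-orthogonal updates, `t` shrinks toward 0 and the result stays close to the base model.
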
## 🧩 Merge Configuration

The following **YAML configuration** defines how these models were combined using **Model Stock**, with the aim of **balanced contributions** from each source:

```yaml
# Merge configuration for ZeroXClem/Qwen2.5-7B-DistilPrism using Model Stock
name: ZeroXClem-Qwen2.5-7B-DistilPrism
merge_method: model_stock
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tokenizer_source: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
dtype: bfloat16
parameters:
  normalize: true
  rescale: true
models:
  - model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
    parameters:
      weight: 0.3
  - model: mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
    parameters:
      weight: 0.25
  - model: Triangle104/DSR1-Distill-Qwen-7B-RP
    parameters:
      weight: 0.2
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
    parameters:
      weight: 0.25
```

### 🔑 Key Parameters

- **Normalization & Rescaling**: Ensures weight distributions remain balanced across all components.
- **Model Stock Merge Method**: Optimizes contribution from each model to retain the best attributes.
- **Weighted Blending**: The **abliterated** and **re-distilled** models contribute the most, refining both alignment and general usability.

---

## 🗣️ Inference

You can use the model for text generation as follows:

### Ollama

**[Quickstart Guide to Ollama](https://aidev.zeroxclem.com/blog/08-setting-up-ollama)**

I recommend Ollama for daily-driver use, as it supports thinking tags.

```bash
ollama run hf.co/ZeroXClem/Qwen2.5-7B-DistilPrism
# To use a quant, copy its URL and replace 'huggingface.co/' with 'hf.co/', followed by the name of the quant.
```

### Transformers

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-DistilPrism"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model in bfloat16 and let Accelerate place it on the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Initialize the pipeline; dtype and device placement are already set on the loaded model
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

# Print the generated text
print(outputs[0]["generated_text"])
```

---

## 🎯 Use Case & Applications

**Qwen2.5-7B-DistilPrism** is designed for **efficient, high-quality text generation** with strong reasoning capabilities. It is intended for:

- **Advanced Reasoning & Problem Solving**: Logic-heavy tasks and multi-step reasoning problems.
- **Conversational AI**: Tuned for **fluid, responsive dialogue**, with fewer refusals and better engagement.
- **Mathematical & Scientific Computation**: Aims for stronger **math & code generation abilities** than the standard distillations.
- **Content Creation & Summarization**: Generates coherent and **contextually rich** text suitable for various applications.

---

## 📜 License

This model is released under the **MIT License**.

---

## 📊 Benchmark Results (Coming Soon)

We are currently in the process of **quantizing and benchmarking** this model.
Stay tuned for performance updates across:

- **IFEval (0-Shot)**
- **BBH (3-Shot)**
- **MATH (4-Shot)**
- **GPQA (0-Shot)**
- **MuSR (0-Shot)**
- **MMLU-PRO (5-Shot)**

---

## 💡 Tags

- `merge`
- `mergekit`
- `model_stock`
- `DeepSeek-R1`
- `Distillation`
- `abliterated`
- `re-distilled`
- `DeepSeek-R1-Distill-Qwen-7B`

---

## 🙏 Special Thanks

This project wouldn't be possible without the incredible contributions from:

- **[@huihui-ai](https://huggingface.co/huihui-ai)** – For developing **DeepSeek-R1-Distill-Qwen-7B-abliterated-v2**, a bold step towards improving model alignment.
- **[@mobiuslabsgmbh](https://huggingface.co/mobiuslabsgmbh)** – For refining distillation techniques with **DeepSeek-R1-ReDistill-Qwen-7B-v1.1**.
- **[@Triangle104](https://huggingface.co/Triangle104)** – For crafting innovative merges like **DSR1-Distill-Qwen-7B-RP**, an essential component in this blend.
- **[@deepseek-ai](https://huggingface.co/deepseek-ai)** – For open-sourcing **DeepSeek-R1-Distill-Qwen-7B**, a foundation for reasoning advancements.

And a heartfelt **thank you** to everyone in the **🤗 & Open-Source AI community** for their continued research, testing, and support. 💜🚀

---

## 🔗 Additional Resources

- [Hugging Face Model Card](https://huggingface.co/ZeroXClem/Qwen2.5-7B-DistilPrism)
- [MergeKit Repository](https://github.com/ZeroXClem/mergekit)
- [DeepSeek AI on Hugging Face](https://huggingface.co/deepseek-ai)
- [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)