---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
- Triangle104/DSR1-Distill-Qwen-7B-RP
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
language:
- en
- zh
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
- Triangle104/DSR1-Distill-Qwen-7B-RP
pipeline_tag: text-generation
library_name: transformers
---
# ZeroXClem/Qwen2.5-7B-DistilPrism
**Qwen2.5-7B-DistilPrism** is a **distillation- and reasoning-focused model merge** that combines multiple variations of DeepSeek-R1 distillations into a **refined, high-performance language model**. Built with the **Model Stock** merge method, the fusion captures the best attributes of **DeepSeek-R1-Distill-Qwen-7B** and its improved derivatives.
## 🚀 Merged Models
This model is a weighted merge of the following:
- [**huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2**](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2): An uncensored distillation of DeepSeek-R1, optimized to remove refusals and improve usability.
- [**mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1**](https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1): A refined distillation that improves accuracy and robustness across various benchmarks.
- [**Triangle104/DSR1-Distill-Qwen-7B-RP**](https://huggingface.co/Triangle104/DSR1-Distill-Qwen-7B-RP): A composite merge of various distilled DeepSeek variants, serving as an essential ingredient for performance tuning.
- [**deepseek-ai/DeepSeek-R1-Distill-Qwen-7B**](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B): The foundation of this merge, representing the distilled form of DeepSeek-R1 optimized for efficiency and strong reasoning capabilities.
## 🧩 Merge Configuration
The following **YAML configuration** defines how these models were combined using **Model Stock**, ensuring **balanced contributions** from each source:
```yaml
# Merge configuration for ZeroXClem/Qwen2.5-7B-DistilPrism using Model Stock
name: ZeroXClem-Qwen2.5-7B-DistilPrism
merge_method: model_stock
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tokenizer_source: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
dtype: bfloat16
parameters:
normalize: true
rescale: true
models:
- model: huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
parameters:
weight: 0.3
- model: mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
parameters:
weight: 0.25
- model: Triangle104/DSR1-Distill-Qwen-7B-RP
parameters:
weight: 0.2
- model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
parameters:
weight: 0.25
```
### 🔑 Key Parameters
- **Normalization & Rescaling**: Ensures weight distributions remain balanced across all components.
- **Model Stock Merge Method**: Optimizes contribution from each model to retain the best attributes.
- **Weighted Blending**: The **abliterated** and **re-distilled** models contribute the most, refining both alignment and general usability.
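To reproduce the merge locally, here is a minimal sketch using the `mergekit-yaml` CLI. It assumes `mergekit` is installed and the YAML above is saved as `config.yaml`; the output path is arbitrary:

```bash
# Sketch: run the merge defined in the YAML above with the mergekit CLI.
# The output directory name is arbitrary; --cuda offloads tensor math to GPU.
pip install mergekit
mergekit-yaml config.yaml ./Qwen2.5-7B-DistilPrism --cuda
```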
---
## 🗣️ Inference
You can use the model for text generation as follows:
### Ollama
**[Quickstart to Ollama Guide Here](https://aidev.zeroxclem.com/blog/08-setting-up-ollama)**. I recommend Ollama as a daily driver, as it supports thinking tags.
```bash
ollama run hf.co/ZeroXClem/Qwen2.5-7B-DistilPrism
# If you are using quants, copy the quant repo URL and replace 'huggingface.co/' with 'hf.co/', followed by the name of the quant.
```
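If you want to pin sampling parameters locally, a short Ollama Modelfile works. This is a sketch: it assumes the model has already been pulled with the command above, and the model name `distilprism-local` is arbitrary.

```bash
# Sketch: wrap the already-pulled model with fixed sampling parameters.
cat > Modelfile <<'EOF'
FROM hf.co/ZeroXClem/Qwen2.5-7B-DistilPrism
PARAMETER temperature 0.7
PARAMETER top_p 0.95
EOF
ollama create distilprism-local -f Modelfile
ollama run distilprism-local
```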
### Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-DistilPrism"
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the model
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Initialize the pipeline
text_generator = pipeline(
"text-generation",
model=model,
tokenizer=tokenizer,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."
# Generate the output
outputs = text_generator(
prompt,
max_new_tokens=150,
do_sample=True,
temperature=0.7,
top_k=50,
top_p=0.95
)
# Print the generated text
print(outputs[0]["generated_text"])
```
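Because the merge sources its tokenizer from `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` (see `tokenizer_source` in the config above), conversational use should go through the tokenizer's chat template. A minimal sketch, reusing `model` and `tokenizer` from the block above; the prompt and sampling values are illustrative:

```python
# Sketch: chat-style generation via the tokenizer's built-in chat template.
messages = [
    {"role": "user", "content": "Solve step by step: what is 17 * 24?"}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # start generation at the assistant turn
    return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.6,  # illustrative sampling settings
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```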
---
## 🎯 Use Case & Applications
**Qwen2.5-7B-DistilPrism** is designed for **efficient, high-quality text generation** with strong reasoning capabilities. It is well-suited for:
- **Advanced Reasoning & Problem Solving**: Excels in logic-heavy tasks and multi-step reasoning problems.
- **Conversational AI**: Optimized for **fluid, responsive dialogue**, reducing refusals and improving engagement.
- **Mathematical & Scientific Computation**: Enhanced **math & code generation abilities** compared to standard distillations.
- **Content Creation & Summarization**: Generates coherent and **contextually rich** text suitable for various applications.
---
## 📜 License
This model is released under the **Apache 2.0 License**.
---
## 📊 Benchmark Results (Coming Soon)
We are currently in the process of **quantizing and benchmarking** this model. Stay tuned for performance updates across:
- **IFEval (0-Shot)**
- **BBH (3-Shot)**
- **MATH (4-Shot)**
- **GPQA (0-Shot)**
- **MuSR (0-Shot)**
- **MMLU-PRO (5-Shot)**
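These match the Open LLM Leaderboard v2 suite. As a sketch, they can be run locally with EleutherAI's `lm-evaluation-harness`; the `leaderboard` task group is assumed to be available in your installed version (check `lm_eval --tasks list`):

```bash
# Sketch: local evaluation with lm-evaluation-harness.
# The 'leaderboard' task group covers the benchmarks listed above in recent versions.
pip install lm-eval
lm_eval --model hf \
  --model_args pretrained=ZeroXClem/Qwen2.5-7B-DistilPrism,dtype=bfloat16 \
  --tasks leaderboard \
  --batch_size auto
```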
---
## 💡 Tags
- `merge`
- `mergekit`
- `model_stock`
- `DeepSeek-R1`
- `Distillation`
- `abliterated`
- `re-distilled`
- `DeepSeek-R1-Distill-Qwen-7B`
---
## 🙏 Special Thanks
This project wouldn't be possible without the incredible contributions from:
- **[@huihui-ai](https://huggingface.co/huihui-ai)** – For developing **DeepSeek-R1-Distill-Qwen-7B-abliterated-v2**, a bold step towards reducing refusals and improving usability.
- **[@mobiuslabsgmbh](https://huggingface.co/mobiuslabsgmbh)** – For refining distillation techniques with **DeepSeek-R1-ReDistill-Qwen-7B-v1.1**.
- **[@Triangle104](https://huggingface.co/Triangle104)** – For crafting innovative merges like **DSR1-Distill-Qwen-7B-RP**, an essential component in this blend.
- **[@deepseek-ai](https://huggingface.co/deepseek-ai)** – For open-sourcing **DeepSeek-R1-Distill-Qwen-7B**, a foundation for reasoning advancements.
And a heartfelt **thank you** to everyone in the **🤗 & Open-Source AI community** for their continued research, testing, and support. 💜🚀
---
# 🔗 Additional Resources
- [Hugging Face Model Card](https://huggingface.co/ZeroXClem/Qwen2.5-7B-DistilPrism)
- [MergeKit Repository](https://github.com/ZeroXClem/mergekit)
- [DeepSeek AI on Hugging Face](https://huggingface.co/deepseek-ai)
- [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)