---
license: mit
base_model:
- WinkingFace/Mit-7B
extra_gated_fields:
  Date of birth: date_picker
  Country: country
  Affiliation: text
  Job title:
    type: select
    options:
      - Student
      - Research Graduate
      - AI Researcher
      - AI Developer/Engineer
      - Data Scientist
      - Machine Learning Engineer
      - Software Engineer
      - Research Scientist
      - Professor/Academic
      - Product Manager
      - Journalist/Reporter
      - Entrepreneur/Startup Founder
      - Policy Maker/Regulator
      - Other
  geo: ip_location
  ? By clicking Submit below I accept the terms of the WinkingFace license
  : checkbox
extra_gated_description: '[WinkingFace license](http://huggingface.co/WinkingFace/Mit-ThinkDeeply-0.5B/blob/main/LICENSE.md)'
extra_gated_button_content: Submit
---

# Mit-ThinkDeeply-7B

## Model Description

**Mit-ThinkDeeply** is the advanced version of the Mit series of large language models (LLMs) developed by WinkingFace. Built upon the robust foundation of the Mit base model, **Mit-ThinkDeeply** introduces enhanced reasoning capabilities, superior contextual understanding, and refined function-calling precision. The model is designed to seamlessly integrate intuitive conversational abilities with advanced multi-step reasoning, making it well suited for complex analytical tasks, structured problem-solving, and high-stakes decision-making.

Key features of **Mit-ThinkDeeply** include:

- **Advanced Reasoning**: Capable of generating long chains of thought to deeply analyze problems and provide well-reasoned solutions.
- **Enhanced Contextual Awareness**: Improved ability to maintain coherence across multi-turn conversations and long-form interactions.
- **Function Calling Precision**: Optimized for reliable and accurate execution of tool calls, enabling seamless integration with external APIs and services.
- **Versatile Use Cases**: Adaptable for both standard conversational tasks and complex reasoning scenarios, including mathematical problem-solving, code generation, and structured output generation.
- **Long Context Support**: Supports context lengths of up to 128K tokens, ensuring robust performance in applications requiring extensive input data.

**Mit-ThinkDeeply** has undergone extensive architectural refinements and fine-tuning to align more effectively with real-world applications. Our training process emphasizes deeper contextual awareness, enhanced response coherence, and improved execution of function calling, making **Mit-ThinkDeeply** a powerful and versatile AI system.

## Key Features

**Multi-Lingual by Design**
Supports over 29 languages, including but not limited to:
- English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

**Proficient in Coding**
Trained on 80+ programming languages, including Python, Java, C, C++, JavaScript, and Bash. Also supports specialized languages such as Swift and Fortran.

**Advanced Reasoning**
State-of-the-art mathematical and reasoning capabilities, enabling the model to tackle complex problems with step-by-step analysis.

**Robust Context Adherence**
Ensures strong adherence for RAG (Retrieval-Augmented Generation) and large-context applications, maintaining coherence across lengthy interactions.

**System Prompt Support**
Maintains strong support for system prompts, allowing users to define roles, rules, and stylistic preferences for tailored interactions.

## Requirements

Mit's code is integrated into WinkingFace's customized version of the Hugging Face `transformers` library, and we recommend using this modified version for optimal compatibility.
To prevent errors such as:

```
KeyError: 'mit'
```

install the customized `transformers` package using the following command:

```bash
pip install git+https://github.com/WinkingFaceAI/tfm-recooked.git
```

## Usage Recommendations

To achieve optimal performance with **Mit-ThinkDeeply**, we recommend adhering to the following configurations:

- **Temperature Settings**: Set the temperature within the range of 0.5–0.7 (0.6 is recommended) to prevent endless repetition or incoherent outputs.
- **System Prompts**: Avoid adding unnecessary system prompts; all instructions should be contained within the user prompt.
- **Mathematical Problems**: Include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
- **Reasoning Mode**: To activate advanced reasoning, use the following system prompt:
  ```
  You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem.
  ```
- **When Benchmarking the Model**: Conduct multiple tests and average the results to evaluate model performance accurately.

## Prompt Format

**Standard Conversational Mode**

For standard conversational tasks, use the following format:

- Use with pipeline:

```python
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="WinkingFaceAI/Mit-ThinkDeeply-7B")
pipe(messages)
```

- Use with transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WinkingFaceAI/Mit-ThinkDeeply-7B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a helpful assistant."},
    {"role": "user", "content": "What are the most interesting things to do in Paris?"}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to("cuda")

generated_ids = model.generate(
    input_ids,
    max_new_tokens=2500,
    temperature=0.8,
    repetition_penalty=1.1,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id
)
response = tokenizer.decode(
    generated_ids[0],
    skip_special_tokens=True,
    clean_up_tokenization_spaces=True
)
print(f"Response: {response}")
```

**Advanced Reasoning Mode**

To activate advanced reasoning, use the following system prompt:

```python
messages = [
    {
        "role": "system",
        "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem."
    },
    {
        "role": "user",
        "content": "What is y if y=2*2-4+(3*2)?"
    }
]
```

*Example Interaction*

- User Query

```
What is y if y = 2 * 2 - 4 + (3 * 2)?
```

- Model Response

```
Let's break this down step by step:
1. Calculate the multiplication first: 2 * 2 = 4.
2. Subtract 4: 4 - 4 = 0.
3. Calculate the second multiplication: 3 * 2 = 6.
4. Add the results: 0 + 6 = 6.

The answer is \boxed{6}.
```
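To run the reasoning prompt above end to end, here is a minimal sketch using the same `transformers` flow as the standard example; it applies the recommended temperature of 0.6. The checkpoint name follows the earlier examples, and the token budget (`max_new_tokens=4096`) is an illustrative assumption, since long chains of thought need room to complete.

```python
# Minimal sketch: advanced reasoning mode with the recommended temperature of 0.6.
# The max_new_tokens value is an assumption; adjust it to your setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WinkingFaceAI/Mit-ThinkDeeply-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

reasoning_system_prompt = (
    "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of "
    "using extremely long chains of thought to deeply consider the problem and "
    "deliberate via systematic reasoning processes. Enclose your thoughts and "
    "internal monologue inside tags, and then provide your solution or response "
    "to the problem."
)
messages = [
    {"role": "system", "content": reasoning_system_prompt},
    {"role": "user", "content": "What is y if y = 2 * 2 - 4 + (3 * 2)? "
                                "Please reason step by step, and put your final answer within \\boxed{}."},
]

input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Temperature 0.6 per the usage recommendations; generous budget for the chain of thought.
generated_ids = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.6,
    eos_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(generated_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```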
## Function Calling

Our model was trained on specific system prompts and structures for function calling. The function-calling mechanism allows the model to interact with external tools by generating JSON objects that describe the tool to be called and its arguments.

Tools should be described using JSON Schema, as shown below:

```json
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_fundamentals",
      "description": "Get fundamental data for a given stock symbol using yfinance API.",
      "parameters": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string",
            "description": "The stock symbol (e.g., TSLA for Tesla)."
          }
        },
        "required": ["symbol"]
      }
    }
  }
]
```

**Example Interaction**

- Scenario

Suppose we would like to ask the model about the fundamental data for Tesla (TSLA). The model does not have real-time data access but can use the predefined tool `get_stock_fundamentals` to retrieve the information.

- System Prompt

Provide the system prompt with the available tools and any additional context:

```
You are Mit, created by WinkingFace. You are a helpful assistant.

Available Tools:
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_fundamentals",
      "description": "Get fundamental data for a given stock symbol using yfinance API.",
      "parameters": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string",
            "description": "The stock symbol (e.g., TSLA for Tesla)."
          }
        },
        "required": ["symbol"]
      }
    }
  }
]
```

- User Query

```
Fetch the stock fundamentals data for Tesla (TSLA).
```

- Model Response

The model will generate a tool call in the following format:

```json
{
  "name": "get_stock_fundamentals",
  "arguments": {
    "symbol": "TSLA"
  }
}
```

- Tool Execution

Once the tool call is parsed and executed, the result should be passed back to the model in the following format:

```json
{
  "name": "get_stock_fundamentals",
  "content": {
    "symbol": "TSLA",
    "company_name": "Tesla, Inc.",
    "sector": "Consumer Cyclical",
    "industry": "Auto Manufacturers",
    "market_cap": 611384164352,
    "pe_ratio": 49.604652,
    "pb_ratio": 9.762013,
    "dividend_yield": null,
    "eps": 4.3,
    "beta": 2.427,
    "52_week_high": 299.29,
    "52_week_low": 152.37
  }
}
```

- Final Assistant Response

The model will then generate a natural language response based on the tool output:

```
The stock fundamentals data for Tesla (TSLA) are as follows:
- **Symbol**: TSLA
- **Company Name**: Tesla, Inc.
- **Sector**: Consumer Cyclical
- **Industry**: Auto Manufacturers
- **Market Capitalization**: $611,384,164,352
- **Price-to-Earnings Ratio (PE Ratio)**: 49.60
- **Price-to-Book Ratio (PB Ratio)**: 9.76
- **Dividend Yield**: N/A
- **Trailing Earnings Per Share (EPS)**: $4.3
- **Beta Value of the Stock**: 2.427
- **52-Week High Price of the Stock**: $299.29
- **52-Week Low Price of the Stock**: $152.37

This information provides a snapshot of Tesla's financial position and performance based on the fundamental data obtained from the yfinance API.
```
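To make the round trip above concrete, here is a minimal sketch of the surrounding application code: it assumes the model emits the tool-call JSON shown above as plain text, parses it, runs a locally defined `get_stock_fundamentals` helper (a hypothetical implementation built on `yfinance`), and wraps the result in the format the model expects back. The `"tool"` role name and message layout are assumptions; adapt them to the chat template shipped with the model.

```python
# Hypothetical driver loop for the function-calling example above.
# Assumptions: the model returns the tool-call JSON as plain text, and tool
# results are passed back as an extra message; adjust to the actual chat template.
import json
import yfinance as yf

def get_stock_fundamentals(symbol: str) -> dict:
    """Hypothetical helper: fetch a few fundamental fields via yfinance."""
    info = yf.Ticker(symbol).info
    return {
        "symbol": symbol,
        "company_name": info.get("longName"),
        "sector": info.get("sector"),
        "industry": info.get("industry"),
        "market_cap": info.get("marketCap"),
        "pe_ratio": info.get("trailingPE"),
        "pb_ratio": info.get("priceToBook"),
        "dividend_yield": info.get("dividendYield"),
        "eps": info.get("trailingEps"),
        "beta": info.get("beta"),
        "52_week_high": info.get("fiftyTwoWeekHigh"),
        "52_week_low": info.get("fiftyTwoWeekLow"),
    }

TOOLS = {"get_stock_fundamentals": get_stock_fundamentals}

def run_tool_call(model_output: str) -> str:
    """Parse the tool-call JSON emitted by the model and execute the matching tool."""
    call = json.loads(model_output)
    result = TOOLS[call["name"]](**call["arguments"])
    # Wrap the result in the format shown in the "Tool Execution" step above.
    return json.dumps({"name": call["name"], "content": result})

# In practice, model_output would come from model.generate(); here we reuse the
# literal tool call from the example above.
model_output = '{"name": "get_stock_fundamentals", "arguments": {"symbol": "TSLA"}}'
tool_message = {"role": "tool", "content": run_tool_call(model_output)}
# Append tool_message to the conversation and generate again to obtain the
# final natural-language answer.
```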
## Run locally

**Using the Customized Ollama Version**

Install the customized `Ollama` package (for safety and security reasons, we do not provide prebuilt downloads for this modified version):

```
https://github.com/WinkingFaceAI/olm-recooked.git
```

To pull model checkpoints and run the model, use the `ollama run` command. You can specify the model size by adding a suffix to `v/Mit-ThinkDeeply`, such as `:0.5b`, `:1.5b`, `:3b`, or `:7b`:

```bash
ollama run v/Mit-ThinkDeeply:7b
```

You can also interact with the model using Python. Here's an example using its OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='v/Mit-ThinkDeeply:7b',
)
print(chat_completion.choices[0].message.content)
```

[See more models on Ollama here!](https://ollama.com/_)

**Using Standard Ollama** (weaker performance)

After [installing Ollama](https://github.com/ollama/ollama), you can pull model checkpoints and run the model with the `ollama run` command. You can specify the model size by adding a suffix to `v/Qwen-Mit-ThinkDeeply`, such as `:0.5b`, `:1.5b`, `:3b`, or `:7b`:

```bash
ollama run v/Qwen-Mit-ThinkDeeply:7b
```

You can also interact with the model using Python. Here's an example using its OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='v/Qwen-Mit-ThinkDeeply:7b',
)
print(chat_completion.choices[0].message.content)
```

[See more models on Ollama here!](https://ollama.com/_)

## Deployment

**Using vLLM**

We recommend using the **customized vLLM version** for deploying **Mit-ThinkDeeply** in production environments. Install the customized `vLLM` package (for safety and security reasons, we do not provide prebuilt downloads for this modified version):

```bash
git clone https://github.com/WinkingFaceAI/vllm-recooked.git
```

To serve the model, use the following command:

```bash
vllm serve WinkingFace/Mit-ThinkDeeply-7B
```

Use the chat API via `curl`:

```bash
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "WinkingFace/Mit-ThinkDeeply-7B",
  "messages": [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem."},
    {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"}
  ],
  "temperature": 0.6,
  "top_p": 0.8,
  "repetition_penalty": 1.05,
  "max_tokens": 512
}'
```

Use the chat API via the `OpenAI` package:

```python
from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="WinkingFace/Mit-ThinkDeeply-7B",
    messages=[
        {"role": "system", "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem."},
        {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"},
    ],
    temperature=0.7,
    top_p=0.8,
    max_tokens=512,
    extra_body={
        "repetition_penalty": 1.05,
    },
)
print("Chat response:", chat_response)
```
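For offline or batch use without running a server, the snippet below is a minimal sketch using vLLM's Python API (`LLM` and `SamplingParams`). It assumes the customized vLLM fork keeps upstream vLLM's offline-inference interface, and it reuses the sampling settings from the serving examples above.

```python
# Minimal offline-inference sketch. Assumption: the customized vLLM fork keeps
# upstream vLLM's LLM/SamplingParams Python API.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "WinkingFace/Mit-ThinkDeeply-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name)

# Sampling settings matching the serving examples above.
sampling_params = SamplingParams(temperature=0.6, top_p=0.8, repetition_penalty=1.05, max_tokens=512)

messages = [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a helpful assistant."},
    {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"},
]
# Build the prompt with the model's chat template, then generate offline.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```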
## Evaluation & Performance

| Category | Benchmark | Mit-ThinkDeeply-0.5B | Mit-ThinkDeeply-1.5B | Mit-ThinkDeeply-3B | Mit-ThinkDeeply-7B |
|----------|-----------|----------------------|----------------------|--------------------|--------------------|
| | Context Length | 32K | 32K | 32K | 128K |
| | Generation Length | 8K | 8K | 8K | 8K |
| General | MMLU | 45.4 | 58.9 | 63.8 | 72.6 |
| | MMLU-pro | 13.8 | 26.6 | 33.0 | 43.7 |
| | MMLU-redux | 43.1 | 56.8 | 62.7 | 70.3 |
| | BBH | 18.3 | 41.7 | 64.9 | 68.1 |
| | ARC-C | 32.9 | 56.0 | 57.5 | 65.8 |
| Code | LiveCodeBench | 11.5 | 21.4 | 25.9 | 36.2 |
| | HumanEval | 25.4 | 44.6 | 51.6 | 69.5 |
| | HumanEval+ | 29.7 | 38.1 | 43.9 | 60.7 |
| | MBPP | 46.3 | 74.2 | 69.9 | 82.9 |
| | MBPP+ | 36.8 | 59.5 | 59.3 | 70.2 |
| | MultiPL-E | 24.9 | 51.7 | 49.6 | 58.1 |
| Mathematics | GPQA | 25.1 | 29.0 | 31.5 | 40.7 |
| | TheoremQA | 18.2 | 23.2 | 27.9 | 39.4 |
| | MATH | 25.4 | 38.1 | 46.7 | 54.8 |
| | MATH-500 | 62.5 | 79.2 | 88.4 | 94.6 |
| | MMLU-stem | 43.3 | 65.8 | 75.1 | 81.3 |
| | GSM8K | 45.8 | 70.1 | 81.5 | 86.2 |
## License

This code repository and the model weights are licensed under the [Apache 2.0 License](https://huggingface.co/WinkingFace/Mit-0.5B/blob/main/LICENSE). The **Mit-ThinkDeeply** series is fully compatible with commercial use and allows for modifications and derivative works, including but not limited to distillation for training other LLMs. Please note that:

- Mit-ThinkDeeply-0.5B, Mit-ThinkDeeply-1.5B, Mit-ThinkDeeply-3B, and Mit-ThinkDeeply-7B are derived from the [Qwen series](https://huggingface.co/Qwen), which is also licensed under the [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE).

## Citation

If you find our work helpful, feel free to cite us:

```
@misc{mit-thinkdeeply,
  title  = {Mit-ThinkDeeply: Advanced Reasoning and Contextual Awareness in Large Language Models},
  author = {WinkingFace Team},
  year   = {2025},
  url    = {https://huggingface.co/WinkingFace/Mit-ThinkDeeply-7B}
}
```

## Contact

For any questions or inquiries, feel free to [contact us here 📨](mailto:contact@winkingfacehub.com).