Mit-ThinkDeeply-7B

Model Description

Mit-ThinkDeeply is the advanced version of the Mit series of large language models (LLMs) developed by WinkingFace. Built upon the robust foundation of the Mit base model, Mit-ThinkDeeply introduces enhanced reasoning capabilities, superior contextual understanding, and refined function-calling precision. This model is designed to seamlessly integrate intuitive conversational abilities with advanced multi-step reasoning, making it ideal for complex analytical tasks, structured problem-solving, and high-stakes decision-making.

Key features of Mit-ThinkDeeply include:

  • Advanced Reasoning: Capable of generating long chains of thought to deeply analyze problems and provide well-reasoned solutions.
  • Enhanced Contextual Awareness: Improved ability to maintain coherence across multi-turn conversations and long-form interactions.
  • Function Calling Precision: Optimized for reliable and accurate execution of tool calls, enabling seamless integration with external APIs and services.
  • Versatile Use Cases: Adaptable for both standard conversational tasks and complex reasoning scenarios, including mathematical problem-solving, code generation, and structured output generation.
  • Long Context Support: Supports context lengths of up to 128K tokens, ensuring robust performance in applications requiring extensive input data.

Mit-ThinkDeeply has undergone extensive architectural refinements and fine-tuning to align more effectively with real-world applications. Our training process emphasizes deeper contextual awareness, enhanced response coherence, and improved execution of function-calling, making Mit-ThinkDeeply a powerful and versatile AI system.

Key Features

Multi-Lingual by Design

Supports over 29 languages, including but not limited to:

  • English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

Proficient in Coding

Trained on 80+ programming languages, including Python, Java, C, C++, JavaScript, and Bash. Also supports specialized languages like Swift and Fortran.

Advanced Reasoning

State-of-the-art mathematical and reasoning capabilities, enabling the model to tackle complex problems with step-by-step analysis.

Robust Context Adherence

Ensures strong adherence for RAG (Retrieval-Augmented Generation) and large-context applications, maintaining coherence across lengthy interactions.

System Prompt Support

Maintains strong support for system prompts, allowing users to define roles, rules, and stylistic preferences for tailored interactions.

Requirements

Mit's code is integrated into WinkingFace's customized build of Hugging Face transformers, and we recommend using this modified version for optimal compatibility.

To prevent potential errors, such as:

KeyError: 'mit'

Install the customized transformers package using the following command:

pip install git+https://github.com/WinkingFaceAI/tfm-recooked.git
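After installation, you can quickly verify that the custom mit architecture is registered. Below is a minimal sketch (the model ID follows the examples later in this card):

from transformers import AutoConfig

# If the customized transformers build is installed correctly, this should
# load the model configuration without raising KeyError: 'mit'.
config = AutoConfig.from_pretrained("WinkingFaceAI/Mit-ThinkDeeply-7B")
print(config.model_type)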

Usage Recommendations

To achieve optimal performance with Mit-ThinkDeeply, we recommend adhering to the following configurations:

  • Temperature Settings: Set the temperature within the range of 0.5–0.7 (0.6 is recommended) to prevent endless repetitions or incoherent outputs (see the sketch after this list).
  • System Prompts: Avoid adding unnecessary system prompts; aside from the reasoning-mode prompt below, all instructions should be contained within the user prompt.
  • Mathematical Problems: Include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
  • Reasoning Mode: To activate advanced reasoning, use the following system prompt:
You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside <think_deeply> </think_deeply> tags, and then provide your solution or response to the problem.
  • When Benchmarking the Model: Conduct multiple tests and average the results to evaluate model performance accurately.
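Below is a minimal sketch that applies these recommendations together with transformers (the arithmetic question is only a placeholder; the exact values follow the guidance above and are not hard requirements):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WinkingFaceAI/Mit-ThinkDeeply-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Math-style prompt with the recommended step-by-step / \boxed{} directive,
# kept in the user turn rather than in a system prompt.
messages = [
    {"role": "user", "content": "Solve 3x + 5 = 20. Please reason step by step, and put your final answer within \\boxed{}."},
]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)

# Recommended sampling range is 0.5-0.7; 0.6 is used here.
output_ids = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.6)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))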

Prompt Format

Standard Conversational Mode

For standard conversational tasks, use the following format:

  • Use with pipeline:
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="WinkingFaceAI/Mit-ThinkDeeply-7B")
pipe(messages)
  • Use with transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WinkingFaceAI/Mit-ThinkDeeply-7B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a helpful assistant."},
    {"role": "user", "content": "What are the most interesting things to do in Paris?"}
]

input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to("cuda")
generated_ids = model.generate(input_ids, max_new_tokens=2500, temperature=0.8, repetition_penalty=1.1, do_sample=True, eos_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt portion
response = tokenizer.decode(generated_ids[0][input_ids.shape[-1]:], skip_special_tokens=True, clean_up_tokenization_spaces=True)
print(f"Response: {response}")

Advanced Reasoning Mode

To activate advanced reasoning, use the following system prompt:

messages = [
    {
        "role": "system",
        "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside <think_deeply> </think_deeply> tags, and then provide your solution or response to the problem."
    },
    {
        "role": "user",
        "content": "What is y if y=2*2-4+(3*2)?"
    }
]

Example Interaction

  • User Query
What is y if y = 2 * 2 - 4 + (3 * 2)?
  • Model Response
<think_deeply>
Let's break this down step by step:
1. Calculate the multiplication first: 2 * 2 = 4.
2. Subtract 4: 4 - 4 = 0.
3. Calculate the second multiplication: 3 * 2 = 6.
4. Add the results: 0 + 6 = 6.
</think_deeply>
The answer is \boxed{6}.
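
When reasoning mode is active, downstream code often needs to separate the internal monologue from the final answer. The following is a minimal sketch of one way to do this (the tag names follow the system prompt above; the splitting helper itself is illustrative and not part of the model's API):

import re

def split_reasoning(text: str):
    """Split a reasoning-mode completion into (thoughts, final_answer)."""
    match = re.search(r"<think_deeply>(.*?)</think_deeply>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block found; treat the whole completion as the answer.
        return None, text.strip()
    thoughts = match.group(1).strip()
    answer = text[match.end():].strip()
    return thoughts, answer

completion = """<think_deeply>
Let's break this down step by step:
1. Calculate the multiplication first: 2 * 2 = 4.
2. Subtract 4: 4 - 4 = 0.
3. Calculate the second multiplication: 3 * 2 = 6.
4. Add the results: 0 + 6 = 6.
</think_deeply>
The answer is \\boxed{6}."""

thoughts, answer = split_reasoning(completion)
print(answer)  # The answer is \boxed{6}.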

Function Calling

Our model was trained on specific system prompts and structures for function calling. The function-calling mechanism allows the model to interact with external tools by generating JSON objects that describe the tool to be called and its arguments.

  • Tools should be described using JSON Schema, as shown below:
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_fundamentals",
      "description": "Get fundamental data for a given stock symbol using yfinance API.",
      "parameters": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string",
            "description": "The stock symbol (e.g., TSLA for Tesla)."
          }
        },
        "required": ["symbol"]
      }
    }
  }
]
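
The schema above only declares the tool; the host application is responsible for implementing and executing it. Below is a minimal sketch of what get_stock_fundamentals might look like with yfinance (field names such as marketCap and trailingPE are standard yfinance keys, but they can vary across yfinance versions):

import yfinance as yf

def get_stock_fundamentals(symbol: str) -> dict:
    """Fetch fundamental data for a stock symbol via the yfinance API."""
    info = yf.Ticker(symbol).info
    return {
        "symbol": symbol,
        "company_name": info.get("longName"),
        "sector": info.get("sector"),
        "industry": info.get("industry"),
        "market_cap": info.get("marketCap"),
        "pe_ratio": info.get("trailingPE"),
        "pb_ratio": info.get("priceToBook"),
        "dividend_yield": info.get("dividendYield"),
        "eps": info.get("trailingEps"),
        "beta": info.get("beta"),
        "52_week_high": info.get("fiftyTwoWeekHigh"),
        "52_week_low": info.get("fiftyTwoWeekLow"),
    }

print(get_stock_fundamentals("TSLA"))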

Example Interaction

  • Scenario

Suppose we would like to ask the model about the fundamental data for Tesla (TSLA). The model does not have real-time data access but can use the predefined tool get_stock_fundamentals to retrieve the information.

  • System Prompt

Provide the system prompt with the available tools and any additional context:

You are Mit, created by WinkingFace. You are a helpful assistant.

Available Tools:
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_fundamentals",
      "description": "Get fundamental data for a given stock symbol using yfinance API.",
      "parameters": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string",
            "description": "The stock symbol (e.g., TSLA for Tesla)."
          }
        },
        "required": ["symbol"]
      }
    }
  }
]

  • User Query
Fetch the stock fundamentals data for Tesla (TSLA).
  • Model Response

The model will generate a tool call in the following format:

{
  "name": "get_stock_fundamentals",
  "arguments": {
    "symbol": "TSLA"
  }
}
  • Tool Execution

Once the tool call is parsed and executed, the result should be passed back to the model in the following format:

{
  "name": "get_stock_fundamentals",
  "content": {
    "symbol": "TSLA",
    "company_name": "Tesla, Inc.",
    "sector": "Consumer Cyclical",
    "industry": "Auto Manufacturers",
    "market_cap": 611384164352,
    "pe_ratio": 49.604652,
    "pb_ratio": 9.762013,
    "dividend_yield": null,
    "eps": 4.3,
    "beta": 2.427,
    "52_week_high": 299.29,
    "52_week_low": 152.37
  }
}
  • Final Assistant Response

The model will then generate a natural language response based on the tool output:

The stock fundamentals data for Tesla (TSLA) are as follows:
- **Symbol**: TSLA
- **Company Name**: Tesla, Inc.
- **Sector**: Consumer Cyclical
- **Industry**: Auto Manufacturers
- **Market Capitalization**: $611,384,164,352
- **Price-to-Earnings Ratio (PE Ratio)**: 49.60
- **Price-to-Book Ratio (PB Ratio)**: 9.76
- **Dividend Yield**: N/A
- **Earnings Per Share (EPS)**: $4.30
- **Beta**: 2.427
- **52-Week High**: $299.29
- **52-Week Low**: $152.37

This information provides a snapshot of Tesla's financial position and performance based on the fundamental data obtained from the yfinance API.
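
Putting the steps above together, the host application typically parses the model's JSON tool call, executes the matching function, and returns the result in the tool-output format shown. Below is a minimal dispatcher sketch (the message formats follow the example above; how the tool call is delimited in raw model output depends on the chat template, so the plain-JSON parsing here is an assumption):

import json

def get_stock_fundamentals(symbol: str) -> dict:
    # Stand-in for the yfinance-backed implementation sketched earlier.
    return {"symbol": symbol, "company_name": "Tesla, Inc."}

# Map tool names declared in the system prompt to local implementations.
TOOLS = {"get_stock_fundamentals": get_stock_fundamentals}

def run_tool_call(raw_tool_call: str) -> dict:
    """Parse the model's JSON tool call, execute it, and build the tool-result message."""
    call = json.loads(raw_tool_call)
    result = TOOLS[call["name"]](**call["arguments"])
    # This dict mirrors the tool-output format shown above; it is then passed
    # back to the model so it can produce the final natural language response.
    return {"name": call["name"], "content": result}

raw = '{"name": "get_stock_fundamentals", "arguments": {"symbol": "TSLA"}}'
print(json.dumps(run_tool_call(raw), indent=2))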

Run locally

Using the Customized Ollama Version

Install the customized Ollama package from the repository below (for safety and security reasons, we do not provide any other download methods for this modified version):

https://github.com/WinkingFaceAI/olm-recooked.git

To pull the model checkpoint and run the model, use the ollama run command. You can specify the model size by adding a suffix to v/Mit-ThinkDeeply, such as :0.5b, :1.5b, :3b, or :7b:

ollama run v/Mit-ThinkDeeply:7b

You can also interact with the model using Python. Here's an example using its OpenAI-compatible API:

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='v/Mit-ThinkDeeply:7b',
)

print(chat_completion.choices[0].message.content)

See more models on ollama here!

Using Standard Ollama (results will be weaker)

After installing Ollama, pull the model checkpoint and run the model with the ollama run command. You can specify the model size by adding a suffix to v/Qwen-Mit-ThinkDeeply, such as :0.5b, :1.5b, :3b, or :7b:

ollama run v/Qwen-Mit-ThinkDeeply:7b

You can also interact with the model using Python. Here's an example using its OpenAI-compatible API:

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='v/Qwen-Mit-ThinkDeeply:7b',
)

print(chat_completion.choices[0].message.content)

See more models on ollama here!

Deployment

Using vLLM

We recommend using our customized vLLM build for deploying Mit-ThinkDeeply in production environments.

Install the customized vLLM package from the repository below (for safety and security reasons, we do not provide any other download methods for this modified version):

git clone https://github.com/WinkingFaceAI/vllm-recooked.git

To serve the model, use the following command:

vllm serve WinkingFace/Mit-ThinkDeeply-7B

Use the chat API via curl:

curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "WinkingFace/Mit-ThinkDeeply-7B",
    "messages": [
        {"role": "system", "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside <think_deeply> </think_deeply> tags, and then provide your solution or response to the problem."},
        {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"}
    ],
    "temperature": 0.6,
    "top_p": 0.8,
    "repetition_penalty": 1.05,
    "max_tokens": 512
}'

Use the chat API via OpenAI package:

from openai import OpenAI
# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="Mit-ThinkDeeply-7B",
    messages=[
        {"role": "system", "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside <think_deeply> </think_deeply> tags, and then provide your solution or response to the problem."},
        {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"},
    ],
    temperature=0.7,
    top_p=0.8,
    max_tokens=512,
    extra_body={
        "repetition_penalty": 1.05,
    },
)
print("Chat response:", chat_response)

Evaluation & Performance

| Category | Benchmark | Mit-ThinkDeeply-0.5B | Mit-ThinkDeeply-1.5B | Mit-ThinkDeeply-3B | Mit-ThinkDeeply-7B |
|---|---|---|---|---|---|
|  | Context Length | 32K | 32K | 32K | 128K |
|  | Generation Length | 8K | 8K | 8K | 8K |
| General | MMLU | 45.4 | 58.9 | 63.8 | 72.6 |
|  | MMLU-pro | 13.8 | 26.6 | 33.0 | 43.7 |
|  | MMLU-redux | 43.1 | 56.8 | 62.7 | 70.3 |
|  | BBH | 18.3 | 41.7 | 64.9 | 68.1 |
|  | ARC-C | 32.9 | 56.0 | 57.5 | 65.8 |
| Code | LiveCodeBench | 11.5 | 21.4 | 25.9 | 36.2 |
|  | HumanEval | 25.4 | 44.6 | 51.6 | 69.5 |
|  | HumanEval+ | 29.7 | 38.1 | 43.9 | 60.7 |
|  | MBPP | 46.3 | 74.2 | 69.9 | 82.9 |
|  | MBPP+ | 36.8 | 59.5 | 59.3 | 70.2 |
|  | MultiPL-E | 24.9 | 51.7 | 49.6 | 58.1 |
| Mathematics | GPQA | 25.1 | 29.0 | 31.5 | 40.7 |
|  | TheoremQA | 18.2 | 23.2 | 27.9 | 39.4 |
|  | MATH | 25.4 | 38.1 | 46.7 | 54.8 |
|  | MATH-500 | 62.5 | 79.2 | 88.4 | 94.6 |
|  | MMLU-stem | 43.3 | 65.8 | 75.1 | 81.3 |
|  | GSM8K | 45.8 | 70.1 | 81.5 | 86.2 |

License

This code repository and the model weights are licensed under the Apache 2.0 License. The Mit-ThinkDeeply series is fully compatible with commercial use and allows for modifications and derivative works, including but not limited to distillation for training other LLMs.

Please note that:

  • Mit-ThinkDeeply-0.5B, Mit-ThinkDeeply-1.5B, Mit-ThinkDeeply-3B, and Mit-ThinkDeeply-7B are derived from the Qwen series, which is also licensed under the Apache 2.0 License.

Citation

If you find our work helpful, feel free to cite us:


@misc{mit-thinkdeeply,
    title = {Mit-ThinkDeeply: Advanced Reasoning and Contextual Awareness in Large Language Models},
    author = {WinkingFace Team},
    year = {2025},
    url = {https://huggingface.co/WinkingFace/Mit-ThinkDeeply-7B}
}

Contact

For any questions or inquiries, feel free to contact us here.
