---
license: mit
base_model:
- WinkingFace/Mit-7B
extra_gated_fields:
  Date of birth: date_picker
  Country: country
  Affiliation: text
  Job title:
    type: select
    options:
      - Student
      - Research Graduate
      - AI Researcher
      - AI Developer/Engineer
      - Data Scientist
      - Machine Learning Engineer
      - Software Engineer
      - Research Scientist
      - Professor/Academic
      - Product Manager
      - Journalist/Reporter
      - Entrepreneur/Startup Founder
      - Policy Maker/Regulator
      - Other
  geo: ip_location
  ? By clicking Submit below I accept the terms of the WinkingFace license
  : checkbox
extra_gated_description: '[WinkingFace license](http://huggingface.co/WinkingFace/Mit-ThinkDeeply-0.5B/blob/main/LICENSE.md)'
extra_gated_button_content: Submit
---

# Mit-ThinkDeeply-7B

## Model Description

**Mit-ThinkDeeply** is the advanced version of the Mit series of large language models (LLMs) developed by WinkingFace. Built upon the robust foundation of the Mit base model, **Mit-ThinkDeeply** introduces enhanced reasoning capabilities, superior contextual understanding, and refined function-calling precision. The model is designed to seamlessly integrate intuitive conversational abilities with advanced multi-step reasoning, making it well suited for complex analytical tasks, structured problem-solving, and high-stakes decision-making.

Key features of **Mit-ThinkDeeply** include:

- **Advanced Reasoning**: Capable of generating long chains of thought to deeply analyze problems and provide well-reasoned solutions.
- **Enhanced Contextual Awareness**: Improved ability to maintain coherence across multi-turn conversations and long-form interactions.
- **Function Calling Precision**: Optimized for reliable and accurate execution of tool calls, enabling seamless integration with external APIs and services.
- **Versatile Use Cases**: Adaptable for both standard conversational tasks and complex reasoning scenarios, including mathematical problem-solving, code generation, and structured output generation.
- **Long Context Support**: Supports context lengths of up to 128K tokens, ensuring robust performance in applications requiring extensive input data.

**Mit-ThinkDeeply** has undergone extensive architectural refinements and fine-tuning to align more effectively with real-world applications. Our training process emphasizes deeper contextual awareness, enhanced response coherence, and improved execution of function calling, making **Mit-ThinkDeeply** a powerful and versatile AI system.

## Key Features

**Multi-Lingual by Design**
Supports over 29 languages, including but not limited to:
- English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

**Proficient in Coding**
Trained on 80+ programming languages, including Python, Java, C, C++, JavaScript, and Bash. Also supports specialized languages such as Swift and Fortran.

**Advanced Reasoning**
State-of-the-art mathematical and reasoning capabilities, enabling the model to tackle complex problems with step-by-step analysis.

**Robust Context Adherence**
Ensures strong adherence for RAG (Retrieval-Augmented Generation) and large-context applications, maintaining coherence across lengthy interactions.

**System Prompt Support**
Maintains strong support for system prompts, allowing users to define roles, rules, and stylistic preferences for tailored interactions.

## Requirements

Mit's code is integrated into WinkingFace's customized version of the Hugging Face `transformers` library, and we recommend using this modified version for optimal compatibility.
To prevent errors such as:

```
KeyError: 'mit'
```

install the customized `transformers` package using the following command:

```bash
pip install git+https://github.com/WinkingFaceAI/tfm-recooked.git
```

## Usage Recommendations

To achieve optimal performance with **Mit-ThinkDeeply**, we recommend adhering to the following configurations:

- **Temperature Settings**: Set the temperature within the range of 0.5–0.7 (0.6 is recommended) to prevent endless repetition or incoherent outputs.
- **System Prompts**: Avoid adding unnecessary system prompts; all instructions should be contained within the user prompt.
- **Mathematical Problems**: Include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
- **Reasoning Mode**: To activate advanced reasoning, use the following system prompt:
  ```
  You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem.
  ```
- **When Benchmarking the Model**: Conduct multiple tests and average the results to evaluate model performance accurately.

## Prompt Format

**Standard Conversational Mode**

For standard conversational tasks, use the following format:

- Use with pipeline:

```python
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="WinkingFaceAI/Mit-ThinkDeeply-7B")
pipe(messages)
```

- Use with transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WinkingFaceAI/Mit-ThinkDeeply-7B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a helpful assistant."},
    {"role": "user", "content": "What are the most interesting things to do in Paris?"}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to("cuda")

generated_ids = model.generate(
    input_ids,
    max_new_tokens=2500,
    temperature=0.8,
    repetition_penalty=1.1,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id
)
response = tokenizer.decode(
    generated_ids[0],
    skip_special_tokens=True,
    clean_up_tokenization_spaces=True
)
print(f"Response: {response}")
```

**Advanced Reasoning Mode**

To activate advanced reasoning, use the following system prompt:

```python
messages = [
    {
        "role": "system",
        "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem."
    },
    {
        "role": "user",
        "content": "What is y if y=2*2-4+(3*2)?"
    }
]
```

*Example Interaction*

- User Query

```
What is y if y = 2 * 2 - 4 + (3 * 2)?
```

- Model Response

```
Let's break this down step by step:
1. Calculate the multiplication first: 2 * 2 = 4.
2. Subtract 4: 4 - 4 = 0.
3. Calculate the second multiplication: 3 * 2 = 6.
4. Add the results: 0 + 6 = 6.

The answer is \boxed{6}.
```
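To run the reasoning prompt above end to end, here is a minimal sketch using the same `transformers` flow as the standard example; it applies the recommended temperature of 0.6. The checkpoint name follows the earlier examples, and the token budget (`max_new_tokens=4096`) is an illustrative assumption, since long chains of thought need room to complete.

```python
# Minimal sketch: advanced reasoning mode with the recommended temperature of 0.6.
# The max_new_tokens value is an assumption; adjust it to your setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "WinkingFaceAI/Mit-ThinkDeeply-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

reasoning_system_prompt = (
    "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of "
    "using extremely long chains of thought to deeply consider the problem and "
    "deliberate via systematic reasoning processes. Enclose your thoughts and "
    "internal monologue inside tags, and then provide your solution or response "
    "to the problem."
)
messages = [
    {"role": "system", "content": reasoning_system_prompt},
    {"role": "user", "content": "What is y if y = 2 * 2 - 4 + (3 * 2)? "
                                "Please reason step by step, and put your final answer within \\boxed{}."},
]

input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Temperature 0.6 per the usage recommendations; generous budget for the chain of thought.
generated_ids = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.6,
    eos_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(generated_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```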
## Function Calling

Our model was trained on specific system prompts and structures for function calling. The function-calling mechanism allows the model to interact with external tools by generating JSON objects that describe the tool to be called and its arguments.

Tools should be described using JSON Schema, as shown below:

```json
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_fundamentals",
      "description": "Get fundamental data for a given stock symbol using yfinance API.",
      "parameters": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string",
            "description": "The stock symbol (e.g., TSLA for Tesla)."
          }
        },
        "required": ["symbol"]
      }
    }
  }
]
```

**Example Interaction**

- Scenario

Suppose we would like to ask the model about the fundamental data for Tesla (TSLA). The model does not have real-time data access but can use the predefined tool `get_stock_fundamentals` to retrieve the information.

- System Prompt

Provide the system prompt with the available tools and any additional context:

```
You are Mit, created by WinkingFace. You are a helpful assistant.

Available Tools:
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_fundamentals",
      "description": "Get fundamental data for a given stock symbol using yfinance API.",
      "parameters": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string",
            "description": "The stock symbol (e.g., TSLA for Tesla)."
          }
        },
        "required": ["symbol"]
      }
    }
  }
]
```

- User Query

```
Fetch the stock fundamentals data for Tesla (TSLA).
```

- Model Response

The model will generate a tool call in the following format:

```json
{
  "name": "get_stock_fundamentals",
  "arguments": {
    "symbol": "TSLA"
  }
}
```

- Tool Execution

Once the tool call is parsed and executed, the result should be passed back to the model in the following format:

```json
{
  "name": "get_stock_fundamentals",
  "content": {
    "symbol": "TSLA",
    "company_name": "Tesla, Inc.",
    "sector": "Consumer Cyclical",
    "industry": "Auto Manufacturers",
    "market_cap": 611384164352,
    "pe_ratio": 49.604652,
    "pb_ratio": 9.762013,
    "dividend_yield": null,
    "eps": 4.3,
    "beta": 2.427,
    "52_week_high": 299.29,
    "52_week_low": 152.37
  }
}
```

- Final Assistant Response

The model will then generate a natural language response based on the tool output:

```
The stock fundamentals data for Tesla (TSLA) are as follows:
- **Symbol**: TSLA
- **Company Name**: Tesla, Inc.
- **Sector**: Consumer Cyclical
- **Industry**: Auto Manufacturers
- **Market Capitalization**: $611,384,164,352
- **Price-to-Earnings Ratio (PE Ratio)**: 49.60
- **Price-to-Book Ratio (PB Ratio)**: 9.76
- **Dividend Yield**: N/A
- **Trailing Earnings Per Share (EPS)**: $4.3
- **Beta Value of the Stock**: 2.427
- **52-Week High Price of the Stock**: $299.29
- **52-Week Low Price of the Stock**: $152.37

This information provides a snapshot of Tesla's financial position and performance based on the fundamental data obtained from the yfinance API.
```
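To make the round trip above concrete, here is a minimal sketch of the surrounding application code: it assumes the model emits the tool-call JSON shown above as plain text, parses it, runs a locally defined `get_stock_fundamentals` helper (a hypothetical implementation built on `yfinance`), and wraps the result in the format the model expects back. The `"tool"` role name and message layout are assumptions; adapt them to the chat template shipped with the model.

```python
# Hypothetical driver loop for the function-calling example above.
# Assumptions: the model returns the tool-call JSON as plain text, and tool
# results are passed back as an extra message; adjust to the actual chat template.
import json
import yfinance as yf

def get_stock_fundamentals(symbol: str) -> dict:
    """Hypothetical helper: fetch a few fundamental fields via yfinance."""
    info = yf.Ticker(symbol).info
    return {
        "symbol": symbol,
        "company_name": info.get("longName"),
        "sector": info.get("sector"),
        "industry": info.get("industry"),
        "market_cap": info.get("marketCap"),
        "pe_ratio": info.get("trailingPE"),
        "pb_ratio": info.get("priceToBook"),
        "dividend_yield": info.get("dividendYield"),
        "eps": info.get("trailingEps"),
        "beta": info.get("beta"),
        "52_week_high": info.get("fiftyTwoWeekHigh"),
        "52_week_low": info.get("fiftyTwoWeekLow"),
    }

TOOLS = {"get_stock_fundamentals": get_stock_fundamentals}

def run_tool_call(model_output: str) -> str:
    """Parse the tool-call JSON emitted by the model and execute the matching tool."""
    call = json.loads(model_output)
    result = TOOLS[call["name"]](**call["arguments"])
    # Wrap the result in the format shown in the "Tool Execution" step above.
    return json.dumps({"name": call["name"], "content": result})

# In practice, model_output would come from model.generate(); here we reuse the
# literal tool call from the example above.
model_output = '{"name": "get_stock_fundamentals", "arguments": {"symbol": "TSLA"}}'
tool_message = {"role": "tool", "content": run_tool_call(model_output)}
# Append tool_message to the conversation and generate again to obtain the
# final natural-language answer.
```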
## Run locally

**Using the Customized Ollama Version**

Install the customized `Ollama` package (for safety and security reasons, we do not provide prebuilt downloads for this modified version):

```
https://github.com/WinkingFaceAI/olm-recooked.git
```

To pull model checkpoints and run the model, use the `ollama run` command. You can specify the model size by adding a suffix to `v/Mit-ThinkDeeply`, such as `:0.5b`, `:1.5b`, `:3b`, or `:7b`:

```bash
ollama run v/Mit-ThinkDeeply:7b
```

You can also interact with the model using Python. Here's an example using its OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='v/Mit-ThinkDeeply:7b',
)
print(chat_completion.choices[0].message.content)
```

[See more models on Ollama here!](https://ollama.com/_)

**Using Standard Ollama** (weaker performance)

After [installing Ollama](https://github.com/ollama/ollama), you can pull model checkpoints and run the model with the `ollama run` command. You can specify the model size by adding a suffix to `v/Qwen-Mit-ThinkDeeply`, such as `:0.5b`, `:1.5b`, `:3b`, or `:7b`:

```bash
ollama run v/Qwen-Mit-ThinkDeeply:7b
```

You can also interact with the model using Python. Here's an example using its OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='v/Qwen-Mit-ThinkDeeply:7b',
)
print(chat_completion.choices[0].message.content)
```

[See more models on Ollama here!](https://ollama.com/_)

## Deployment

**Using vLLM**

We recommend using the **customized vLLM version** for deploying **Mit-ThinkDeeply** in production environments. Install the customized `vLLM` package (for safety and security reasons, we do not provide prebuilt downloads for this modified version):

```bash
git clone https://github.com/WinkingFaceAI/vllm-recooked.git
```

To serve the model, use the following command:

```bash
vllm serve WinkingFace/Mit-ThinkDeeply-7B
```

Use the chat API via `curl`:

```bash
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "WinkingFace/Mit-ThinkDeeply-7B",
  "messages": [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem."},
    {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"}
  ],
  "temperature": 0.6,
  "top_p": 0.8,
  "repetition_penalty": 1.05,
  "max_tokens": 512
}'
```

Use the chat API via the `OpenAI` package:

```python
from openai import OpenAI

# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="WinkingFace/Mit-ThinkDeeply-7B",
    messages=[
        {"role": "system", "content": "You are Mit, created by WinkingFace. You are a deep thinking AI, capable of using extremely long chains of thought to deeply consider the problem and deliberate via systematic reasoning processes. Enclose your thoughts and internal monologue inside tags, and then provide your solution or response to the problem."},
        {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"},
    ],
    temperature=0.7,
    top_p=0.8,
    max_tokens=512,
    extra_body={
        "repetition_penalty": 1.05,
    },
)
print("Chat response:", chat_response)
```
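For offline or batch use without running a server, the snippet below is a minimal sketch using vLLM's Python API (`LLM` and `SamplingParams`). It assumes the customized vLLM fork keeps upstream vLLM's offline-inference interface, and it reuses the sampling settings from the serving examples above.

```python
# Minimal offline-inference sketch. Assumption: the customized vLLM fork keeps
# upstream vLLM's LLM/SamplingParams Python API.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "WinkingFace/Mit-ThinkDeeply-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name)

# Sampling settings matching the serving examples above.
sampling_params = SamplingParams(temperature=0.6, top_p=0.8, repetition_penalty=1.05, max_tokens=512)

messages = [
    {"role": "system", "content": "You are Mit, created by WinkingFace. You are a helpful assistant."},
    {"role": "user", "content": "If a regular hexagon has a short diagonal of 64, what is its long diagonal?"},
]
# Build the prompt with the model's chat template, then generate offline.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```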
## Evaluation & Performance

| Category | Benchmark | Mit-ThinkDeeply-0.5B | Mit-ThinkDeeply-1.5B | Mit-ThinkDeeply-3B | Mit-ThinkDeeply-7B |
|----------|-----------|----------------------|----------------------|--------------------|--------------------|
| | Context Length | 32K | 32K | 32K | 128K |
| | Generation Length | 8K | 8K | 8K | 8K |
| General | MMLU | 45.4 | 58.9 | 63.8 | 72.6 |
| | MMLU-pro | 13.8 | 26.6 | 33.0 | 43.7 |
| | MMLU-redux | 43.1 | 56.8 | 62.7 | 70.3 |
| | BBH | 18.3 | 41.7 | 64.9 | 68.1 |
| | ARC-C | 32.9 | 56.0 | 57.5 | 65.8 |
| Code | LiveCodeBench | 11.5 | 21.4 | 25.9 | 36.2 |
| | HumanEval | 25.4 | 44.6 | 51.6 | 69.5 |
| | HumanEval+ | 29.7 | 38.1 | 43.9 | 60.7 |
| | MBPP | 46.3 | 74.2 | 69.9 | 82.9 |
| | MBPP+ | 36.8 | 59.5 | 59.3 | 70.2 |
| | MultiPL-E | 24.9 | 51.7 | 49.6 | 58.1 |
| Mathematics | GPQA | 25.1 | 29.0 | 31.5 | 40.7 |
| | TheoremQA | 18.2 | 23.2 | 27.9 | 39.4 |
| | MATH | 25.4 | 38.1 | 46.7 | 54.8 |
| | MATH-500 | 62.5 | 79.2 | 88.4 | 94.6 |
| | MMLU-stem | 43.3 | 65.8 | 75.1 | 81.3 |
| | GSM8K | 45.8 | 70.1 | 81.5 | 86.2 |
## License

This code repository and the model weights are licensed under the [Apache 2.0 License](https://huggingface.co/WinkingFace/Mit-0.5B/blob/main/LICENSE). The **Mit-ThinkDeeply** series is fully compatible with commercial use and allows for modifications and derivative works, including but not limited to distillation for training other LLMs. Please note that:

- Mit-ThinkDeeply-0.5B, Mit-ThinkDeeply-1.5B, Mit-ThinkDeeply-3B, and Mit-ThinkDeeply-7B are derived from the [Qwen series](https://huggingface.co/Qwen), which is also licensed under the [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE).

## Citation

If you find our work helpful, feel free to cite us:

```
@misc{mit-thinkdeeply,
  title  = {Mit-ThinkDeeply: Advanced Reasoning and Contextual Awareness in Large Language Models},
  author = {WinkingFace Team},
  year   = {2025},
  url    = {https://huggingface.co/WinkingFace/Mit-ThinkDeeply-7B}
}
```

## Contact

For any questions or inquiries, feel free to [contact us here 📨](mailto:contact@winkingfacehub.com).