---
base_model: unsloth/DeepSeek-R1-Distill-Qwen-1.5B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
license: apache-2.0
language:
- en
---

Function calling requires two inference steps. An example is shown below.

# Step 1:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import json

model_id = "R1_tool_call_Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Tool definitions, described with JSON Schema parameters
tools = [
    {
        "name": "create_contact",
        "description": "Create a new contact",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "description": "The name of the contact"
                },
                "email": {
                    "type": "string",
                    "description": "The email address of the contact"
                }
            },
            "required": ["name", "email"]
        }
    }
]

# The tool list is interpolated directly into the user prompt
messages = [
    {
        "role": "user",
        "content": f"""You have access to these tools, use them if necessary: {tools}

I need to create a new contact for my friend John Doe. His email is johndoe@example.com."""
    }
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the newly generated tokens, not the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# >>
# >> <|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>create_contact
# >> ```json
# >> {"name": "John Doe", "email": "johndoe@example.com"}
# >> ```<|tool▁call▁end|><|tool▁calls▁end|>

# The above is the assistant's response. You need to parse it and execute the tool yourself.
```

# Step 2:

```python
# Reuses `tokenizer`, `model`, `tools`, and `json` from Step 1.
# The tool call parsed from Step 1 and the tool's result are appended to the conversation.
messages = [
    # f-string so that {tools} is interpolated, giving the same prompt as in Step 1
    {"role": "user", "content": f"""You have access to these tools, use them if necessary: {tools}\n\nI need to create a new contact for my friend John Doe. His email is johndoe@example.com."""},
    {"role": "assistant", "content": None, "tool_calls": [
        {
            "type": "function",
            "function": {
                "name": "create_contact",
                "arguments": json.dumps({"name": "John Doe", "email": "johndoe@example.com"})
            }
        },
    ]},
    {"role": "tool", "name": "create_contact", "content": """{"status": "success", "message": "Contact for John Doe with email johndoe@example.com has been created successfully."}"""},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# >>
# >> Based on the user's request, I have created a contact for John Doe with his email address. The tool has successfully created the contact. I will now provide the contact information to the user.
# >> The contact for John Doe has been successfully created with the email address johndoe@example.com. Please feel free to reach out to him if needed.
```

# Limitations:

- The model sometimes refuses to skip its thinking step.

# Uploaded model

- **Developed by:** hiieu
- **License:** apache-2.0
- **Finetuned from model :** unsloth/DeepSeek-R1-Distill-Qwen-1.5B-unsloth-bnb-4bit

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
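
# Parsing the tool call:

Step 1 leaves parsing the response and running the tool to the caller. The sketch below is one possible way to do that, assuming the output follows the format printed in Step 1 (a `function` call, the tool name, then a fenced JSON argument block). The names `parse_tool_calls`, `build_followup_messages`, `TOOL_REGISTRY`, and the stand-in `create_contact` function are hypothetical helpers for illustration and are not part of the model or of `transformers`. The pattern accepts both ASCII and fullwidth vertical bars in the markers, since the exact characters depend on the tokenizer.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence, built this way to keep the snippet fence-safe


def marker(name: str) -> str:
    """Regex for a marker such as <|tool▁call▁begin|>, with ASCII or fullwidth bars."""
    return r"<[|｜]" + re.escape(name) + r"[|｜]>"


# One tool call: begin marker, call type, separator, tool name, fenced JSON, end marker.
TOOL_CALL_RE = re.compile(
    marker("tool▁call▁begin")
    + r"(?P<type>\w+)"
    + marker("tool▁sep")
    + r"(?P<name>[^\n`]+?)\s*"
    + FENCE + r"json\s*(?P<args>.*?)\s*" + FENCE
    + r"\s*" + marker("tool▁call▁end"),
    re.DOTALL,
)


def parse_tool_calls(text: str) -> list[dict]:
    """Extract every tool call found in the decoded model output."""
    calls = []
    for m in TOOL_CALL_RE.finditer(text):
        calls.append({
            "type": m.group("type"),
            "function": {
                "name": m.group("name").strip(),
                # json.loads raises ValueError if the model emitted malformed JSON
                "arguments": json.loads(m.group("args")),
            },
        })
    return calls


def create_contact(name: str, email: str) -> dict:
    """Hypothetical stand-in for a real tool implementation."""
    return {
        "status": "success",
        "message": f"Contact for {name} with email {email} has been created successfully.",
    }


# Hypothetical registry mapping tool names to Python callables
TOOL_REGISTRY = {"create_contact": create_contact}


def build_followup_messages(calls: list[dict], registry: dict) -> list[dict]:
    """Run each call and return the assistant and tool messages used in Step 2."""
    assistant = {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "type": c["type"],
                "function": {
                    "name": c["function"]["name"],
                    "arguments": json.dumps(c["function"]["arguments"]),
                },
            }
            for c in calls
        ],
    }
    tool_messages = []
    for c in calls:
        name = c["function"]["name"]
        result = registry[name](**c["function"]["arguments"])  # KeyError for unknown tools
        tool_messages.append({"role": "tool", "name": name, "content": json.dumps(result)})
    return [assistant, *tool_messages]


# Sample text in the format of the Step 1 output
sample = (
    "<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>create_contact\n"
    + FENCE + 'json\n{"name": "John Doe", "email": "johndoe@example.com"}\n' + FENCE
    + "<|tool▁call▁end|><|tool▁calls▁end|>"
)
calls = parse_tool_calls(sample)
followup = build_followup_messages(calls, TOOL_REGISTRY)
```

The messages in `followup` can be appended after the original user message to form the Step 2 conversation. If `parse_tool_calls` returns an empty list, the model answered directly and no second inference step is needed.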