nguyenthanhthuan committed
Commit f00ba25 · verified · 1 Parent(s): 891b963

Update README.md

Files changed (1):
  1. README.md +253 -8

README.md CHANGED
@@ -1,22 +1,267 @@
  ---
- base_model: unsloth/llama-3.2-1b-instruct-bnb-4bit
  language:
  - en
  license: apache-2.0
  tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
- - gguf
  ---

- # Uploaded model

- - **Developed by:** nguyenthanhthuan
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/llama-3.2-1b-instruct-bnb-4bit

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

---
base_model:
- meta-llama/Llama-3.2-1B-Instruct
language:
- en
- vi
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- Ollama
- Tool-Calling
datasets:
- hiyouga/glaive-function-calling-v2-sharegpt
---
# Function Calling Llama Model Version 2

## Overview
A specialized fine-tuned version of the **`meta-llama/Llama-3.2-1B-Instruct`** model, enhanced with function/tool-calling capabilities and trained on the **`hiyouga/glaive-function-calling-v2-sharegpt`** dataset.

## Model Specifications

* **Base Architecture**: meta-llama/Llama-3.2-1B-Instruct
* **Primary Languages**: English (function/tool calling), Vietnamese
* **License**: Apache 2.0
* **Primary Developer**: nguyenthanhthuan_banhmi
* **Key Capabilities**: text-generation-inference, transformers, unsloth, llama, trl, Ollama, Tool-Calling

## Getting Started

### Prerequisites

Method 1:
1. Install [Ollama](https://ollama.com/)
2. Install the required Python packages:
```bash
pip install langchain pydantic torch langchain-ollama langchain_core
```

Method 2:
1. Click **Use this model** on the model page
2. Click **Ollama** to get a ready-made `ollama run` command (a sketch of its shape is shown below)
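
For repositories that ship GGUF weights, the **Use this model → Ollama** button produces a command of roughly the following shape (the repository path below is a placeholder, not the exact id):

```bash
ollama run hf.co/<username>/<repository>
```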

### Installation Steps

1. Clone the repository
2. Navigate to the project directory
3. Create the model in Ollama (a minimal example Modelfile is sketched below):
```bash
ollama create <model_name> -f <path_to_modelfile>
```
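
The Modelfile referenced above is not included here; a minimal hypothetical sketch, assuming the weights are exported as a local GGUF file (file name and parameter values are illustrative):

```
# Hypothetical Modelfile: point Ollama at the local GGUF weights
FROM ./model.gguf

# Conservative sampling tends to keep tool-call output well-formed (illustrative values)
PARAMETER temperature 0.2
PARAMETER num_ctx 4096
```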

## Implementation Guide

### Model Initialization

```python
from langchain_ollama import ChatOllama

# Initialize a model instance
llm = ChatOllama(model="<model_name>")
```
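
Generation settings can also be pinned at initialization; `temperature` and `num_ctx` are standard `ChatOllama` parameters (the values below are illustrative, not tuned settings from this model):

```python
# A lower temperature tends to keep structured tool-call output more consistent
llm = ChatOllama(model="<model_name>", temperature=0, num_ctx=4096)
```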

### Basic Usage Example

```python
# Arithmetic computation example
query = "What is 3 * 12? Also, what is 11 + 49?"
response = llm.invoke(query)

print(response.content)
# Output:
# 1. 3 times 12 is 36.
# 2. 11 plus 49 is 60.
```

### Advanced Function Calling (English Recommended)

#### Basic Arithmetic Tools (Different from the first version)

```python
from pydantic import BaseModel, Field


# Note that the docstrings here are crucial, as they will be passed along
# to the model along with the class name.
class add(BaseModel):
    """Add two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")


class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")


tools = [add, multiply]
llm_with_tools = llm.bind_tools(tools)

# Execute the query and parse the result (different from the first version)
from langchain_core.output_parsers.openai_tools import PydanticToolsParser

query = "What is 3 * 12? Also, what is 11 + 49?"
chain = llm_with_tools | PydanticToolsParser(tools=tools)
result = chain.invoke(query)
print(result)

# Output:
# [multiply(a=3, b=12), add(a=11, b=49)]
```
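
The parser returns Pydantic instances rather than computed values. Continuing from the example above, a minimal sketch of dispatching the parsed calls to real Python implementations (the `run_tool` helper is illustrative, not part of the model or LangChain):

```python
# Map each parsed tool call to an actual implementation (illustrative helper)
def run_tool(call):
    if isinstance(call, add):
        return call.a + call.b
    if isinstance(call, multiply):
        return call.a * call.b
    raise ValueError(f"Unknown tool call: {call!r}")

for call in result:
    print(type(call).__name__, "->", run_tool(call))
# Expected, given the parsed calls above:
# multiply -> 36
# add -> 60
```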

#### Complex Tool Integration (Different from the first version)

```python
from pydantic import BaseModel, Field
from typing import List, Optional


class SendEmail(BaseModel):
    """Send an email to specified recipients."""

    to: List[str] = Field(..., description="List of email recipients")
    subject: str = Field(..., description="Email subject")
    body: str = Field(..., description="Email content/body")
    cc: Optional[List[str]] = Field(None, description="CC recipients")
    attachments: Optional[List[str]] = Field(None, description="List of attachment file paths")


class WeatherInfo(BaseModel):
    """Get weather information for a specific location."""

    city: str = Field(..., description="City name")
    country: Optional[str] = Field(None, description="Country name")
    units: str = Field("celsius", description="Temperature units (celsius/fahrenheit)")


class SearchWeb(BaseModel):
    """Search the web for a given query."""

    query: str = Field(..., description="Search query")
    num_results: int = Field(5, description="Number of results to return")
    language: str = Field("en", description="Search language")


class CreateCalendarEvent(BaseModel):
    """Create a calendar event."""

    title: str = Field(..., description="Event title")
    start_time: str = Field(..., description="Event start time (ISO format)")
    end_time: str = Field(..., description="Event end time (ISO format)")
    description: Optional[str] = Field(None, description="Event description")
    attendees: Optional[List[str]] = Field(None, description="List of attendee emails")


class TranslateText(BaseModel):
    """Translate text between languages."""

    text: str = Field(..., description="Text to translate")
    source_lang: str = Field(..., description="Source language code (e.g., 'en', 'es')")
    target_lang: str = Field(..., description="Target language code (e.g., 'fr', 'de')")


class SetReminder(BaseModel):
    """Set a reminder for a specific time."""

    message: str = Field(..., description="Reminder message")
    time: str = Field(..., description="Reminder time (ISO format)")
    priority: str = Field("normal", description="Priority level (low/normal/high)")


tools = [
    SendEmail,
    WeatherInfo,
    SearchWeb,
    CreateCalendarEvent,
    TranslateText,
    SetReminder,
]
llm_tools = llm.bind_tools(tools)

# Execute the query and parse the result (different from the first version)
from langchain_core.output_parsers.openai_tools import PydanticToolsParser

query = "Set a reminder to call John at 3 PM tomorrow. Also, translate 'Hello, how are you?' to Spanish."
chain = llm_tools | PydanticToolsParser(tools=tools)
result = chain.invoke(query)
print(result)

# Output:
# [SetReminder(message='Set a reminder for a specific time.', time='3 PM tomorrow', priority='normal'),
#  TranslateText(text='Hello, how are you?', source_lang='en', target_lang='es')]
```
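
If the unparsed tool-call payload is needed instead of Pydantic objects, the `AIMessage` returned by the bound model exposes a `tool_calls` list (continuing from the example above; the printed arguments are only an example of the shape):

```python
# Inspect the raw tool calls on the model response
msg = llm_tools.invoke(query)
for call in msg.tool_calls:
    print(call["name"], call["args"])
# e.g. SetReminder {'message': '...', 'time': '3 PM tomorrow', 'priority': 'normal'}
```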

## Core Features

* Arithmetic computation support
* Advanced function/tool-calling capabilities
* Seamless LangChain integration
* Full Ollama platform compatibility

## Technical Details

### Dataset Information

Training used the **`hiyouga/glaive-function-calling-v2-sharegpt`** dataset, which provides a broad set of function-calling interaction examples.
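
To inspect the training data directly, the dataset can be loaded with the `datasets` library (a minimal sketch; assumes `pip install datasets` and a standard `train` split):

```python
from datasets import load_dataset

# Load the function-calling dataset referenced above and look at one conversation
ds = load_dataset("hiyouga/glaive-function-calling-v2-sharegpt", split="train")
print(ds[0])
```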

### Known Limitations

* Function/tool calling remains basic
* Function/tool calling is reliable in English only
* Requires a local Ollama installation

## Important Notes & Considerations

### Potential Limitations and Edge Cases

* **Function Parameter Sensitivity**: The model may occasionally misinterpret complex parameter combinations, especially when multiple optional parameters are involved. Double-check parameter values in critical applications.

* **Response Format Variations**:
  - In some cases the function-calling output may deviate from the expected JSON structure
  - The model may generate additional explanatory text alongside the function call
  - Multiple function calls in a single query might not always be processed in the expected order

* **Error Handling Considerations**:
  - Empty or null values might not be handled consistently across different function types
  - Complex nested objects may sometimes be flattened unexpectedly
  - Array inputs might occasionally be processed as single values

### Best Practices for Reliability

1. **Input Validation**:
   - Always validate input parameters before processing
   - Implement proper error handling for malformed function calls
   - Consider adding default values for optional parameters

2. **Testing Recommendations**:
   - Test with various input combinations and edge cases
   - Implement retry logic for inconsistent responses (see the sketch after this list)
   - Log and monitor function call patterns for debugging

3. **Performance Optimization**:
   - Keep function descriptions concise and clear
   - Limit the number of simultaneous function calls
   - Cache frequently used function results when possible
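
A minimal sketch of the retry-with-validation pattern suggested above (the `invoke_with_retry` helper is illustrative, not part of LangChain; the exact exception raised for malformed arguments may differ by version):

```python
import logging

from pydantic import ValidationError

logger = logging.getLogger(__name__)


def invoke_with_retry(chain, query, max_attempts=3):
    """Retry a tool-calling chain when it returns nothing or malformed arguments."""
    for attempt in range(1, max_attempts + 1):
        try:
            calls = chain.invoke(query)
            if calls:  # non-empty list of parsed tool calls
                return calls
            logger.warning("Attempt %d returned no tool calls", attempt)
        except ValidationError as exc:
            logger.warning("Attempt %d produced malformed arguments: %s", attempt, exc)
    raise RuntimeError(f"No valid tool calls after {max_attempts} attempts")
```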

### Known Issues

* The model may struggle with:
  - Very long function descriptions
  - Highly complex nested parameter structures
  - Ambiguous or overlapping function purposes
  - Non-English parameter values or descriptions

## Development

### Contributing Guidelines

We welcome contributions through issues and pull requests for improvements and bug fixes.

### License Information

Released under the Apache 2.0 license. See the LICENSE file for complete terms.

## Academic Citation

```bibtex
@misc{function-calling-llama,
  author    = {nguyenthanhthuan_banhmi},
  title     = {Function Calling Llama Model Version 2},
  year      = {2024},
  publisher = {GitHub},
  journal   = {GitHub repository}
}
```