giux78 committed on
Commit
904d6df
1 Parent(s): b1a1070

Update README.md

Files changed (1)
  1. README.md +131 -147
README.md CHANGED
@@ -1,199 +1,183 @@
1
  ---
2
  library_name: transformers
3
- tags: []
 
 
 
 
 
4
  ---
 
5
 
6
- # Model Card for Model ID
 
 
7
 
8
- <!-- Provide a quick summary of what the model is/does. -->
 
 
9
 
10
 
 
11
 
12
- ## Model Details
 
 
 
 
 
13
 
14
- ### Model Description
15
 
16
- <!-- Provide a longer summary of what this model is. -->
 
 
 
17
 
18
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
 
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
 
28
- ### Model Sources [optional]
29
 
30
- <!-- Provide the basic links for the model. -->
31
 
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
 
36
- ## Uses
37
 
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
 
40
- ### Direct Use
41
 
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
 
43
 
44
- [More Information Needed]
45
 
46
- ### Downstream Use [optional]
 
47
 
48
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
 
 
49
 
50
- [More Information Needed]
51
 
52
- ### Out-of-Scope Use
53
 
54
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
 
 
 
 
 
 
 
55
 
56
- [More Information Needed]
 
 
57
 
58
- ## Bias, Risks, and Limitations
59
 
60
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
 
 
 
 
61
 
62
- [More Information Needed]
63
 
64
- ### Recommendations
 
65
 
66
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
 
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
69
 
70
- ## How to Get Started with the Model
 
 
 
 
71
 
72
- Use the code below to get started with the model.
73
 
74
- [More Information Needed]
75
 
76
- ## Training Details
 
77
 
78
- ### Training Data
79
 
80
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
 
 
 
 
 
81
 
82
- [More Information Needed]
 
83
 
84
- ### Training Procedure
85
 
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
 
 
 
 
 
 
 
 
 
 
87
 
88
- #### Preprocessing [optional]
89
 
90
- [More Information Needed]
 
 
 
 
 
 
 
 
 
 
91
 
92
 
93
- #### Training Hyperparameters
 
 
 
 
 
 
 
 
 
 
 
94
 
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 
 
 
 
96
 
97
- #### Speeds, Sizes, Times [optional]
98
 
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
 
101
- [More Information Needed]
102
 
103
- ## Evaluation
 
104
 
105
- <!-- This section describes the evaluation protocols and provides the results. -->
106
 
107
- ### Testing Data, Factors & Metrics
108
-
109
- #### Testing Data
110
-
111
- <!-- This should link to a Dataset Card if possible. -->
112
-
113
- [More Information Needed]
114
-
115
- #### Factors
116
-
117
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
-
119
- [More Information Needed]
120
-
121
- #### Metrics
122
-
123
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
-
125
- [More Information Needed]
126
-
127
- ### Results
128
-
129
- [More Information Needed]
130
-
131
- #### Summary
132
-
133
-
134
-
135
- ## Model Examination [optional]
136
-
137
- <!-- Relevant interpretability work for the model goes here -->
138
-
139
- [More Information Needed]
140
-
141
- ## Environmental Impact
142
-
143
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
-
145
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
-
147
- - **Hardware Type:** [More Information Needed]
148
- - **Hours used:** [More Information Needed]
149
- - **Cloud Provider:** [More Information Needed]
150
- - **Compute Region:** [More Information Needed]
151
- - **Carbon Emitted:** [More Information Needed]
152
-
153
- ## Technical Specifications [optional]
154
-
155
- ### Model Architecture and Objective
156
-
157
- [More Information Needed]
158
-
159
- ### Compute Infrastructure
160
-
161
- [More Information Needed]
162
-
163
- #### Hardware
164
-
165
- [More Information Needed]
166
-
167
- #### Software
168
-
169
- [More Information Needed]
170
-
171
- ## Citation [optional]
172
-
173
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
-
175
- **BibTeX:**
176
-
177
- [More Information Needed]
178
-
179
- **APA:**
180
-
181
- [More Information Needed]
182
-
183
- ## Glossary [optional]
184
-
185
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
-
187
- [More Information Needed]
188
-
189
- ## More Information [optional]
190
-
191
- [More Information Needed]
192
-
193
- ## Model Card Authors [optional]
194
-
195
- [More Information Needed]
196
-
197
- ## Model Card Contact
198
-
199
- [More Information Needed]
 
1
  ---
2
  library_name: transformers
3
+ tags:
4
+ - functioncalling
5
+ license: apache-2.0
6
+ language:
7
+ - it
8
+ pipeline_tag: text2text-generation
9
  ---
10
+ <img src="https://hoodie-creator.s3.eu-west-1.amazonaws.com/2c331689-original.png" alt="gorilla-llm" border="0" width="400px">
11
 
12
+ ## Introduction
13
+ Zefiro functioncalling extends the Large Language Model (LLM) chat completion feature to formulate
14
+ executable API calls given Italian natural-language instructions and API context. With OpenFunctions v2,
15
 
16
+ we now support:
17
+ 1. Relevance detection - when chatting, the model chats; when asked for a function, it returns a function
18
+ 2. REST - native REST support
19
 
20
 
21
+ ## Model description
22
 
23
+ - **Model type:** A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
24
+ - **Language(s) (NLP):** Primarily Italian
25
+ - **License:** Apache 2.0
26
+ - **Finetuned from model:** [gorilla-llm](https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2)
27
+ - **Developed by:** [zefiro.ai](https://zefiro.ai)
28
+ - **Sponsored by:** [Seeweb](https://seeweb.it)
29
 
 
30
 
31
+ ## Models Available
32
+ |Model | Functionality|
33
+ |---|---|
34
+ |zefiro-functioncalling-v0.3-alpha | Given a function and a user intent, returns properly formatted JSON with the right arguments|
35
 
36
+ All of our models are hosted in our Hugging Face mii-community org: [zefiro-functioncalling-v0.3-alpha](https://huggingface.co/mii-community/zefiro-functioncalling-v0.3-alpha).
37
 
38
+ ## Training
 
 
 
 
 
 
39
 
40
+ Zefiro functioncalling alpha is a 7B parameter model. It is a fine-tuned version of [gorilla-llm](https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2), which is built on top of the [deepseek coder](https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5) LLM.
41
 
 
42
 
 
 
 
43
 
44
+ ## Example Usage (Local)
45
 
 
46
 
47
+ 1. Install the dependencies (OpenFunctions is compatible with OpenAI Functions)
48
 
49
+ ```bash
50
+ pip install openai==0.28.1 transformers
51
+ ```
52
 
53
+ 2. Load the model
54
 
55
+ ```python
56
+ from transformers import AutoModelForCausalLM, AutoTokenizer
57
 
58
+ model_id = "mii-community/zefiro-functioncalling-v0.3-alpha"
59
+ model = AutoModelForCausalLM.from_pretrained(model_id)
60
+ model.to('cuda')
61
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
62
 
63
+ ```
64
 
65
+ 3. Prepare your data: a system prompt and an array of OpenAPI-compatible JSON function definitions. Only the values of the `description` keys should be in Italian; everything else in the JSON stays in English.
66
 
67
+ ```python
+ import json
68
+ json_arr = [{"name": "order_dinner", "description": "Ordina una cena al ristorante", "parameters": {"type": "object", "properties": {"restaurant_name": {"type": "string", "description": "il nome del ristorante", "enum" : ['Bufalo Bill','Pazzas']}}, "required": ["restaurant_name"]}},
69
+ {"name": "get_weather", "description": "Ottieni le previsioni del tempo meteorologica", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "Il nome del luogo "}}, "required": ["location"]}},
70
+ {"name": "create_product", "description": "Crea un prodotto da vendere", "parameters": {"type": "object", "properties": {"product_name": {"type": "string", "description": "Il nome del prodotto "}, "size": {"type": "string", "description": "la taglia del prodotto"}, "price": {"type": "integer", "description": "Il prezzo del prodotto "}}, "required": ["product_name", "size", "price"]}},
71
+ {"name": "get_news", "description": "Dammi le ultime notizie", "parameters": {"type": "object", "properties": {"argument": {"type": "string", "description": "L'argomento su cui fare la ricerca"}}, "required": ["argument"]}},
72
+ ]
73
+ json_string = ' '.join([json.dumps(json_obj) for json_obj in json_arr])
74
+ system_prompt = 'Tu sei un assistenze utile che ha accesso alle seguenti funzioni. Usa le funzioni solo se necessario - \n ' + json_string + ' \n '
75
+ print(system_prompt)
76
 
77
+ test_message = [{'role' : 'system' , 'content' : system_prompt},
78
+ {'role' : 'user' ,'content' : 'Crea un prodotto di nome AIR size L price 100'}]
79
+ ```
80
 
81
+ 4. Call the model
82
 
83
+ ```python
84
+ def generate_text():
+     prompt = tokenizer.apply_chat_template(test_message, tokenize=False)
+     model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
+     generated_ids = model.generate(**model_inputs, max_new_tokens=1024)
+     return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
89
 
 
90
 
91
+ text_response = generate_text()
92
+ ```
93
 
94
+ 5. Parse the response
95
 
96
+ ```python
97
+ FN_CALL_DELIMITER = "<<functioncall>>"
98
 
99
+ def strip_function_calls(content: str) -> list[str]:
+     """
+     Split the content by the function call delimiter and remove empty strings
+     """
+     return [element.replace('\n', '') for element in content.split(FN_CALL_DELIMITER)[1:] if element]
104
 
 
105
 
106
+ functions_string = strip_function_calls(text_response)
107
 
108
+ # Output: [' {"name": "create_product", "arguments": \'{"product_name": "AIR", "size": "L", "price": 100}\'}']
109
+ ```
110
 
111
+ 6. Create an object representation of the string
112
 
113
+ ```python
114
+ # If a function call was found, clean it up and parse it as JSON.
+ # Note: stripping every single quote also removes apostrophes inside values.
+ # Multiple function calls are not supported yet.
+ if functions_string:
+     obj_to_call = json.loads(functions_string[0].replace('\'', ''))
+ else:
+     print('nothing to do or return a normal chat response')
120
 
121
+ # Output: {'name': 'create_product', 'arguments': {'product_name': 'AIR', 'size': 'L', 'price': 100}}
122
+ ```
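
The quote-stripping in steps 5 and 6 would also remove apostrophes inside argument values, which are common in Italian (for example `l'economia`). Below is a minimal sketch of a stricter parser. It assumes the output format shown in step 5, where `arguments` is a JSON object wrapped in single quotes. `CALL_PATTERN` and `parse_function_call` are hypothetical names introduced for illustration, not part of the model's tooling.

```python
import json
import re

FN_CALL_DELIMITER = "<<functioncall>>"

# Assumed output shape: {"name": "<fn>", "arguments": '<json object>'}
CALL_PATTERN = re.compile(
    r"""\{\s*"name"\s*:\s*"([^"]+)"\s*,\s*"arguments"\s*:\s*'(.*)'\s*\}""",
    re.DOTALL,
)


def parse_function_call(text):
    """Return {'name': ..., 'arguments': {...}}, or None if no valid call is found."""
    _, found, tail = text.partition(FN_CALL_DELIMITER)
    if not found:
        return None  # plain chat response, no function call
    match = CALL_PATTERN.fullmatch(tail.strip())
    if match is None:
        return None
    try:
        # Only the wrapping quotes are dropped; apostrophes in values survive.
        arguments = json.loads(match.group(2))
    except json.JSONDecodeError:
        return None
    return {"name": match.group(1), "arguments": arguments}
```

Returning `None` for both "no call" and "malformed call" keeps the caller simple; an application that needs to tell the two apart could raise an exception in the malformed case instead.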
123
 
 
124
 
125
+ 7. Prepare data to be OpenAI compatible
126
+
127
+ ```python
128
+ def obj_to_func(obj):
+     # Render the parsed call as a call string, e.g. name(key="value",key2="value2")
+     params = [f'{key}="{value}"' for key, value in obj['arguments'].items()]
+     func_str = f'{obj["name"]}({",".join(params)})'
+     print(func_str)
+     return func_str
137
 
138
+ func_str = obj_to_func(obj_to_call)
139
 
140
+ openai_response = {
+     "index": 0,
+     "message": {
+         "role": "assistant",
+         "content": func_str,
+         "function_call": [
+             obj_to_call
+         ]
+     },
+     "finish_reason": "stop"
+ }
151
 
152
 
153
+ '''
154
+ Output OpenAI compatible Dictionary
155
+ {'index': 0,
156
+ 'message': {
157
+ 'role': 'assistant',
158
+ 'content': 'create_product(product_name="AIR",size="L",price="100")',
159
+ 'function_call': [{'name': 'create_product', 'arguments': {'product_name': 'AIR', 'size': 'L', 'price': 100}}]
160
+ },
161
+ 'finish_reason': 'stop'
162
+ }
163
+ '''
164
+ ```
165
 
166
+ The resulting dictionary is shaped to be OpenAI compatible.
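
Once the call has been parsed into a dictionary, an application still has to route it to real code. Below is a minimal sketch, assuming a hand-written registry that maps function names to Python callables. `create_product`, `REGISTRY`, and `dispatch` are hypothetical and only illustrate the idea.

```python
def create_product(product_name: str, size: str, price: int) -> str:
    """Hypothetical application function; a real one would call a database or API."""
    return f"Created {product_name} (size {size}) at price {price}"


# Only functions listed here can be triggered by model output.
REGISTRY = {"create_product": create_product}


def dispatch(call: dict, registry: dict):
    """Look up call['name'] in the registry and invoke it with call['arguments']."""
    func = registry.get(call["name"])
    if func is None:
        raise KeyError(f"Unknown function: {call['name']}")
    return func(**call["arguments"])


obj_to_call = {
    "name": "create_product",
    "arguments": {"product_name": "AIR", "size": "L", "price": 100},
}
result = dispatch(obj_to_call, REGISTRY)
```

An explicit registry, rather than `eval` on the rendered call string, means the model can only trigger functions the application has deliberately exposed.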
167
+
168
+ ## Limitations
169
+ The model has some bugs and unexpected behaviours. For example, the more JSON function definitions you pass, the less accurately it fills in the JSON output, but
170
+ the interesting thing is that these are patterns that I did not consider in the training data. Improving the coverage of these cases in the data should be enough to fix the bugs.
171
+ Stay tuned for a better version soon.
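
Until the model itself improves, one way to catch badly filled output is to check the returned arguments against the function definition before acting on them. Below is a minimal sketch, assuming definitions shaped like those in `json_arr` in the usage example. `JSON_TYPES` and `validate_arguments` are hypothetical names, and the check covers only required keys, basic scalar types, and `enum` values, not full JSON Schema validation.

```python
# Map JSON Schema type names to Python types (basic scalar types only).
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_arguments(definition: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    schema = definition["parameters"]
    properties = schema.get("properties", {})
    problems = []
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required argument: {key}")
    for key, value in arguments.items():
        if key not in properties:
            problems.append(f"unexpected argument: {key}")
            continue
        spec = properties[key]
        expected = JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected is not None and (
            not isinstance(value, expected)
            or (isinstance(value, bool) and spec["type"] != "boolean")
        ):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {key}: {value!r}")
    return problems
```

If the returned list is not empty, the application could fall back to a normal chat reply or ask the user for the missing details instead of executing the call.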
172
 
 
173
 
174
+ ## License
175
 
176
+ Zefiro-functioncalling is distributed under the Apache 2.0 license, the same license as its base model, Gorilla OpenFunctions v2. This software incorporates elements from the Deepseek model. Consequently, the licensing of Gorilla OpenFunctions v2 adheres to the Apache 2.0 license, with additional terms as outlined in [Appendix A](https://github.com/deepseek-ai/DeepSeek-LLM/blob/6712a86bfb7dd25c73383c5ad2eb7a8db540258b/LICENSE-MODEL) of the Deepseek license.
177
 
178
+ ## Contributing
179
+ Please email us your comments, criticism, and questions. More information about the project can be found at [https://zefiro.ai](https://zefiro.ai)
180
 
 
181
 
182
+ ## Citation
183
+ This work is based on Gorilla, an open-source effort from UC Berkeley. We welcome contributors.