Commit 092cfc3 by Guanyu419 (verified) · 1 Parent(s): 0fbf8dc

Update README.md

Files changed (1)
  1. README.md +12 -430
README.md CHANGED
@@ -14,443 +14,25 @@ Hammer-7b is a finetuned model built upon [Qwen2-7B-Instruct](https://huggingfac
14
 
15
  ## Evaluation
16
  First, we evaluate our model on the Berkeley Function-Calling Leaderboard (BFCL), and the performance is as follows:
17
- <style type="text/css">
18
- .tg {border-collapse:collapse;border-spacing:0;}
19
- .tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
20
- overflow:hidden;padding:10px 5px;word-break:normal;}
21
- .tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
22
- font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
23
- .tg .tg-9id2{color:#007BFF;text-align:center;vertical-align:middle}
24
- .tg .tg-pchv{color:#212529;font-weight:bold;text-align:center;vertical-align:middle}
25
- .tg .tg-qai4{color:#212529;text-align:center;vertical-align:middle}
26
- .tg .tg-p59o{color:#00E;text-align:center;text-decoration:underline;vertical-align:top}
27
- </style>
28
- <table class="tg"><thead>
29
- <tr>
30
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Rank</span></th>
31
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Overall</span> <span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Acc</span></th>
32
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Model</span></th>
33
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">AST</span> <span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Summary</span></th>
34
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Exec</span> <span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Summary</span></th>
35
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Irrelevance</span></th>
36
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Relevance</span></th>
37
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">Organization</span></th>
38
- <th class="tg-pchv"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#212529">License</span></th>
39
- </tr>
40
- </thead>
41
- <tbody>
42
- <tr>
43
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">1</span></td>
44
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.79</span></td>
45
- <td class="tg-p59o"><a href="https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo">GPT-4-0125-Preview (Prompt)</a></td>
46
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.5</span></td>
47
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">89.25</span></td>
48
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">61.35</span></td>
49
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">97.56</span></td>
50
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
51
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
52
- </tr>
53
- <tr>
54
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">2</span></td>
55
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85</span></td>
56
- <td class="tg-p59o"><a href="https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo">GPT-4-1106-Preview (Prompt)</a></td>
57
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">86.31</span></td>
58
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">87.38</span></td>
59
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">64.98</span></td>
60
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">90.24</span></td>
61
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
62
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
63
- </tr>
64
- <tr>
65
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">3</span></td>
66
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">84.74</span></td>
67
- <td class="tg-p59o"><a href="https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo">GPT-4-0613 (Prompt)</a></td>
68
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">84.66</span></td>
69
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">87.57</span></td>
70
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">75.57</span></td>
71
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">82.93</span></td>
72
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
73
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
74
- </tr>
75
- <tr>
76
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">4</span></td>
77
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.92</span></td>
78
- <td class="tg-9id2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#007BFF">Hammer-7b</span></td>
79
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">78.7</span></td>
80
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">89.71</span></td>
81
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">72.87</span></td>
82
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">92.68</span></td>
83
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">MadeAgents</span></td>
84
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">cc-by-nc-4.0</span></td>
85
- </tr>
86
- <tr>
87
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">5</span></td>
88
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.89</span></td>
89
- <td class="tg-p59o"><a href="https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo">GPT-4-turbo-2024-04-09 (Prompt)</a></td>
90
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.41</span></td>
91
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">88.12</span></td>
92
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">61.82</span></td>
93
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">82.93</span></td>
94
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
95
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
96
- </tr>
97
- <tr>
98
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">6</span></td>
99
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.35</span></td>
100
- <td class="tg-p59o"><a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/">GPT-4o-mini-2024-07-18 (Prompt)</a></td>
101
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.51</span></td>
102
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">87.95</span></td>
103
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">79.2</span></td>
104
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.49</span></td>
105
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
106
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
107
- </tr>
108
- <tr>
109
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">7</span></td>
110
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.13</span></td>
111
- <td class="tg-p59o"><a href="https://openai.com/index/hello-gpt-4o/">GPT-4o-2024-05-13 (Prompt)</a></td>
112
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.83</span></td>
113
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.12</span></td>
114
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">77.44</span></td>
115
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">78.05</span></td>
116
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
117
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
118
- </tr>
119
- <tr>
120
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">8</span></td>
121
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">82.55</span></td>
122
- <td class="tg-p59o"><a href="https://huggingface.co/meetkai/functionary-medium-v3.1">Functionary-Medium-v3.1 (FC)</a></td>
123
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">81.06</span></td>
124
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">89.32</span></td>
125
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">73.23</span></td>
126
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">70.73</span></td>
127
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">MeetKai</span></td>
128
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">MIT</span></td>
129
- </tr>
130
- <tr>
131
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">9</span></td>
132
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">81.78</span></td>
133
- <td class="tg-p59o"><a href="https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo">GPT-4-1106-Preview (FC)</a></td>
134
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">77.95</span></td>
135
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">87.61</span></td>
136
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">72.7</span></td>
137
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">82.93</span></td>
138
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
139
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
140
- </tr>
141
- <tr>
142
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">10</span></td>
143
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">81.59</span></td>
144
- <td class="tg-p59o"><a href="https://llama.meta.com/llama3">Meta-Llama-3-70B-Instruct (Prompt)</a></td>
145
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.15</span></td>
146
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">88.04</span></td>
147
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">50.47</span></td>
148
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">92.68</span></td>
149
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Meta</span></td>
150
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Meta</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Llama</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">3</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Community</span></td>
151
- </tr>
152
- <tr>
153
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">11</span></td>
154
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.88</span></td>
155
- <td class="tg-p59o"><a href="https://www.anthropic.com/news/claude-3-family">Claude-3-Opus-20240229 (Prompt)</a></td>
156
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">79.42</span></td>
157
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">87.39</span></td>
158
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">56.15</span></td>
159
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.37</span></td>
160
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Anthropic</span></td>
161
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
162
- </tr>
163
- <tr>
164
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">12</span></td>
165
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.87</span></td>
166
- <td class="tg-p59o"><a href="https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo">GPT-4-0125-Preview (FC)</a></td>
167
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">77.02</span></td>
168
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.3</span></td>
169
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">74.03</span></td>
170
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.37</span></td>
171
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">OpenAI</span></td>
172
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
173
- </tr>
174
- <tr>
175
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">13</span></td>
176
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.23</span></td>
177
- <td class="tg-p59o"><a href="https://huggingface.co/nvidia/nemotron-4-340b-instruct">Nemotron-4-340b-instruct (Prompt)</a></td>
178
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">76.67</span></td>
179
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.38</span></td>
180
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">84.1</span></td>
181
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">78.05</span></td>
182
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">NVIDIA</span></td>
183
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">nvidia-open-model-license</span></td>
184
- </tr>
185
- <tr>
186
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">14</span></td>
187
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">80.21</span></td>
188
- <td class="tg-p59o"><a href="https://huggingface.co/meetkai/functionary-small-v3.1">Functionary-Small-v3.1 (FC)</a></td>
189
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">78.64</span></td>
190
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">83.45</span></td>
191
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">68.36</span></td>
192
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.37</span></td>
193
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">MeetKai</span></td>
194
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">MIT</span></td>
195
- </tr>
196
- <tr>
197
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">15</span></td>
198
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">79.66</span></td>
199
- <td class="tg-p59o"><a href="https://mistral.ai/news/mistral-large-2407/">mistral-large-2407 (FC Any)</a></td>
200
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">85.61</span></td>
201
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">88.45</span></td>
202
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">0.34</span></td>
203
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">100</span></td>
204
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Mistral</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">AI</span></td>
205
- <td class="tg-qai4"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#212529">Proprietary</span></td>
206
- </tr>
207
- </tbody></table>
208
 
209
  *Note: The rankings are based on the performance metrics provided.*
210
 
211
  In addition, we evaluated our model on several other benchmarks in a zero-shot setting. Hammer-7b demonstrated superior performance compared to the other models listed. The table below replicates and extends the format of Table 6 (Function Calling Academic Benchmarks) in ["Granite-Function Calling Model"](https://arxiv.org/abs/2407.00121).
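
The "F1 Average" columns of that table appear to be the unweighted arithmetic mean of the four per-benchmark F1 scores (API-Bank L-1, API-Bank L-2, Tool-Alpaca, Nexus Raven). This is an inference from the reported numbers, not something the README states. A minimal sketch that checks it against the Functionary-small-v2.4 row (the helper name `f1_average` is hypothetical):

```python
from statistics import mean

# Per-benchmark F1 scores (percent) copied from the Functionary-small-v2.4 row.
# Order: API-Bank L-1, API-Bank L-2, Tool-Alpaca, Nexus Raven.
func_name_f1 = [78.0, 54.0, 88.0, 82.0]
args_f1 = [70.0, 45.0, 47.0, 64.0]


def f1_average(scores):
    """Unweighted mean of per-benchmark F1 scores (assumed aggregation rule)."""
    return mean(scores)


avg_func_name = f1_average(func_name_f1)
avg_args = f1_average(args_f1)

print(f"F1 Average (Func-Name): {avg_func_name:.2f}%")  # table reports 75.50%
print(f"F1 Average (Args): {avg_args:.2f}%")            # table reports 56.50%
```

The other rows match the same rule to within rounding of the last reported digit.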
212
 
213
- <style type="text/css">
214
- .tg {border-collapse:collapse;border-spacing:0;}
215
- .tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
216
- overflow:hidden;padding:12px 5px;word-break:normal;}
217
- .tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
218
- font-weight:normal;overflow:hidden;padding:12px 5px;word-break:normal;}
219
- .tg .tg-baqh{text-align:center;vertical-align:top}
220
- .tg .tg-7geq{background-color:#ffffc7;text-align:center;vertical-align:top}
221
- .tg .tg-k5c1{background-color:#ffffc7;font-weight:bold;text-align:center;vertical-align:top}
222
- .tg .tg-nrix{text-align:center;vertical-align:middle}
223
- .tg .tg-amwm{font-weight:bold;text-align:center;vertical-align:top}
224
- </style>
225
- <table class="tg"><thead>
226
- <tr>
227
- <th class="tg-nrix" rowspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Model</span></th>
228
- <th class="tg-nrix" rowspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Size</span></th>
229
- <th class="tg-baqh" colspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">API-Bank</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">L-1</span></th>
230
- <th class="tg-baqh" colspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">API-Bank</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">L-2</span></th>
231
- <th class="tg-baqh" colspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Tool-Alpaca</span></th>
232
- <th class="tg-baqh" colspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Nexus</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Raven</span></th>
233
- <th class="tg-baqh" colspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Average</span></th>
234
- </tr>
235
- <tr>
236
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Func-Name</span></th>
237
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Args</span></th>
238
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Func-Name</span></th>
239
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Args</span></th>
240
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Func-Name</span></th>
241
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Args</span></th>
242
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Func-Name</span></th>
243
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Args</span></th>
244
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Func-Name</span></th>
245
- <th class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Args</span></th>
246
- </tr>
247
- </thead>
248
- <tbody>
249
- <tr>
250
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Functionary-small-v2.4</span></td>
251
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
252
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">78.00%</span></td>
253
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">70.00%</span></td>
254
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">54.00%</span></td>
255
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">45.00%</span></td>
256
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">88.00%</span></td>
257
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">47.00%</span></td>
258
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">82.00%</span></td>
259
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">64.00%</span></td>
260
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">75.50%</span></td>
261
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">56.50%</span></td>
262
- </tr>
263
- <tr>
264
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Gorilla-openfunctions-v2</span></td>
265
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
266
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">43.00%</span></td>
267
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">41.00%</span></td>
268
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">12.00%</span></td>
269
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">12.00%</span></td>
270
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">69.00%</span></td>
271
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">39.00%</span></td>
272
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">81.00%</span></td>
273
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">65.00%</span></td>
274
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">51.20%</span></td>
275
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">39.30%</span></td>
276
- </tr>
277
- <tr>
278
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Hermes-2-Pro-Mistral</span></td>
279
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
280
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">93.00%</span></td>
281
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">77.00%</span></td>
282
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">54.00%</span></td>
283
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">25.00%</span></td>
284
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">80.00%</span></td>
285
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">26.00%</span></td>
286
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">90.00%</span></td>
287
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">63.00%</span></td>
288
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">79.30%</span></td>
289
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">47.80%</span></td>
290
- </tr>
291
- <tr>
292
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Mistral-Instruct-v0.3</span></td>
293
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
294
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">79.00%</span></td>
295
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">69.00%</span></td>
296
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">69.00%</span></td>
297
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">46.00%</span></td>
298
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">33.00%</span></td>
299
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">33.00%</span></td>
300
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">71.00%</span></td>
301
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">54.00%</span></td>
302
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">63.00%</span></td>
303
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">50.50%</span></td>
304
- </tr>
305
- <tr>
306
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">CodeGemma-Instruct</span></td>
307
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
308
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">77.00%</span></td>
309
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">57.00%</span></td>
310
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">59.00%</span></td>
311
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">38.00%</span></td>
312
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">59.00%</span></td>
313
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">31.00%</span></td>
314
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">84.00%</span></td>
315
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">68.00%</span></td>
316
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">69.80%</span></td>
317
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">48.50%</span></td>
318
- </tr>
319
- <tr>
320
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Nexusflow-Raven-v2</span></td>
321
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">13B</span></td>
322
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">51.00%</span></td>
323
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">42.00%</span></td>
324
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">28.00%</span></td>
325
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">22.00%</span></td>
326
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">85.00%</span></td>
327
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">37.00%</span></td>
328
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">92.00%</span></td>
329
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">75.00%</span></td>
330
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">64.00%</span></td>
331
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">44.00%</span></td>
332
- </tr>
333
- <tr>
334
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">C4AI-Command-R-v01</span></td>
335
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">35B</span></td>
336
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">93.00%</span></td>
337
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">76.00%</span></td>
338
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">77.00%</span></td>
339
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">54.00%</span></td>
340
- <td class="tg-amwm"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">90.00%</span></td>
341
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">42.00%</span></td>
342
- <td class="tg-amwm"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">93.00%</span></td>
343
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">71.00%</span></td>
344
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">88.30%</span></td>
345
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">60.80%</span></td>
346
- </tr>
347
- <tr>
348
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Meta-Llama-3-70B-Instruct</span></td>
349
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">70B</span></td>
350
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">85.00%</span></td>
351
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">67.00%</span></td>
352
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">69.00%</span></td>
353
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">52.00%</span></td>
354
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">78.00%</span></td>
355
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">43.00%</span></td>
356
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">70.00%</span></td>
357
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">52.00%</span></td>
358
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">75.50%</span></td>
359
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">53.50%</span></td>
360
- </tr>
361
- <tr>
362
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">GRANITE-20B-FUNCTIONCALLING</span></td>
363
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">20B</span></td>
364
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">91.00%</span></td>
365
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">71.00%</span></td>
366
- <td class="tg-amwm"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">83.00%</span></td>
367
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">60.00%</span></td>
368
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">89.00%</span></td>
369
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">44.00%</span></td>
370
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">92.00%</span></td>
371
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">72.00%</span></td>
372
- <td class="tg-amwm"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">88.80%</span></td>
373
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">61.80%</span></td>
374
- </tr>
375
- <tr>
376
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">xlam-7b-fc-r</span></td>
377
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
378
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">90.00%</span></td>
379
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">80.70%</span></td>
380
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">68.90%</span></td>
381
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">60.70%</span></td>
382
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">67.30%</span></td>
383
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">59.00%</span></td>
384
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">54.10%</span></td>
385
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">57.50%</span></td>
386
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">70.10%</span></td>
387
- <td class="tg-baqh"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">64.50%</span></td>
388
- </tr>
389
- <tr>
390
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Hammer-7b</span></td>
391
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
392
- <td class="tg-k5c1"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">93.80%</span></td>
393
- <td class="tg-k5c1"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">85.90%</span></td>
394
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">79.20%</span></td>
395
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">64.40%</span></td>
396
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">82.30%</span></td>
397
- <td class="tg-k5c1"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">59.90%</span></td>
398
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">92.50%</span></td>
399
- <td class="tg-k5c1"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">77.40%</span></td>
400
- <td class="tg-7geq"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">86.90%</span></td>
401
- <td class="tg-k5c1"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">71.90%</span></td>
402
- </tr>
403
- </tbody></table>
404
 
405
  Finally, we evaluate our model on the [Seal-Tools](https://arxiv.org/abs/2405.08355) dataset, where it also achieves better performance.
406
- <style type="text/css">
407
- .tg {border-collapse:collapse;border-spacing:0;}
408
- .tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
409
- overflow:hidden;padding:12px 5px;word-break:normal;}
410
- .tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
411
- font-weight:normal;overflow:hidden;padding:12px 5px;word-break:normal;}
412
- .tg .tg-9wq8{border-color:inherit;text-align:center;vertical-align:middle}
413
- .tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
414
- .tg .tg-7btt{border-color:inherit;font-weight:bold;text-align:center;vertical-align:top}
415
- .tg .tg-mfhl{background-color:#ffffc7;border-color:inherit;text-align:center;vertical-align:top}
416
- .tg .tg-py60{background-color:#ffffc7;border-color:inherit;font-weight:bold;text-align:center;vertical-align:top}
417
- </style>
418
- <table class="tg"><thead>
419
- <tr>
420
- <th class="tg-9wq8" rowspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Model</span></th>
421
- <th class="tg-9wq8" rowspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Size</span></th>
422
- <th class="tg-c3ow" colspan="2"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">SealTool(Single-Tool)</span></th>
423
- </tr>
424
- <tr>
425
- <th class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Func-Name</span></th>
426
- <th class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">F1</span> <span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Args</span></th>
427
- </tr></thead>
428
- <tbody>
429
- <tr>
430
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Gorilla-openfunctions-v2</span></td>
431
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
432
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">93.20%</span></td>
433
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">91.10%</span></td>
434
- </tr>
435
- <tr>
436
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">GRANITE-20B-FUNCTIONCALLING</span></td>
437
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">20B</span></td>
438
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">94.90%</span></td>
439
- <td class="tg-7btt"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">92.70%</span></td>
440
- </tr>
441
- <tr>
442
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">xlam-7b-fc-r</span></td>
443
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
444
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">79.00%</span></td>
445
- <td class="tg-c3ow"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">76.90%</span></td>
446
- </tr>
447
- <tr>
448
- <td class="tg-mfhl"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Hammer-7b</span></td>
449
- <td class="tg-mfhl"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">7B</span></td>
450
- <td class="tg-py60"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">97.40%</span></td>
451
- <td class="tg-mfhl"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">91.70%</span></td>
452
- </tr>
453
- </tbody></table>
454
 
455
  ## Requirements
456
  The code for Hammer-7b is included in the latest Hugging Face transformers, and we advise you to install `transformers>=4.37.0`.
 
14
 
15
  ## Evaluation
16
  First, we evaluate our model on the Berkeley Function-Calling Leaderboard (BFCL), and the performance is as follows:
17
+
18
+ <div style="text-align: center;">
19
+ <img src="figures/bfcl.PNG" alt="overview" width="1480" style="margin: auto;">
20
+ </div>
21
 
22
  *Note: The rankings are based on the performance metrics provided.*
23
 
24
  We also evaluated our model on other benchmarks. The results below come from zero-shot evaluations, in which Hammer-7b demonstrated superior performance compared to other models. The table below replicates and extends the format of Table 6 (Function Calling Academic Benchmarks) in ["Granite-Function Calling Model"](https://arxiv.org/abs/2407.00121).
25
 
26
+ <div style="text-align: center;">
27
+ <img src="figures/other.PNG" alt="overview" width="880" style="margin: auto;">
28
+ </div>
29
 
30
  Finally, we evaluate our model on the [Seal-Tools](https://arxiv.org/abs/2405.08355) dataset, where it also achieves better performance.
31
+
32
+ <div style="text-align: center;">
33
+ <img src="figures/sealtool.PNG" alt="overview" width="480" style="margin: auto;">
34
+ </div>
35
+
36
 
37
  ## Requirements
38
  The code for Hammer-7b is included in the latest Hugging Face transformers, and we advise you to install `transformers>=4.37.0`.
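
  As a quick sanity check before loading the model, the version requirement above can be verified from Python. This is a minimal sketch, not part of the Hammer-7b release: the helper names `release_tuple`, `meets_minimum`, and `check_transformers` are hypothetical, and the comparison looks only at the numeric `major.minor.patch` part of the version string, ignoring pre-release markers such as `rc1` or `.dev0`.

  ```python
  import re
  from importlib import metadata

  # Minimum version stated in the Requirements section.
  MIN_TRANSFORMERS = "4.37.0"


  def release_tuple(version: str, width: int = 3) -> tuple:
      """Return the leading numeric components of a version string.

      Hypothetical helper: "4.38.0.dev0" -> (4, 38, 0), "4.37" -> (4, 37, 0).
      Pre-release suffixes are ignored, so this is only an approximation
      of full version ordering.
      """
      numbers = []
      for piece in version.split(".")[:width]:
          match = re.match(r"\d+", piece)
          if match is None:
              break
          numbers.append(int(match.group()))
      # Pad with zeros so that "4.37" compares equal to "4.37.0".
      return tuple(numbers + [0] * (width - len(numbers)))


  def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
      """Check whether an installed version is at least the minimum."""
      return release_tuple(installed) >= release_tuple(minimum)


  def check_transformers() -> str:
      """Report whether the installed transformers package is recent enough."""
      try:
          installed = metadata.version("transformers")
      except metadata.PackageNotFoundError:
          return "transformers is not installed"
      if meets_minimum(installed):
          return f"transformers {installed} satisfies >={MIN_TRANSFORMERS}"
      return f"transformers {installed} is older than {MIN_TRANSFORMERS}"


  if __name__ == "__main__":
      print(check_transformers())
  ```

  If the check reports an older version, upgrading with `pip install -U "transformers>=4.37.0"` should satisfy the requirement.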