minpeter committed
Commit efbb0e0 · verified · 1 Parent(s): 1635bf7

Update README.md

Files changed (1)
  1. README.md +8 -9
README.md CHANGED
@@ -16,14 +16,6 @@ tags:
 ---
 
 
- The only difference from Llama-3.2-1B-chatml-tool-v1 is that it uses AlternateTokenizer, which does not define tool-related tokens (<tools>, <tool_call>, <tool_response>).
-
- In the case of the existing tool-AlternateTokenizer, the <tool_call> tag was not properly generated before the function call, but in v2, it was observed that it performed well when trained with the general AlternateTokenizer.
-
- need to check whether this phenomenon is repeated in larger models (3B, 8B).
-
-
-
 ## Model Performance Comparison (BFCL)
 
 | task name | minpeter/Llama-3.2-1B-chatml-tool-v2 | meta-llama/Llama-3.2-1B-Instruct (measure) | meta-llama/Llama-3.2-1B-Instruct (Reported) |
@@ -33,5 +25,12 @@ need to check whether this phenomenon is repeated in larger models (3B, 8B).
 | simple | **0.72** | 0.215 | 0.2925 |
 | multiple | **0.695** | 0.17 | 0.335 |
 
-
 *Parallel calls are not taken into account. 0 points are expected. We plan to fix this in v3.
+
+ ### Note
+
+ The only difference from Llama-3.2-1B-chatml-tool-v1 is that it uses AlternateTokenizer, which does not define tool-related tokens (`<tools>`, `<tool_call>`, `<tool_response>`).
+
+ In the case of the existing tool-AlternateTokenizer, the `<tool_call>` tag was not properly generated before the function call, but in v2, it was observed that it performed well when trained with the general AlternateTokenizer.
+
+ We need to check whether this phenomenon is repeated in larger models (3B, 8B).
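
To make the tokenizer difference described in the note concrete, here is a minimal sketch that checks whether the three tags are registered as dedicated special tokens or get split into ordinary sub-word pieces. The v1 repo id `minpeter/Llama-3.2-1B-chatml-tool-v1` is an assumption inferred from the model name above; the rest uses only the standard `transformers` tokenizer API.

```python
from transformers import AutoTokenizer

# Tags mentioned in the note above.
TAGS = ["<tools>", "<tool_call>", "<tool_response>"]

# Repo ids: v2 is this model; the v1 id is an assumption based on its name.
REPOS = [
    "minpeter/Llama-3.2-1B-chatml-tool-v1",
    "minpeter/Llama-3.2-1B-chatml-tool-v2",
]

for repo in REPOS:
    tok = AutoTokenizer.from_pretrained(repo)
    for tag in TAGS:
        pieces = tok.tokenize(tag)
        # A tag registered as a special token comes back as a single piece;
        # a plain tokenizer breaks it into several sub-word pieces.
        print(f"{repo}: {tag!r} -> {pieces} ({len(pieces)} piece(s))")
```

If v1 returns each tag as a single piece while v2 splits it, that matches the note's description of the two tokenizers.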
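
The remark about the `<tool_call>` tag refers to the Hermes/ChatML-style convention in which the model wraps its function call in `<tool_call> ... </tool_call>` inside the assistant turn. The sketch below renders such a prompt with `apply_chat_template`; the `get_weather` tool is hypothetical, and whether this model's bundled chat template consumes the `tools` argument is an assumption, not something the commit shows.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("minpeter/Llama-3.2-1B-chatml-tool-v2")

# Hypothetical tool definition, used only to illustrate the prompt layout.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in Seoul?"}]

# Render the prompt. The model is then expected to answer with something like
#   <tool_call>{"name": "get_weather", "arguments": {"city": "Seoul"}}</tool_call>
# which is the behaviour the note says the v1 setup often failed to produce.
prompt = tok.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)
```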