---
license: llama3.2
datasets:
- teknium/OpenHermes-2.5
- NousResearch/hermes-function-calling-v1
base_model:
- minpeter/QLoRA-Llama-3.2-1B-chatml-tool-v2
- minpeter/Llama-3.2-1B-AlternateTokenizer-chatml
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- axolotl
- merge
---
## Model Performance Comparison (BFCL)
| task name | minpeter/Llama-3.2-1B-chatml-tool-v2 | meta-llama/Llama-3.2-1B-Instruct (measured) | meta-llama/Llama-3.2-1B-Instruct (reported) |
|-----------------|-----------------------------------|-----------------------------------|---------------------------------------|
| parallel_multiple | 0.000 | 0.025 | **0.15** |
| parallel | 0.000 | 0.035 | **0.36** |
| simple | **0.72** | 0.215 | 0.2925 |
| multiple | **0.695** | 0.17 | 0.335 |
*Parallel calls are not yet supported, so a score of 0 is expected on the parallel tasks. We plan to fix this in v3.
### Note
The only difference from Llama-3.2-1B-chatml-tool-v1 is the tokenizer: v2 uses AlternateTokenizer, which does not define the tool-related tokens (`<tools>`, `<tool_call>`, `<tool_response>`) as special tokens.

With the earlier tool-specific AlternateTokenizer, the `<tool_call>` tag was often not generated before a function call; in v2, training with the general AlternateTokenizer produced the tag reliably.

We still need to check whether this behavior holds for larger models (3B, 8B).
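Since the tool tags are plain text rather than special tokens, the prompt and output can be handled with ordinary string processing. Below is a minimal sketch of building a `<tools>` system prompt and parsing a `<tool_call>` block from generated text; the JSON schema inside the tags is assumed from the Hermes function-calling convention and may differ from what this model was actually trained on.

```python
import json
import re


def build_system_prompt(tools):
    # Wrap JSON tool schemas in <tools> tags (format assumed from the
    # Hermes function-calling convention; the exact schema may differ).
    tool_json = "\n".join(json.dumps(t) for t in tools)
    return (
        "You are a function-calling assistant. "
        "Available tools are listed below.\n"
        f"<tools>\n{tool_json}\n</tools>"
    )


def parse_tool_call(text):
    # Extract and decode the first <tool_call>...</tool_call> block, if any.
    m = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL)
    return json.loads(m.group(1)) if m else None


# Hypothetical model output containing a tool call:
out = '<tool_call>{"name": "get_weather", "arguments": {"city": "Seoul"}}</tool_call>'
call = parse_tool_call(out)
```

A `<tool_response>` message carrying the function result back to the model would be formatted the same way, with the result JSON wrapped in `<tool_response>` tags.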