---
license: llama3.2
datasets:
- teknium/OpenHermes-2.5
- NousResearch/hermes-function-calling-v1
base_model:
- minpeter/QLoRA-Llama-3.2-1B-chatml-tool-v2
- minpeter/Llama-3.2-1B-AlternateTokenizer-chatml
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- axolotl
- merge
---
## Model Performance Comparison (BFCL)

| task name         | minpeter/Llama-3.2-1B-chatml-tool-v2 | meta-llama/Llama-3.2-1B-Instruct (measured) | meta-llama/Llama-3.2-1B-Instruct (reported) |
|-------------------|--------------------------------------|---------------------------------------------|---------------------------------------------|
| parallel_multiple | 0.000                                | 0.025                                       | **0.15**                                    |
| parallel          | 0.000                                | 0.035                                       | **0.36**                                    |
| simple            | **0.72**                             | 0.215                                       | 0.2925                                      |
| multiple          | **0.695**                            | 0.17                                        | 0.335                                       |

*Parallel calls are not yet taken into account, so a score of 0 is expected on the `parallel` and `parallel_multiple` tasks. We plan to fix this in v3.
### Note

The only difference from Llama-3.2-1B-chatml-tool-v1 is that this model uses AlternateTokenizer, which does not define the tool-related tokens (`<tools>`, `<tool_call>`, `<tool_response>`) as special tokens.
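For reference, these tags come from the Hermes-style function-calling format used by the training datasets; since the tokenizer does not treat them as special tokens, they are encoded as ordinary subword tokens. Below is a minimal sketch of one exchange in that format (the system prompt wording and the JSON schema are illustrative assumptions, not the exact training template):

```text
<|im_start|>system
You are a function-calling assistant. You may call one of the functions defined inside <tools></tools>:
<tools>
[{"name": "get_weather", "description": "Get the current weather", "parameters": {"city": {"type": "string"}}}]
</tools><|im_end|>
<|im_start|>user
What is the weather in Seoul?<|im_end|>
<|im_start|>assistant
<tool_call>
{"name": "get_weather", "arguments": {"city": "Seoul"}}
</tool_call><|im_end|>
<|im_start|>tool
<tool_response>
{"temperature_c": 21}
</tool_response><|im_end|>
```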
With the earlier tool-token variant of the tokenizer, which did define these tags as special tokens, the `<tool_call>` tag was often not generated properly before a function call; in v2 we observed that the model performs well when trained with the general AlternateTokenizer instead (a quick way to check this is sketched below).

We still need to verify whether this behavior also holds for larger models (3B, 8B).
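Below is a minimal sketch of how one might run that check with transformers; the repo id is taken from the comparison table above, and the prompt is a placeholder rather than a verified recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "minpeter/Llama-3.2-1B-chatml-tool-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder conversation; a real check would use a system prompt
# carrying <tools> definitions as in the format sketch above.
messages = [{"role": "user", "content": "What is the weather in Seoul?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
completion = tokenizer.decode(output_ids[0][input_ids.shape[-1]:])

# Since v2 treats <tool_call> as plain text, the tag should appear
# verbatim in the decoded completion when the model calls a function.
print("<tool_call>" in completion)
print(completion)
```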