Model Performance Comparison (BFCL)

| task name | minpeter/Llama-3.2-1B-chatml-tool-v2 | meta-llama/Llama-3.2-1B-Instruct (measured) | meta-llama/Llama-3.2-1B-Instruct (reported) |
|---|---|---|---|
| parallel_multiple | 0.000 | 0.025 | 0.15 |
| parallel | 0.000 | 0.035 | 0.36 |
| simple | 0.72 | 0.215 | 0.2925 |
| multiple | 0.695 | 0.17 | 0.335 |

*Parallel calls are not handled by this model, so a score of 0 is expected on the parallel tasks. We plan to fix this in v3.

Note

The only difference from Llama-3.2-1B-chatml-tool-v1 is that it uses AlternateTokenizer, which does not define tool-related tokens (`<tools>`, `<tool_call>`, `<tool_response>`).

With the existing tool-AlternateTokenizer, the `<tool_call>` tag was not reliably generated before the function call; in v2, the model was observed to perform well when trained with the general AlternateTokenizer instead.
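Because the tags above are plain text rather than dedicated tokenizer tokens in v2, a downstream caller can recover function calls from the completion with ordinary string parsing. A minimal sketch of such a parser, assuming the model emits a JSON object between `<tool_call>` and `</tool_call>` (the `get_weather` function and its schema are illustrative assumptions, not part of this card):

```python
import json
import re

# Hypothetical completion in the ChatML tool-call style described above;
# the exact JSON schema inside <tool_call> is an assumption.
completion = (
    '<tool_call>\n'
    '{"name": "get_weather", "arguments": {"city": "Seoul"}}\n'
    '</tool_call>'
)

def parse_tool_calls(text: str) -> list[dict]:
    """Extract JSON payloads wrapped in <tool_call>...</tool_call> tags."""
    pattern = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
    return [json.loads(m) for m in pattern.findall(text)]

calls = parse_tool_calls(completion)
print(calls)  # [{'name': 'get_weather', 'arguments': {'city': 'Seoul'}}]
```

If the model fails to emit the opening `<tool_call>` tag (the failure mode seen with the tool-specific tokenizer), a parser like this returns an empty list, which is why the tag's reliability matters for the BFCL scores above.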

We still need to check whether this phenomenon also appears in larger models (3B, 8B).

Model size: 1.24B params · Tensor type: BF16 (Safetensors)
