Muning9999 committed · Commit 1a3d546 · verified · 1 Parent(s): 8cac626

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -22,7 +22,7 @@ First, we evaluate Hammer series on the Berkeley Function-Calling Leaderboard (B
  <img src="figures/bfcl.PNG" alt="overview" width="1480" style="margin: auto;">
  </div>
 
- The above table indicates that within the BFCL framework, our Hammer series consistently achieves corresponding state-of-the-art (SOTA) performance at comparable scales, particularly Hammer-7B, whose overall performance ranks second only to the proprietary GPT-4.
+ The above table indicates that within the BFCL framework, our Hammer series consistently achieves corresponding sota performance at comparable scales, particularly Hammer-7B, whose overall performance ranks second only to the proprietary GPT-4.
 
 
  In addition, we evaluated our Hammer series (1.5b, 4b, 7b) on other academic benchmarks to further show our model's generalization ability:
@@ -31,7 +31,7 @@ In addition, we evaluated our Hammer series (1.5b, 4b, 7b) on other academic ben
  <img src="figures/others.PNG" alt="overview" width="1000" style="margin: auto;">
  </div>
 
- Upon observing Hammer's performance across various benchmarks unrelated to the APIGen Function Calling Datasets, we find that Hammer demonstrates remarkably stable relative performance, which indicates the robustness of Hammers. In contrast, the baseline methods exhibit varying degrees of effectiveness across these other benchmarks.
+ Upon observing Hammer's performance across various benchmarks unrelated to the APIGen Function Calling Datasets, we find that Hammer demonstrates remarkably stable performance, which indicates the robustness of Hammers. In contrast, the baseline methods exhibit varying degrees of effectiveness across these other benchmarks.
 
  ## Requiements
  The code of Hammer-7b has been in the latest Hugging face transformers and we advise you to install `transformers>=4.37.0`.