Update README.md
README.md
@@ -63,6 +63,24 @@ First, we evaluate Hammer series on the Berkeley Function-Calling Leaderboard (B
 
 In addition, we evaluated our Hammer2.0 series (0.5b, 1.5b, 3b, 7b) on other academic benchmarks to further show our model's generalization ability:
 
+| Model | Size | Func-Name+Args Det. (F1 Func-Name \| F1 Args) | | | | | | | | | | F1 Average | |
+|:---------------------------:|:----:|:---------------------------------------------:|:-----:|:------------:|:-----:|:-----------:|:-----:|:---------------------:|:-----:|:-----------:|:-----:|:----------:|:-----:|
+| | | API-Bank L-1 | | API-Bank L-2 | | Tool-Alpaca | | SealTool(Single-Tool) | | Nexus Raven | | Func Name | Args |
+| GPT-4o-mini (Prompt) | -- | 95.1% | 89.3% | 84.3% | 67.5% | 64.3% | 54.7% | 87.9% | 86.0% | 91.7% | 84.6% | 84.7% | 76.4% |
+| qwen2-7b-instruct | 7B | 81.5% | 60.6% | 95.7% | 49.5% | 71.6% | 48.1% | 93.9% | 77.5% | 87.1% | 63.5% | 85.9% | 59.8% |
+| qwen1.5-4b-Chat | 4B | 55.3% | 59.8% | 46.7% | 38.5% | 35.4% | 17.0% | 48.4% | 62.3% | 29.0% | 33.7% | 43.0% | 42.2% |
+| qwen2-1.5b-instruct | 1.5B | 74.6% | 63.6% | 57.7% | 33.6% | 65.8% | 45.2% | 82.1% | 75.5% | 70.6% | 45.5% | 70.2% | 52.7% |
+| Gorilla-openfunctions-v2 | 7B | 69.2% | 70.3% | 48.8% | 54.7% | 72.9% | 51.3% | 93.2% | 91.1% | 72.8% | 68.4% | 71.4% | 67.2% |
+| GRANITE-20B-FUNCTIONCALLING | 20B | 90.4% | 77.8% | 78.9% | 59.2% | 77.3% | 58.0% | 94.9% | 92.7% | 94.5% | 75.1% | 87.2% | 72.6% |
+| xlam-7b-fc-r | 7B | 90.0% | 80.7% | 72.5% | 64.2% | 67.3% | 59.0% | 79.0% | 76.9% | 54.1% | 57.5% | 72.6% | 67.7% |
+| xlam-1b-fc-r | 1.3B | 94.9% | 83.7% | 91.8% | 64.3% | 64.9% | 50.6% | 90.7% | 80.4% | 64.4% | 54.8% | 81.3% | 66.8% |
+| Hammer-7b | 7B | 93.5% | 85.8% | 82.9% | 66.4% | 82.3% | 59.9% | 97.4% | 91.7% | 92.5% | 77.4% | 89.7% | 76.2% |
+| Hammer-4b | 4B | 91.6% | 81.5% | 77.6% | 61.0% | 85.1% | 57.0% | 96.4% | 92.4% | 81.7% | 64.9% | 86.5% | 71.4% |
+| Hammer-1.5b | 1.5B | 82.1% | 72.3% | 79.8% | 59.7% | 80.9% | 53.5% | 95.6% | 88.6% | 79.9% | 56.9% | 83.7% | 66.2% |
+| Hammer2.0-0.5B | 0.5B | 81.2% | 67.8% | 62.9% | 52.0% | 79.1% | 50.9% | 94.9% | 83.8% | 74.7% | 49.0% | 78.5% | 60.7% |
+| Hammer2.0-1.5B | 1.5B | 90.2% | 80.4% | 82.9% | 63.8% | 86.2% | 59.5% | 97.5% | 92.5% | 86.4% | 65.5% | 88.6% | 72.4% |
+| Hammer2.0-3B | 3B | 93.6% | 84.3% | 83.7% | 59.0% | 83.1% | 58.8% | 95.3% | 91.2% | 92.5% | 70.5% | 89.6% | 72.8% |
+| Hammer2.0-7B | 7B | 91.0% | 82.1% | 82.5% | 65.1% | 85.2% | 59.6% | 96.8% | 92.7% | 93.0% | 80.5% | 89.7% | 76.0% |
 
 ## Requirements
 The code for Hammer2.0-7b is included in the latest Hugging Face `transformers`; we advise installing `transformers>=4.37.0`.