teknium committed on
Commit
0a29d8b
1 Parent(s): 174f590

Update README.md

Files changed (1)
  1. README.md +20 -0
README.md CHANGED
@@ -235,6 +235,26 @@ Average: 41.65
  | | |mc2 |0.5911|± |0.0158|
  ```
 
+ # Function Calling Evaluations
+
+ We worked with Fireworks.AI on evaluations by starting off with their Function Calling eval dataset, fixing some unsolvable ones, and generating a second eval dataset for JSON mode.
+
+ ## Function Calling Accuracy: 91%
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/XF3Zii4-QhE2yjWwHr_v4.png)
+
+ ## JSON Mode Accuracy: 84%
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/8H2iyjh5wyP2FtLq2LCed.png)
+
+ Run the evaluator yourself using @interstellarninja's codebase here:
+ https://github.com/interstellarninja/function-calling-eval
+
+ You can find the evaluation datasets here:
+ https://huggingface.co/datasets/NousResearch/func-calling-eval
+ https://huggingface.co/datasets/NousResearch/json-mode-eval
+
+
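Note (not part of the commit): the added section reports accuracy figures but does not say how they are scored. As a rough illustration only, the sketch below assumes an exact-match scheme in which a model output counts as correct when it parses as JSON and equals the expected object. The function names, the sample outputs, and the scoring rule itself are hypothetical. They are not taken from the linked evaluator or datasets, which may score differently.

```python
import json


def parse_json(text):
    """Return the parsed JSON value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None


def exact_match_accuracy(outputs, references):
    """Fraction of outputs whose parsed JSON equals the reference object.

    Assumed scoring rule for illustration; the real evaluator may differ.
    """
    if len(outputs) != len(references):
        raise ValueError("outputs and references must have the same length")
    if not outputs:
        return 0.0
    hits = sum(
        1 for out, ref in zip(outputs, references) if parse_json(out) == ref
    )
    return hits / len(outputs)


# Hypothetical model outputs and expected calls (not drawn from the datasets).
outputs = [
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',  # correct
    '{"name": "get_weather", "arguments": {"city": "Rome"}}',   # wrong argument
    "not json at all",                                          # unparseable
    '{"arguments": {"city": "Oslo"}, "name": "get_weather"}',   # key order differs
]
references = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "get_weather", "arguments": {"city": "Milan"}},
    {"name": "get_weather", "arguments": {"city": "Berlin"}},
    {"name": "get_weather", "arguments": {"city": "Oslo"}},
]

accuracy = exact_match_accuracy(outputs, references)
print(f"{accuracy:.0%}")  # prints "50%"
```

Comparing parsed objects rather than raw strings means key order and whitespace do not affect the score, which is one reasonable design choice for JSON outputs.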
  # Inference Code
 
  Here is example code using HuggingFace Transformers to inference the model (note: in 4bit, it will require around 5GB of VRAM)