nph4rd committed on
Commit
912b72d
1 Parent(s): df4102b

Update README.md

  1. README.md +7 -1
README.md CHANGED
@@ -49,4 +49,10 @@ We used the [WaveUI](https://huggingface.co/datasets/agentsea/wave-ui) dataset f
 
  ## Evaluation
 
- We will release a full evaluation report soon. Stay tuned! :)
+ We calculated the mean IoU over 1024 examples of the test set using 3 different closed-source models: Gemini Pro 1.5, Claude Sonnet 3.5 and GPT 4o. We also ran this same calculation using the PaliGemma WaveUI fine-tunes. We obtained the following values:
+
+ - Gemini 1.5: 0.12
+ - Claude: 0.05
+ - GPT: 0.05
+ - **PaliGemma Widgetcap+WaveUI 448: 0.40**
+ - PaliGemma WaveUI 896: 0.49
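
For readers unfamiliar with the metric, here is a minimal sketch of how a mean IoU like the ones reported above could be computed. It is not the evaluation script used for this commit. The box format `(x_min, y_min, x_max, y_max)`, the helper names `iou` and `mean_iou`, and the choice to score an unparseable prediction (`None`) as 0 are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in any consistent unit.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap rectangle; width/height clamp to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Optional[Box]], targets: Sequence[Box]) -> float:
    """Average IoU over paired predictions and ground-truth boxes.

    Assumption: a prediction of None (e.g. a model response that could not
    be parsed into a box) counts as IoU 0 rather than being skipped.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not targets:
        raise ValueError("need at least one example")
    scores = [0.0 if p is None else iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

Called with 1024 predicted boxes and the matching ground-truth boxes, `mean_iou` returns a single number in the range 0 to 1, comparable in form to the values listed above. Whether failed predictions are scored as 0 or dropped changes the result noticeably, so that choice is worth stating alongside any reported number.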