add eval tables
README.md
CHANGED
@@ -171,7 +171,38 @@ We perform checkpoint selection based on validation sets of TODO, and select the

 TODO: beautiful plots of shots scaling laws.

-
+| 80B IDEFIX             | 0-shots   | 4-shots   | 8-shots   | 16-shots   | 32-shots   |
+|:-----------------------|:----------|:----------|:----------|:-----------|:-----------|
+| VQAv2                  | 60.0%     | 63.4%     | 64.5%     | 65.4%      | 66.0%      |
+| OKVQA                  | 45.2%     | 52.3%     | 55.2%     | 56.8%      | 58.0%      |
+| TextVQA                | 30.9%     | 34.7%     | 35.4%     | 36.3%      | 37.0%      |
+| TextCaps               | 56.8%     | 77.9%     | 82.5%     | 85.2%      | 86.1%      |
+| Coco                   | 91.8%     | 109.3%    | 113.9%    | 116.6%     | 116.5%     |
+| NoCaps                 | 65.0%     | 101.1%    | 104.7%    | 105.6%     | 106.3%     |
+| Flickr                 | 53.7%     | 68.9%     | 74.3%     | 76.8%      | 78.9%      |
+| ImageNet1k             | 74.3%     |           |           |            |            |
+| VizWiz                 | 36.0%     | 45.8%     | 49.3%     | 51.5%      | 52.6%      |
+| VisDial (NDCG)         | 48.8%     | 48.6%     | 48.1%     |            |            |
+| HatefulMemes (ROC AUC) | 60.6%     | 58.7%     | 57.8%     | 56.0%      | 54.3%      |
+| ScienceQA (accuracy)   | 68.9%     | 66.3%     | -         | -          | -          |
+| RenderedSST2           | 60.5%     | 63.9%     | 64.3%     | 66.9%      | 68.0%      |
+
+
+| 9B IDEFIX              | 0-shots   | 4-shots   | 8-shots   | 16-shots   | 32-shots   |
+|:-----------------------|:----------|:----------|:----------|:-----------|:-----------|
+| VQAv2                  | 50.9%     | 55.6%     | 56.4%     | 57.2%      | 57.9%      |
+| OKVQA                  | 38.4%     | 45.8%     | 47.3%     | 49.0%      | 50.4%      |
+| TextVQA                | 25.9%     | 26.8%     | 26.8%     | 28.1%      | 28.2%      |
+| TextCaps               | 25.4%     | 60.9%     | 63.7%     | 68.0%      | 69.7%      |
+| Coco                   | 46.0%     | 88.9%     | 96.9%     | 99.6%      | 101.5%     |
+| NoCaps                 | 36.8%     | 78.5%     | 84.3%     | 87.2%      | 88.6%      |
+| Flickr                 | 27.3%     | 52.2%     | 60.3%     | 65.0%      | 66.0%      |
+| ImageNet1k             | 70.7%     |           |           |            |            |
+| VizWiz                 | 35.5%     | 42.0%     | 42.8%     | 45.0%      | 45.9%      |
+| VisDial (NDCG)         | 48.7%     | 48.1%     | 47.5%     |            |            |
+| HatefulMemes (ROC AUC) | 51.7%     | 52.6%     | 52.3%     | 52.5%      | 53.1%      |
+| ScienceQA (accuracy)   | 44.2%     | 41.6%     | -         | -          | -          |
+| RenderedSST2           | 61.8%     | 60.6%     | 66.8%     | 66.0%      | 63.4%      |


 # Technical Specifications