VictorSanh
committed on
Commit
•
65edf33
1
Parent(s):
7f05f40
trying that
README.md
CHANGED
@@ -158,9 +158,9 @@ We perform checkpoint selection based on validation sets of VQAv2, TextVQA, OKVQ
|
|
158 |
|
159 |
As opposed to Flamingo, we did not train IDEFICS on video-text pairs datasets, and as such, we did not evaluate the model on video-text benchmarks like Flamingo did. We leave that evaluation for a future iteration.
|
160 |
|
161 |
-
<img src="./assets/Figure_Evals_IDEFIX.png" width="55%">
|
162 |
|
163 |
-
| Model | Shots |
|
164 |
|:-----------|--------:|---------------------:|---------------------:|-----------------------:|----------------------:|-------------------:|---------------:|-----------------:|-----------------:|-----------------:|-------------------------:|-----------------------:|--------------------------:|----------------------------------:|
|
165 |
| IDEFIX 80B | 0 | 60.0 | 45.2 | 30.9 | 36.0 | 56.8 | 91.8 | 65.0 | 53.7 | 48.8 | 60.6 | 68.9 | 60.5 | 8.0 (18.8/22.5) |
|
166 |
| | 4 | 63.6 | 52.4 | 34.4 | 40.4 | 72.7 | 110.3 | 99.6 | 73.7 | 48.4 | 57.8 | 58.9 | 66.6 | - |
|
@@ -174,8 +174,12 @@ As opposed to Flamingo, we did not train IDEFICS on video-text pairs datasets, a
|
|
174 |
| | 16 | 57.0 | 48.4 | 27.9 | 42.6 | 67.4 | 99.7 | 89.4 | 64.5 | - | 50.9 | - | 67.8 | - |
|
175 |
| | 32 | 57.9 | 49.6 | 28.3 | 43.7 | 68.1 | 98.0 | 90.5 | 64.4 | - | 49.8 | - | 67.0 | - |
|
176 |
|
177 |
-
|
178 |
-
|
179 |
|:-----------|--------:|-----------:|
|
180 |
| IDEFIX 80B | 16, 1k support set | 65.4 |
|
181 |
| | 16, RICES 5k support set | 72.9 |
|
@@ -198,11 +202,7 @@ Fairness Evaluations:
|
|
198 |
| | 16 | 95.8 | 43.0 | 46.1 |
|
199 |
| | 32 | 96.1 | 35.1 | 44.9 |
|
200 |
|
201 |
-
We also report results where the priming samples are selected to be similar (i.e. close in a vector space) to the queried instance.
|
202 |
|
203 |
-
TODO: table with rices shots
|
204 |
-
|
205 |
-
We note that since we trained on PMD, which contains COCO, the evaluation numbers on COCO are not directly comparable with Flamingo and OpenFlamingo, since they did not explicitly have this dataset in the training mixture.
|
206 |
|
207 |
# Technical Specifications
|
208 |
|
|
|
158 |
|
159 |
As opposed to Flamingo, we did not train IDEFICS on video-text pairs datasets, and as such, we did not evaluate the model on video-text benchmarks like Flamingo did. We leave that evaluation for a future iteration.
|
160 |
|
161 |
+
<!-- <img src="./assets/Figure_Evals_IDEFIX.png" width="55%"> <img width=120/> -->
|
162 |
|
163 |
+
| Model | Shots | VQAv2<br>OE VQA acc.<br> | OKVQA<br>OE VQA acc.<br> | TextVQA<br>OE VQA acc.<br> | VizWiz<br>OE VQA acc.<br> | TextCaps<br>CIDEr<br> | Coco<br>CIDEr<br> | NoCaps<br>CIDEr | Flickr<br>CIDEr | VisDial<br>NDCG | HatefulMemes<br>ROC AUC | ScienceQA<br>acc. | RenderedSST2<br>acc. | Winoground<br>group (text/image) |
|
164 |
|:-----------|--------:|---------------------:|---------------------:|-----------------------:|----------------------:|-------------------:|---------------:|-----------------:|-----------------:|-----------------:|-------------------------:|-----------------------:|--------------------------:|----------------------------------:|
|
165 |
| IDEFIX 80B | 0 | 60.0 | 45.2 | 30.9 | 36.0 | 56.8 | 91.8 | 65.0 | 53.7 | 48.8 | 60.6 | 68.9 | 60.5 | 8.0 (18.8/22.5) |
|
166 |
| | 4 | 63.6 | 52.4 | 34.4 | 40.4 | 72.7 | 110.3 | 99.6 | 73.7 | 48.4 | 57.8 | 58.9 | 66.6 | - |
|
|
|
174 |
| | 16 | 57.0 | 48.4 | 27.9 | 42.6 | 67.4 | 99.7 | 89.4 | 64.5 | - | 50.9 | - | 67.8 | - |
|
175 |
| | 32 | 57.9 | 49.6 | 28.3 | 43.7 | 68.1 | 98.0 | 90.5 | 64.4 | - | 49.8 | - | 67.0 | - |
|
176 |
|
177 |
+
We note that since we trained on PMD, which contains COCO, the evaluation numbers on COCO are not directly comparable with Flamingo and OpenFlamingo, since they did not explicitly have this dataset in the training mixture.
|
178 |
+
|
179 |
+
For ImageNet-1k, we also report results where the priming samples are selected to be similar (i.e. close in a vector space) to the queried instance.
|
180 |
+
|
181 |
+
ImageNet-1k Evaluation:
|
182 |
+
| Model | Shots | ImageNet-1k |
|
183 |
|:-----------|--------:|-----------:|
|
184 |
| IDEFIX 80B | 16, 1k support set | 65.4 |
|
185 |
| | 16, RICES 5k support set | 72.9 |
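
The "RICES" rows above use priming samples that are close to the queried instance in a vector space. The README does not say which embedding or distance is used, so the following is only a minimal sketch of the general idea, assuming precomputed embeddings and cosine similarity. The names `cosine_similarity` and `select_similar_shots` are hypothetical and not part of any IDEFICS code.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar_shots(
    query_embedding: Sequence[float],
    support_embeddings: Sequence[Sequence[float]],
    num_shots: int,
) -> List[int]:
    """Return indices of the `num_shots` support items closest to the query.

    Assumption: similarity is cosine similarity over precomputed embeddings.
    Ties are broken by the lower support index so the result is deterministic.
    """
    scored = [
        (cosine_similarity(query_embedding, emb), idx)
        for idx, emb in enumerate(support_embeddings)
    ]
    # Sort by descending similarity, then ascending index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:num_shots]]


# Toy usage: four 2-d "embeddings" in the support set, one query.
support = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
shots = select_similar_shots([1.0, 0.0], support, num_shots=2)
```

In an actual evaluation the selected indices would be used to build the few-shot prompt for each query, in place of randomly drawn priming samples.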
|
|
|
202 |
| | 16 | 95.8 | 43.0 | 46.1 |
|
203 |
| | 32 | 96.1 | 35.1 | 44.9 |
|
204 |
|
205 |
|
206 |
|
207 |
# Technical Specifications
|
208 |
|