add per bucket std description
README.md
CHANGED
@@ -358,13 +358,14 @@ The [notebook](https://huggingface.co/spaces/HuggingFaceM4/m4-bias-eval/blob/mai
 
 Besides, we also computed the classification accuracy on FairFace for both the base and instructed models:
 
-| Model | Shots | <nobr>FairFaceGender<br>acc. (std)</nobr> | <nobr>FairFaceRace<br>acc. (std)</nobr> | <nobr>FairFaceAge<br>acc. (std)</nobr> |
+| Model | Shots | <nobr>FairFaceGender<br>acc. (std*)</nobr> | <nobr>FairFaceRace<br>acc. (std*)</nobr> | <nobr>FairFaceAge<br>acc. (std*)</nobr> |
 | :--------------------- | --------: | ----------------------------: | --------------------------: | -------------------------: |
 | IDEFICS 80B | 0 | 95.8 (1.0) | 64.1 (16.1) | 51.0 (2.9) |
 | IDEFICS 9B | 0 | 94.4 (2.2) | 55.3 (13.0) | 45.1 (2.9) |
 | IDEFICS 80B Instruct | 0 | 95.7 (2.4) | 63.4 (25.6) | 47.1 (2.9) |
 | IDEFICS 9B Instruct | 0 | 92.7 (6.3) | 59.6 (22.2) | 43.9 (3.9) |
 
+*Per bucket standard deviation. Each bucket represents a combination of race and gender from the [FairFace](https://huggingface.co/datasets/HuggingFaceM4/FairFace) dataset.
 ## Other limitations
 
 - The model currently will offer medical diagnosis when prompted to do so. For example, the prompt `Does this X-ray show any medical problems?` along with an image of a chest X-ray returns `Yes, the X-ray shows a medical problem, which appears to be a collapsed lung.`
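Note on the footnote added in this commit: it describes the parenthesized values as a per-bucket standard deviation, where a bucket is a (race, gender) combination. The diff does not show how the numbers were produced, so the following is only a minimal sketch of one plausible reading: compute accuracy inside each bucket, then report the mean and the spread of those per-bucket accuracies. The function names, the `(race, gender, is_correct)` record layout, the use of the population standard deviation, and the choice to average over buckets rather than over samples are all assumptions for illustration, not the authors' evaluation code.

```python
from collections import defaultdict
from statistics import mean, pstdev


def per_bucket_accuracy(records):
    """Return accuracy (in percent) for each (race, gender) bucket.

    `records` is a hypothetical iterable of (race, gender, is_correct) tuples.
    """
    hits = defaultdict(list)
    for race, gender, is_correct in records:
        hits[(race, gender)].append(1.0 if is_correct else 0.0)
    return {bucket: 100.0 * mean(vals) for bucket, vals in hits.items()}


def summarize(records):
    """Return (mean accuracy across buckets, std across buckets).

    Assumption: population standard deviation (pstdev); the README does not
    say whether the sample or population form was used.
    """
    values = list(per_bucket_accuracy(records).values())
    return mean(values), pstdev(values)
```

Under this reading, a large value in parentheses (for example in the FairFaceRace column) would indicate that accuracy differs substantially from one race and gender combination to another, even when the headline accuracy looks similar across models.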