Update README.md
Browse files
README.md
CHANGED
@@ -692,7 +692,11 @@ WiP
|
|
692 |
|
693 |
## Ethical Considerations and Limitations
|
694 |
|
695 |
-
|
|
|
|
|
|
|
|
|
696 |
|
697 |
---
|
698 |
|
|
|
692 |
|
693 |
## Ethical Considerations and Limitations
|
694 |
|
695 |
+
We examine the presence of undesired societal and cognitive biases present in this model using different benchmarks. For societal biases, we test performance using our Spanish version of the BBQ dataset (Parrish et al., 2022). We report that while accuracy in disambiguated settings is relatively high for a base model, the model performs very poorly in ambiguous settings. Further examination of the differences in accuracy scores as described in CITE KOBBQ reveals a low-to-moderate alignment between the model's responses and societal biases. These largely vanish in disambiguated setting. Our analyses on societal biases show that while these biases are capable of interfering with model performance as expressed in the results on the BBQ dataset, their interference with task performance is somewhat limited given the results on the disambiguated dataset. We highlight that our analyses of these biases are by no means exhaustive and are limited by the relative scarcity of adequate resources in all languages present in the training data. We aim to gradually extend and expand our analyses in future work.
|
696 |
+
|
697 |
+
Our cognitive bias analysis focuses on positional effects in 0-shot settings, and majority class bias in few-shot settings. For positional effects, we leverage the ARC Multiple Choice Question dataset (Clark et al., 2018). We observe weak primacy effects, whereby the model shows a preference for answers towards the beginning of the list of provided answers. We measure effects of majority class effects in few-shot settings using SST-2 (Socher et al., 2013). We detect significant effects, albeit extremely weak ones, implying that outputs are generally robust against variations in prompt format, and order.
|
698 |
+
|
699 |
+
We highlight that these results can be expected from a pretrained model that has not yet been instruction-tuned or aligned. These tests are performed in order to show the biases the model may contain. We urge developers to take them into account and perform safety testing and tuning tailored to their specific applications of the model.
|
700 |
|
701 |
---
|
702 |
|