metadata
license: bigscience-openrail-m
datasets:
- trivia_qa
language:
- en
tags:
- trl
- transformers
- rlhf
starcoderbase-triviaqa
This model is based on https://huggingface.co/bigcode/starcoderbase and is fine-tuned on the TriviaQA dataset using reinforcement learning via TRL's TextEnvironment
(https://github.com/huggingface/trl/pull/424).
Out of Scope Use
- Replacing human expertise
Bias, Risks, and Limitations
- Inherits bias, risks, and limitations from the base model, StarCoderBase, as described in its model card.
- Retains biases present in the Stack Exchange dataset. Per the latest Stack Overflow developer survey (Stack Overflow constitutes a significant part of the Stack Exchange data), most respondents identified themselves as White or European, men, between 25 and 34 years old, and based in the US (with a significant share of respondents from India).
- May generate answers that are incorrect or misleading.
- May copy answers from the training data verbatim.
- May generate language that is hateful or promotes discrimination.
- May generate language that is offensive to direct or indirect users or to people or groups mentioned.
Recommendations
- Answers should be validated through the use of external sources.
- Disparities between the data contributors and the direct and indirect users of the technology should inform developers in assessing what constitutes an appropriate use case.
- Further research is needed to attribute model generations to sources in the training data, especially in cases where the model copies answers from the training data.
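One simple way to act on the first recommendation is to compare a generated answer against a set of trusted reference answers. The sketch below is illustrative only and is not part of this model's training or evaluation code: the function names `normalize_answer` and `matches_reference` are hypothetical, and the normalization steps (lowercasing, stripping punctuation and English articles) are an assumption modeled on common question-answering exact-match scoring.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def matches_reference(model_answer: str, reference_aliases: list[str]) -> bool:
    """Return True if the normalized answer equals any normalized reference alias.

    `reference_aliases` is assumed to come from an external, trusted source
    (hypothetical here), not from the model itself.
    """
    normalized = normalize_answer(model_answer)
    return any(normalized == normalize_answer(alias) for alias in reference_aliases)
```

An exact-match check like this only flags disagreement with the references supplied; an answer that fails the check may still be correct, and one that passes still depends on the quality of the external source.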