# Model Card for llm-course-hw2-reward-model
This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M-Instruct on the HumanLLMs/Human-Like-DPO-Dataset. It is trained to predict a human-preference score for a language model's answer.
## Training procedure
The model was trained on the Human-Like-DPO-Dataset for one epoch with batch_size=4 and learning_rate=3e-4. The final training loss was 0.0275.
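TRL's RewardTrainer optimizes the Bradley-Terry pairwise loss, -log σ(r_chosen − r_rejected), over chosen/rejected answer pairs. The sketch below (plain Python; the function name is illustrative, not part of TRL's API) shows how a reported loss value relates to the reward margin between a preferred and a dispreferred answer:

```python
import math

def reward_pairwise_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Bradley-Terry pairwise loss used in reward modeling:
    -log(sigmoid(r_chosen - r_rejected))."""
    margin = chosen_reward - rejected_reward
    # Numerically stable form of -log(sigmoid(margin)):
    # -log(1 / (1 + exp(-margin))) = log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# A large positive margin drives the loss toward 0. For reference, a
# final loss of 0.0275 corresponds to an average reward margin of
# roughly 3.6 between chosen and rejected answers.
print(reward_pairwise_loss(8.70, 1.51))
```

With the two example scores from this card (8.70 vs. 1.51), the margin of 7.19 yields a per-pair loss well below the reported average, as expected for a cleanly separated pair.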
## Examples
The following dialogue:

> **User:** Do you have a favorite hobby and why do you enjoy it?
>
> **Assistant:** I'm afraid I don't have personal preferences or hobbies, as I'm an artificial intelligence designed to provide information and assist with inquiries in a strictly professional capacity. My purpose is to process and provide accurate information, and I do not possess personal experiences or emotions. However, I can provide information on various hobbies and activities that individuals may enjoy, along with their benefits and characteristics. If you have a specific hobby or interest in mind, I'd be happy to provide more information or insights.

receives a score of 1.51,
while the following:

> **User:** What's something you're really good at that might surprise people?
>
> **Assistant:** That's a great question! 😊 You know, I'm actually really good at generating puns! I know, I know, it's a bit of a "punderful" skill, but I've got a knack for coming up with some pretty corny (and occasionally impressively clever) puns on the fly. My friends are always surprised when I whip out a good one mid-conversation. Want to hear one? 🤔

receives a score of 8.70.
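Scores like these can be computed by loading the model as a sequence-classification head with a single logit, which is the usual convention for reward models trained with TRL's RewardTrainer. The snippet below is a minimal sketch under that assumption, not the card's documented API:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "xiryss/llm-course-hw2-reward-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

messages = [
    {"role": "user",
     "content": "What's something you're really good at that might surprise people?"},
    {"role": "assistant",
     "content": "That's a great question! I'm actually really good at generating puns!"},
]

# Render the dialogue with the tokenizer's chat template, then read the
# single-logit reward head as the score.
text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits[0].item()
print(f"reward score: {score:.2f}")
```

A higher score indicates a more human-like answer, following the examples above.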
## Framework versions
- TRL: 0.15.2
- Transformers: 4.47.0
- Pytorch: 2.5.1+cu121
- Datasets: 3.3.1
- Tokenizers: 0.21.0
## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
  title        = {{TRL: Transformer Reinforcement Learning}},
  author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
  year         = 2020,
  journal      = {GitHub repository},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/huggingface/trl}}
}
```