--- base_model: base_model datasets: dataset_name library_name: transformers model_name: online-dpo-qwen2-2 tags: - trl - online-dpo - generated_from_trainer licence: license --- # Model Card for Model name This model is a fine-tuned version of [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) on the https://huggingface.co/datasets/trl-lib/ultrafeedback-prompt dataset.