qgallouedec/online-dpo-qwen2-2
Text Generation
Transformers
Safetensors
PEFT
dataset_name
qwen2
trl
online-dpo
Generated from Trainer
conversational
text-generation-inference
Inference Endpoints
6519ac4
online-dpo-qwen2-2/README.md
Commit History
Training in progress, epoch 1
00dc223
verified
qgallouedec
HF staff
committed on
Sep 25
Update README.md
e4af263
verified
qgallouedec
HF staff
committed on
Sep 25
Update README.md
c16f001
verified
qgallouedec
HF staff
committed on
Sep 25
Update README.md
4cd717a
verified
qgallouedec
HF staff
committed on
Sep 25
End of training
4bc4863
verified
qgallouedec
HF staff
committed on
Sep 25