sfulay/zephyr-7b-dpo-full-ultrabin-reward-scale-1-rpo
Safetensors
mistral
trl
dpo
Generated from Trainer
License: apache-2.0
Commit History
Training in progress, step 200
82e0b88
verified
sfulay
committed on
Aug 15, 2024
Training in progress, step 100
f04d5f1
verified
sfulay
committed on
Aug 15, 2024
initial commit
35543fc
verified
sfulay
committed on
Aug 14, 2024