zephyr-7b-dpo-full-gpt_consistent-reward-scale-1 / model-00003-of-00003.safetensors

Commit History

Training in progress, step 436
905b239
verified

sfulay committed on

Training in progress, step 400
b9f4a21
verified

sfulay committed on

Training in progress, step 300
92bb885
verified

sfulay committed on

Training in progress, step 200
775b078
verified

sfulay committed on