RyanYr/self-correct_ministral8Bit_mMQA_dpo_iter2
Tags: Text Generation, Transformers, Safetensors, mistral, Generated from Trainer, trl, dpo, conversational, text-generation-inference, Inference Endpoints, arxiv:2305.18290
Commit History (branch: main)
592d75c  Model save  (RyanYr, verified, committed about 23 hours ago)
51dc796  Training in progress, step 51  (RyanYr, verified, committed about 23 hours ago)
827856e  Training in progress, step 48  (RyanYr, verified, committed about 23 hours ago)
b07071b  Training in progress, step 42  (RyanYr, verified, committed about 23 hours ago)
e13a2ab  Training in progress, step 36  (RyanYr, verified, committed about 23 hours ago)
9fb6ffa  Training in progress, step 30  (RyanYr, verified, committed about 24 hours ago)
191e247  Training in progress, step 24  (RyanYr, verified, committed about 24 hours ago)
5f682ac  Training in progress, step 18  (RyanYr, verified, committed about 24 hours ago)
12289ad  Training in progress, step 12  (RyanYr, verified, committed about 24 hours ago)
a93a1f4  Training in progress, step 6  (RyanYr, verified, committed about 24 hours ago)
970e7f9  initial commit  (RyanYr, verified, committed 1 day ago)