Shotaro30678/response_generator_DPO
Text Generation
PEFT
Safetensors
Shotaro30678/rlhf-RG-trl-style-v3
English
llama
trl
unsloth
conversational
4-bit precision
bitsandbytes
License: apache-2.0
Commit History
967ac6f  Update README.md (verified; Shotaro30678 committed on Aug 26)
77baa3a  Update README.md (verified; Shotaro30678 committed on Aug 26)
2910e15  Update README.md (verified; Shotaro30678 committed on Aug 26)
4adb4e3  Update README.md (verified; Shotaro30678 committed on Aug 26)
caf3580  Update README.md (verified; Shotaro30678 committed on Aug 26)
14e94e9  Upload model trained with Unsloth (verified; Shotaro30678 committed on Aug 26)
471ab71  Trained with Unsloth (verified; Shotaro30678 committed on Aug 26)
a38c73a  Trained with Unsloth (verified; Shotaro30678 committed on Aug 26)
090bed7  Upload README.md with huggingface_hub (verified; Shotaro30678 committed on Aug 26)
decc354  initial commit (verified; Shotaro30678 committed on Aug 26)