lomahony/pythia-2.8b-helpful-sft-epoch2
Text Generation
Pythia-2.8b, supervised fine-tuned and DPO fine-tuned on the helpful subset of the Anthropic-hh-rlhf dataset for a second epoch.
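
A minimal sketch of how this checkpoint might be loaded and prompted with the Hugging Face `transformers` library. The helper names `build_prompt` and `generate` are hypothetical, and the `Human:`/`Assistant:` turn markers are an assumption based on the format of the Anthropic-hh-rlhf dialogues; this listing does not document a prompt template.

```python
# Repository name taken from the listing above.
MODEL_ID = "lomahony/pythia-2.8b-helpful-sft-epoch2"


def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt with Human/Assistant turn markers.

    Assumption: the model responds best to the turn format used in the
    Anthropic-hh-rlhf data it was fine-tuned on.
    """
    return f"\n\nHuman: {user_message.strip()}\n\nAssistant:"


def generate(user_message: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return only the newly generated text."""
    # Imported lazily so build_prompt works without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    # Greedy decoding keeps the sketch deterministic.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens so only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("How do I brew a good cup of tea?")` downloads the weights on first use, so it needs network access and enough memory for a 2.8B-parameter model.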