vincentmin/opt-125m-eli5-rl-finetune-128-8-8-1.4e-5_ada • Reinforcement Learning • Updated Apr 10, 2023
dshin/flan-t5-ppo-user-a-allenai-prosocial-dialog-testing-upload • Reinforcement Learning • Updated Apr 12, 2023 • 1