Max
reciprocate
AI & ML interests
Reward models
reciprocate/mistral-7b-gsm8k-code-rm • Text Classification • 24 • 3
reciprocate/mistral-7b-rm • Text Classification • 16 • 2
reciprocate/rm_beluga-7b_hh-full • Text Classification • 20
reciprocate/rm-llama2-7b-gsm8k • Text Generation • 18
reciprocate/llama2-7b-gsm8k • Text Generation • 21 • 1
reciprocate/shepherd-13b • Text Generation • 27 • 1
reciprocate/tiny-llama • Text Generation • 94 • 2
reciprocate/vicuna-13b_rm_oasst-hh • Text Classification • 21
reciprocate/openllama-13b-rlhf-v0 • Text Generation • 20
reciprocate/openllama-13b_rm_oasst-hh • Text Classification • 24
reciprocate/ultra-annotated-200k • Viewer • 208k • 70
reciprocate/dpo-objective-v0.2 • Viewer • 384 • 52
reciprocate/tinygsm_interpreter_1M • Viewer • 1M • 75
Viewer • 541 • 54
reciprocate/dpo_mix-zero-math-untoxic • Viewer • 6.91k • 65
reciprocate/dpo_mix-7k_untoxic • Viewer • 7.29k • 58 • 2
reciprocate/tinygsm_mixtral_12M • Viewer • 12M • 258 • 1
reciprocate/dpo_ultra-capybara-code_filtered-best • Viewer • 35.2k • 53 • 1
Viewer • 6.17k • 83 • 2
reciprocate/dpo_ultra-capybara_filtered-best • Viewer • 25.6k • 68