Max PRO

reciprocate

maxreciprocate

AI & ML interests

Reward models

models 18

reciprocate/mistral-7b-gsm8k-code-rm

Text Classification • 7B • Updated Mar 24, 2024 • 6 • 3

reciprocate/mistral-7b-rm

Text Classification • Updated Feb 15, 2024 • 6 • 2

reciprocate/rm_beluga-7b_hh-full

Text Classification • Updated Sep 25, 2023 • 3

reciprocate/rm-llama2-7b-gsm8k

Text Generation • Updated Sep 14, 2023 • 7

reciprocate/llama2-7b-gsm8k

Text Generation • Updated Aug 29, 2023 • 4 • 1

reciprocate/shepherd-13b

Text Generation • Updated Aug 24, 2023 • 3 • 1

reciprocate/tiny-llama

Text Generation • Updated Aug 6, 2023 • 3

reciprocate/vicuna-13b_rm_oasst-hh

Text Classification • Updated Jun 27, 2023 • 6

reciprocate/openllama-13b-rlhf-v0

Text Generation • Updated Jun 22, 2023 • 3

reciprocate/openllama-13b_rm_oasst-hh

Text Classification • Updated Jun 21, 2023 • 3

datasets 35

reciprocate/kaggle-lmarena-synth-50k

Viewer • Updated Mar 23, 2025 • 50.7k • 22

reciprocate/ultra-annotated-200k

Viewer • Updated Sep 1, 2024 • 208k • 21

reciprocate/dpo-objective-v0.2

Viewer • Updated May 14, 2024 • 384 • 12

reciprocate/tinygsm_interpreter_1M

Viewer • Updated May 6, 2024 • 1M • 16

reciprocate/dpo_untoxic

Viewer • Updated Apr 7, 2024 • 541 • 36

reciprocate/dpo_mix-zero-math-untoxic

Viewer • Updated Mar 29, 2024 • 6.91k • 17

reciprocate/dpo_mix-7k_untoxic

Viewer • Updated Mar 26, 2024 • 7.29k • 71 • 2

reciprocate/tinygsm_mixtral_12M

Viewer • Updated Mar 24, 2024 • 12M • 77 • 1

reciprocate/dpo_ultra-capybara-code_filtered-best

Viewer • Updated Mar 19, 2024 • 35.2k • 11 • 1

reciprocate/tinygsm_dpo

Viewer • Updated Mar 15, 2024 • 6.17k • 64 • 2
