Hugging Face
Models
368
Sort: Trending
Active filters:
reward-trainer
vwxyzjn/rm_zephyr_new2 • Text Classification • Updated May 6, 2024 • 14
MahmoudMohamed/Reward_Model • Text Classification • Updated May 8, 2024 • 13
Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1e-05_bs2_g4 • Updated May 9, 2024
Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1.41e-05_bs2_g4 • Updated May 11, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_contrast_loraR32_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1e-06_bs2_g4 • Updated May 12, 2024
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12, 2024
thorirhrafn/gpt1B_reward_model3 • Updated May 13, 2024 • 4
vwxyzjn/rm • Text Classification • Updated Jun 20, 2024 • 16
vwxyzjn/rm1 • Text Classification • Updated May 21, 2024 • 16
calkp/reward_model • Text Classification • Updated May 22, 2024 • 15
ianmiller314/results • Text Classification • Updated May 24, 2024 • 25
mnoukhov/pythia410m-rm-tldr • Text Classification • Updated Jun 2, 2024 • 11
damienbenveniste/HW2-reward • Text Classification • Updated Jun 14, 2024 • 35
DownwardSpiral33/2c2-reward • Text Classification • Updated Jun 7, 2024 • 15
DownwardSpiral33/2c6-d6-reward • Text Classification • Updated Jun 7, 2024 • 14
DownwardSpiral33/2c2-reward-medium • Text Classification • Updated Jun 7, 2024 • 15
DownwardSpiral33/2c6-reward • Text Classification • Updated Jun 7, 2024 • 15
gsdas/temp_model • Text Classification • Updated Jun 8, 2024 • 26
SiMajid/working • Updated Jul 21, 2024 • 17
RCODI/deberta-v3-large-reward-model • Text Classification • Updated Jun 12, 2024 • 8
just1nseo/reward_modeling_openchat • Updated Jun 12, 2024 • 3
santiviquez/reward_modeling_anthropic_hh • Text Classification • Updated Jun 13, 2024 • 17
mnoukhov/pythia160m-rm-tldr • Text Classification • Updated Jun 18, 2024 • 20