Article: Illustrating Reinforcement Learning from Human Feedback (RLHF) (Dec 9, 2022)
Paper: Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament (arXiv:2501.13007, published Jan 22, 2025)
Paper: RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style (arXiv:2410.16184, published Oct 21, 2024)