REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 21 days ago • 87
Reward Bench Collection • Datasets, spaces, and models for the reward model benchmark! • 5 items • Updated 18 days ago • 9
Article • Accelerated Inference with Optimum and Transformers Pipelines • May 10, 2022 • 2