WPO: Enhancing RLHF with Weighted Preference Optimization Paper • 2406.11827 • Published Jun 17, 2024 • 17
trl-internal-testing/tiny-Qwen3VLForConditionalGeneration Image-to-Text • 3.43M • Updated 4 days ago • 10.2k
trl-internal-testing/tiny-Qwen2_5_VLForConditionalGeneration Image-to-Text • 3.86M • Updated 4 days ago • 192k
trl-internal-testing/tiny-Qwen2VLForConditionalGeneration Image-to-Text • 3.54M • Updated 4 days ago • 45.9k
trl-internal-testing/tiny-LlavaNextForConditionalGeneration Image-to-Text • 2.7M • Updated 3 days ago • 47.7k
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • 9 days ago • 74