MM-RLHF: The Next Step Forward in Multimodal LLM Alignment • Paper • arXiv:2502.10391 • Published 27 days ago
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging • Paper • arXiv:2407.01470 • Published Jul 1, 2024