Generative AI Act II: Test Time Scaling Drives Cognition Engineering Paper • 2504.13828 • Published 4 days ago • 13
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published 19 days ago • 30
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios Paper • 2307.13528 • Published Jul 25, 2023
Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate Paper • 2401.16788 • Published Jan 30, 2024 • 1
Align on the Fly: Adapting Chatbot Behavior to Established Norms Paper • 2312.15907 • Published Dec 26, 2023 • 1