R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model • Paper • 2503.05132 • Published 7 days ago • 47
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts • Paper • 2502.20395 • Published 14 days ago • 44
BenTo: Benchmark Task Reduction with In-Context Transferability • Paper • 2410.13804 • Published Oct 17, 2024 • 20
Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA • Paper • 2410.06524 • Published Oct 9, 2024 • 4
CAIMIRA Paper & Data • Collection • Question answering datasets of Quizbowl questions and their progressive clues from various competitions. • 5 items • Updated Nov 9, 2024 • 1
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models • Paper • 2404.18796 • Published Apr 29, 2024 • 69