DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 17 days ago • 112
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills Paper • 2503.12533 • Published 19 days ago • 61
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding Paper • 2503.10596 • Published 22 days ago • 18
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 25 days ago • 95
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge Paper • 2411.19799 • Published Nov 29, 2024 • 14
Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text Paper • 2211.11300 • Published Nov 21, 2022 • 1
Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning Paper • 2503.07002 • Published 26 days ago • 38
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models Paper • 2502.06788 • Published Feb 10 • 12
Taming Teacher Forcing for Masked Autoregressive Video Generation Paper • 2501.12389 • Published Jan 21 • 10
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario Paper • 2501.10132 • Published Jan 17 • 20