Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models Paper • 2503.21380 • Published 26 days ago • 37
FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis Paper • 2503.13265 • Published Mar 17 • 15
An Empirical Study on Eliciting and Improving R1-like Reasoning Models Paper • 2503.04548 • Published Mar 6 • 8
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Paper • 2503.05592 • Published Mar 7 • 25
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published Jan 3 • 34
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published Dec 19, 2024 • 74
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain Paper • 2412.13018 • Published Dec 17, 2024 • 42
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation Paper • 2410.09584 • Published Oct 12, 2024 • 49
Very Large-Scale Multi-Agent Simulation in AgentScope Paper • 2407.17789 • Published Jul 25, 2024 • 34