-
MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 30 -
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 107 -
Xmodel-2 Technical Report
Paper • 2412.19638 • Published • 27 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 101
Collections
Discover the best community collections!
Collections including paper arxiv:2503.21460
-
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition
Paper • 2503.21248 • Published • 19 -
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
Paper • 2503.21380 • Published • 36 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 73
-
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 29 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 46 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 49
-
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
Paper • 2503.07536 • Published • 84 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 73 -
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
Paper • 2504.01990 • Published • 232
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 131 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 117 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 44 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 25
-
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Paper • 2502.14499 • Published • 190 -
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Paper • 2502.14739 • Published • 103 -
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
Paper • 2502.14502 • Published • 89 -
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
Paper • 2502.14282 • Published • 20
-
Phi-4 Technical Report
Paper • 2412.08905 • Published • 116 -
Evaluating and Aligning CodeLLMs on Human Preference
Paper • 2412.05210 • Published • 51 -
Evaluating Language Models as Synthetic Data Generators
Paper • 2412.03679 • Published • 49 -
Yi-Lightning Technical Report
Paper • 2412.01253 • Published • 29
-
Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts
Paper • 2409.13449 • Published • 11 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 73 -
Unified Multimodal Discrete Diffusion
Paper • 2503.20853 • Published • 8 -
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
Paper • 2503.11576 • Published • 92
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 35 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 28 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 126 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 23