Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration Paper • 2502.17110 • Published Feb 24 • 13
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published Feb 24 • 18
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models Paper • 2502.16033 • Published Feb 22 • 18
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning Paper • 2502.17407 • Published Feb 24 • 26
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published Feb 24 • 29
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models Paper • 2502.16614 • Published Feb 23 • 27
GCC: Generative Color Constancy via Diffusing a Color Checker Paper • 2502.17435 • Published Feb 24 • 28
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space Paper • 2503.09419 • Published Mar 12 • 6
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary Paper • 2503.09402 • Published Mar 12 • 7
Quantizing Large Language Models for Code Generation: A Differentiated Replication Paper • 2503.07103 • Published Mar 10 • 8
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG Paper • 2503.04388 • Published Mar 6 • 16
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling Paper • 2503.09601 • Published Mar 12 • 15
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published Mar 11 • 17
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation Paper • 2503.09151 • Published Mar 12 • 32
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 71