Submitted by apryc1 77 One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation · 9 authors 2
Submitted by yangsui 50 Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models · 12 authors 2
Submitted by ZeqiangLai 32 Unleashing Vecset Diffusion Model for Fast Shape Generation · 13 authors 3
Submitted by richardaecn 28 Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning · 45 authors 2
Submitted by zhwang4ai 25 JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse · 5 authors 2
Submitted by MingleiShi 23 DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers · 13 authors 5
Submitted by akhaliq 21 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity · 6 authors 5
Submitted by Huan-WhoRegisteredMyName 19 Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models · 5 authors 3
Submitted by akhaliq 18 Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning · 16 authors 4
Submitted by quyanh 17 Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't · 2 authors 2
Submitted by QizhiPei 17 MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion · 9 authors 2
Submitted by DyrusQZ 14 LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds · 11 authors 2
Submitted by Ningyu 11 CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners · 7 authors 2
Submitted by mathfinder 11 Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts · 7 authors 2
Submitted by akhaliq 8 MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance · 6 authors 2
Submitted by rexleeppp 8 NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes · 3 authors 2
Submitted by zhenglin 7 Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion · 4 authors 2
Submitted by pierrechambon 7 BigO(Bench) -- Can LLMs Generate Code with Controlled Time and Space Complexity? · 4 authors 2
Submitted by kpzhang996 6 CLS-RL: Image Classification with Rule-Based Reinforcement Learning · 5 authors 2
Submitted by lyc0930 6 Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling · 7 authors 2
Submitted by guolinke 5 Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens · 8 authors 2
Submitted by ynhe 5 Make Your Training Flexible: Towards Deployment-Efficient Video Models · 6 authors 2
Submitted by Gofinge 4 Sonata: Self-Supervised Learning of Reliable Point Representations · 10 authors 2
Submitted by c-juhwan 4 See-Saw Modality Balance: See Gradient, and Sew Impaired Vision-Language Balance to Mitigate Dominant Modality Bias · 5 authors 2
Submitted by BestWishYsh 4 MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization · 7 authors 2
Submitted by kpzhang996 3 Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction · 3 authors 2
Submitted by UVSKKR 3 Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content · 3 authors 2
Submitted by HJGO 3 VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling · 6 authors 2
Submitted by lxxiao 3 MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space · 10 authors 2
Submitted by MAGAer13 2 Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning · 5 authors 2
Submitted by ab9mamun 1 AIMI: Leveraging Future Knowledge and Personalization in Sparse Event Forecasting for Treatment Adherence · 3 authors 2
Submitted by Devy1 1 Why Personalizing Deep Learning-Based Code Completion Tools Matters · 3 authors 2
Submitted by Zilence006 1 Where do Large Vision-Language Models Look at when Answering Questions? · 9 authors 2
Submitted by wljungbergh - GASP: Unifying Geometric and Semantic Self-Supervised Pre-training for Autonomous Driving · 9 authors 2