Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control Paper • 2503.14492 • Published 1 day ago • 12
7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting Paper • 2503.07946 • Published 9 days ago • 1
Learning Continuous Mesh Representation with Spherical Implicit Surface Paper • 2301.04695 • Published Jan 11, 2023 • 1
DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering Paper • 2406.02518 • Published Jun 4, 2024 • 1
6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering Paper • 2410.04974 • Published Oct 7, 2024 • 1
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 5 days ago • 108
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo Paper • 2503.09799 • Published 7 days ago • 12
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models Paper • 2503.10437 • Published 6 days ago • 27
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 14 days ago • 81
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning Paper • 2502.19634 • Published 21 days ago • 60
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 27 days ago • 130
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 145
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 124
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 140
PBADet: A One-Stage Anchor-Free Approach for Part-Body Association Paper • 2402.07814 • Published Feb 12, 2024 • 1