InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published 13 days ago • 90
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences Paper • 2412.01292 • Published 23 days ago • 11
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images Paper • 2412.03517 • Published 21 days ago • 18
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion Paper • 2412.04462 • Published 20 days ago • 7
2DGS-Room: Seed-Guided 2D Gaussian Splatting with Geometric Constrains for High-Fidelity Indoor Scene Reconstruction Paper • 2412.03428 • Published 21 days ago • 10
PanoDreamer: 3D Panorama Synthesis from a Single Image Paper • 2412.04827 • Published 19 days ago • 10
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published 19 days ago • 121
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published 15 days ago • 49
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation Paper • 2402.08682 • Published Feb 13 • 12