Learnings from Scaling Visual Tokenizers for Reconstruction and Generation Paper • 2501.09755 • Published 18 days ago • 33
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 139
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity Paper • 2412.09856 • Published Dec 13, 2024 • 10