Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment Paper • 2412.19326 • Published 14 days ago • 18
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Paper • 2403.15377 • Published Mar 22, 2024 • 22