VideoPrism: A Foundational Visual Encoder for Video Understanding Paper • 2402.13217 • Published Feb 20, 2024 • 23
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27, 2024 • 190
MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies Paper • 2403.01422 • Published Mar 3, 2024 • 26
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13, 2024 • 37
Valley: Video Assistant with Large Language model Enhanced abilitY Paper • 2306.07207 • Published Jun 12, 2023 • 2
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models Paper • 2306.05424 • Published Jun 8, 2023 • 7