MotiF: Making Text Count in Image Animation with Motion Focal Loss Paper • 2412.16153 • Published 7 days ago • 5 • 2
Vamos: Versatile Action Models for Video Understanding Paper • 2311.13627 • Published Nov 22, 2023 • 2
Goal-Conditioned Predictive Coding as an Implicit Planner for Offline Reinforcement Learning Paper • 2307.03406 • Published Jul 7, 2023 • 1
Do Pre-trained Vision-Language Models Encode Object States? Paper • 2409.10488 • Published Sep 16 • 1
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? Paper • 2307.16368 • Published Jul 31, 2023 • 11