Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 4 days ago • 65
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors Paper • 2504.01016 • Published 3 days ago • 24
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals Paper • 2503.19953 • Published 10 days ago • 3
MusicInfuser: Making Video Diffusion Listen and Dance Paper • 2503.14505 • Published 17 days ago • 11
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds Paper • 2503.10625 • Published 22 days ago • 28
OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models Paper • 2503.18033 • Published 13 days ago • 23
Can Vision-Language Models Answer Face to Face Questions in the Real-World? Paper • 2503.19356 • Published 11 days ago • 2
FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images Paper • 2503.19207 • Published 11 days ago • 4
DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis Paper • 2503.15667 • Published 16 days ago • 8
PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos Paper • 2503.17973 • Published 13 days ago • 7
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published 17 days ago • 19