AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation Paper • 2412.15191 • Published 7 days ago • 5 • 2
Mind the Time: Temporally-Controlled Multi-Event Video Generation Paper • 2412.05263 • Published 20 days ago • 10
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion Paper • 2412.04462 • Published 21 days ago • 7
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers Paper • 2411.18673 • Published 29 days ago • 8
VIMI: Grounding Video Generation through Multi-modal Instruction Paper • 2407.06304 • Published Jul 8 • 9
Hierarchical Patch Diffusion Models for High-Resolution Video Generation Paper • 2406.07792 • Published Jun 12 • 13
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models Paper • 2406.07472 • Published Jun 11 • 11
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement Paper • 2406.05649 • Published Jun 9 • 8
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Paper • 2402.19479 • Published Feb 29 • 32
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis Paper • 2402.14797 • Published Feb 22 • 19
Diffusion Priors for Dynamic View Synthesis from Monocular Videos Paper • 2401.05583 • Published Jan 10 • 8
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion Paper • 2310.08579 • Published Oct 12, 2023 • 15