Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published 7 days ago • 28
JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization Paper • 2503.23377 • Published 11 days ago • 49
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published 7 days ago • 32
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 10 days ago • 72
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance Paper • 2504.01724 • Published 8 days ago • 60
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors Paper • 2504.01016 • Published 9 days ago • 28
ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning Paper • 2503.21860 • Published 14 days ago • 3
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 11 days ago • 109
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 11 days ago • 90
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 29 days ago • 68
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency Paper • 2503.20785 • Published 15 days ago • 20
DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis Paper • 2503.15667 • Published 21 days ago • 8
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Paper • 2503.13444 • Published 24 days ago • 15
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Paper • 2503.20776 • Published 15 days ago • 8
Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published 20 days ago • 34
TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting Paper • 2503.17032 • Published 20 days ago • 23