Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • 2504.02542 • Published 2 days ago • 13
SkyReels-A2: Compose Anything in Video Diffusion Transformers • Paper • 2504.02436 • Published 2 days ago • 18
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation • Paper • 2504.02782 • Published 2 days ago • 39
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models • Paper • 2503.18886 • Published 12 days ago • 20
Boost Your Own Human Image Generation Model via Direct Preference Optimization with AI Feedback • Paper • 2405.20216 • Published May 30, 2024 • 15
Wan: Open and Advanced Large-Scale Video Generative Models • Paper • 2503.20314 • Published 10 days ago • 46
Concat-ID: Towards Universal Identity-Preserving Video Synthesis • Paper • 2503.14151 • Published 18 days ago • 10
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting • Paper • 2503.08677 • Published 25 days ago • 27
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation • Paper • 2503.10618 • Published 23 days ago • 17
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video • Paper • 2503.11647 • Published 22 days ago • 129
FlowTok: Flowing Seamlessly Across Text and Image Tokens • Paper • 2503.10772 • Published 23 days ago • 18
Edit Transfer: Learning Image Editing via Vision In-Context Relations • Paper • 2503.13327 • Published 19 days ago • 28
Personalize Anything for Free with Diffusion Transformer • Paper • 2503.12590 • Published 20 days ago • 42
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation • Paper • 2503.06053 • Published 28 days ago • 136
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era • Paper • 2503.12329 • Published 20 days ago • 24
GHOST 2.0: generative high-fidelity one shot transfer of heads • Paper • 2502.18417 • Published Feb 25 • 65