SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue Paper • 2605.30993 • Published 9 days ago • 56
Bernini: Latent Semantic Planning for Video Diffusion Paper • 2605.22344 • Published 17 days ago • 17
InfiniteYou-FLUX 📸: Flexible Photo Recrafting While Preserving Your Identity • Running on Zero • Agents • Featured • 1.1k
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published Apr 13 • 72
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing Paper • 2603.28713 • Published Mar 30 • 23
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images Paper • 2603.02210 • Published Mar 2 • 30