EasyV2V: A High-quality Instruction-based Video Editing Framework Paper • 2512.16920 • Published 11 days ago • 17
Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization Paper • 2512.10955 • Published 18 days ago • 6
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper • 2512.06065 • Published 24 days ago • 28
Omni-ID: Holistic Identity Representation Designed for Generative Tasks Paper • 2412.09694 • Published Dec 12, 2024
Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26 • 35
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas Paper • 2510.20820 • Published Oct 23 • 10
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale Paper • 2509.14008 • Published Sep 17 • 88
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets Paper • 2505.07747 • Published May 12 • 61
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 38
Wonderland: Navigating 3D Scenes from a Single Image Paper • 2412.12091 • Published Dec 16, 2024 • 16
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers Paper • 2411.18673 • Published Nov 27, 2024 • 8
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 97