Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers Paper • 2501.03931 • Published 5 days ago • 11
Mogo: RQ Hierarchical Causal Transformer for High-Quality 3D Human Motion Generation Paper • 2412.07797 • Published Dec 5, 2024 • 11
StyleMaster: Stylize Your Video with Artistic Generation and Translation Paper • 2412.07744 • Published Dec 10, 2024 • 19
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations Paper • 2412.08580 • Published Dec 11, 2024 • 45
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion Paper • 2412.09593 • Published Dec 12, 2024 • 18
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale Paper • 2412.06699 • Published Dec 9, 2024 • 11
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters Paper • 2412.00174 • Published Nov 29, 2024 • 22
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM Paper • 2411.04954 • Published Nov 7, 2024 • 8
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images Paper • 2411.05738 • Published Nov 8, 2024 • 14
KMM: Key Frame Mask Mamba for Extended Motion Generation Paper • 2411.06481 • Published Nov 10, 2024 • 4
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Paper • 2411.07199 • Published Nov 11, 2024 • 46
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14, 2024 • 63
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement Paper • 2411.06558 • Published Nov 10, 2024 • 34