MV Adapter I2MV Anime: Generate 768x768 multi-view images using an anime-style model Space • Running on Zero • 85
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE Paper • 2311.02684 • Published Nov 5, 2023
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy Paper • 2203.07845 • Published Mar 15, 2022
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control Paper • 2403.12037 • Published Mar 18, 2024 • 1
Assessment of Multimodal Large Language Models in Alignment with Human Values Paper • 2403.17830 • Published Mar 26, 2024
From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation Paper • 2404.15267 • Published Apr 23, 2024 • 4
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion Paper • 2406.03184 • Published Jun 5, 2024 • 19
WorldSimBench: Towards Video Generation Models as World Simulators Paper • 2410.18072 • Published Oct 23, 2024 • 18
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation Paper • 2412.03558 • Published Dec 2024 • 14
MV-Adapter: Multi-view Consistent Image Generation Made Easy Paper • 2412.03632 • Published Dec 2024 • 22
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection Paper • 2412.04455 • Published Dec 2024 • 35
CityDreamer: Compositional Generative Model of Unbounded 3D Cities Paper • 2309.00610 • Published Sep 1, 2023 • 18
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis Paper • 2103.05630 • Published Mar 9, 2021
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark Paper • 2306.06687 • Published Jun 11, 2023 • 1
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud Paper • 2303.14408 • Published Mar 25, 2023