CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published 5 days ago • 86
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing Paper • 2503.13434 • Published Mar 17 • 26
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 143
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning Paper • 2502.13144 • Published Feb 18 • 40
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 35 • 3
LeC$^2$O-NeRF: Learning Continuous and Compact Large-Scale Occupancy for Urban Scenes Paper • 2411.11374 • Published Nov 18, 2024
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 35 • 3
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 35
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12 • 35