Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published 29 days ago • 28
CASA Collection CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs • 6 items • Updated 3 days ago • 4
3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework Paper • 2512.17459 • Published 8 days ago • 11
MeshSplatting: Differentiable Rendering with Opaque Meshes Paper • 2512.06818 • Published 19 days ago • 10
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties Paper • 2512.11799 • Published 14 days ago • 29
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction Paper • 2511.23386 • Published 28 days ago • 15
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper • 2512.06065 • Published 21 days ago • 28
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 29 days ago • 213
Real-time Vision Models Collection A collection of real-time detectors. • 19 items • Updated Nov 23 • 22