Deliberation in Latent Space via Differentiable Cache Augmentation • Paper • 2412.17747 • Published 1 day ago • 14
PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published 20 days ago • 118
Mimir: Improving Video Diffusion Models for Precise Text Understanding • Paper • 2412.03085 • Published 20 days ago • 12
ShowUI: One Vision-Language-Action Model for GUI Visual Agent • Paper • 2411.17465 • Published 28 days ago • 76