SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 11 days ago • 161
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding Paper • 2503.17827 • Published 27 days ago • 8
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Paper • 2503.20776 • Published 23 days ago • 8
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published Feb 20 • 13
Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model Paper • 2410.13882 • Published Oct 3, 2024
MiRAGeNews: Multimodal Realistic AI-Generated News Detection Paper • 2410.09045 • Published Oct 11, 2024 • 4
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models Paper • 2502.01639 • Published Feb 3 • 25
Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings Paper • 2308.00862 • Published Aug 1, 2023
D2PO: Discriminator-Guided DPO with Response Evaluation Models Paper • 2405.01511 • Published May 2, 2024
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Paper • 2406.09279 • Published Jun 13, 2024 • 3
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs Paper • 2406.18495 • Published Jun 26, 2024 • 13
Towards a Framework for Openness in Foundation Models: Proceedings from the Columbia Convening on Openness in Artificial Intelligence Paper • 2405.15802 • Published May 17, 2024
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25, 2024 • 114