Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published 26 days ago • 42
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published 26 days ago • 56
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published about 1 month ago • 30
LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks Paper • 2410.01744 • Published Oct 2 • 25
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models Paper • 2410.14059 • Published 29 days ago • 52
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25 • 101
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published 10 days ago • 43