Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 14 days ago • 51
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Paper • 2412.07720 • Published 15 days ago • 30
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems Paper • 2402.14008 • Published Feb 21
GUICourse: From General Vision Language Models to Versatile GUI Agents Paper • 2406.11317 • Published Jun 17 • 1
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback Paper • 2309.10691 • Published Sep 19, 2023 • 4
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets Paper • 2309.17428 • Published Sep 29, 2023 • 1
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness Paper • 2405.17220 • Published May 27 • 1