When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning Paper • 2504.01005 • Published 7 days ago • 15
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement Paper • 2503.17352 • Published 18 days ago • 21
STIV: Scalable Text and Image Conditioned Video Generation Paper • 2412.07730 • Published Dec 10, 2024 • 74
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory Paper • 2410.10813 • Published Oct 14, 2024 • 11
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models Paper • 2410.05269 • Published Oct 7, 2024 • 3
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding Paper • 2406.09411 • Published Jun 13, 2024 • 20
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Paper • 2403.14624 • Published Mar 21, 2024 • 53
Article Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes? By rohan598 and 4 others • Mar 5, 2024 • 4
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Paper • 2401.13311 • Published Jan 24, 2024 • 11
TrustLLM: Trustworthiness in Large Language Models Paper • 2401.05561 • Published Jan 10, 2024 • 70
VideoCon: Robust Video-Language Alignment via Contrast Captions Paper • 2311.10111 • Published Nov 15, 2023 • 9
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs Paper • 2311.05657 • Published Nov 9, 2023 • 32
RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment Paper • 2307.12950 • Published Jul 24, 2023 • 10
DesCo: Learning Object Recognition with Rich Language Descriptions Paper • 2306.14060 • Published Jun 24, 2023 • 1