ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation Paper • 2501.06598 • Published 15 days ago • 1
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models Paper • 2501.05767 • Published 16 days ago • 28
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding Paper • 2411.03628 • Published Nov 6, 2024 • 2
Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models Paper • 2308.13437 • Published Aug 25, 2023 • 4
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer Paper • 2412.13871 • Published Dec 18, 2024 • 18
LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models Paper • 2410.09342 • Published Oct 12, 2024 • 38