ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning Paper • 2503.22738 • Published 13 days ago • 14
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published 5 days ago • 52
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published 21 days ago • 19
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published 21 days ago • 19
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation Paper • 2503.14905 • Published 21 days ago • 19 • 3
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding Paper • 2503.13964 • Published 22 days ago • 18
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More Paper • 2502.11494 • Published Feb 17
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published 20 days ago • 20
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published 20 days ago • 20
LEGION: Learning to Ground and Explain for Synthetic Image Detection Paper • 2503.15264 • Published 20 days ago • 20 • 2
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 22
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 22
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception Paper • 2410.12628 • Published Oct 16, 2024 • 37
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17, 2024 • 52
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5, 2024 • 57