DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? Paper • 2409.07703 • Published Sep 12, 2024 • 67
FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models Paper • 2311.01477 • Published Nov 2, 2023 • 1