ActionStudio: A Lightweight Framework for Data and Training of Large Action Models Paper • 2503.22673 • Published 11 days ago • 12
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases Paper • 2406.10290 • Published Jun 12, 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 89
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets Paper • 2406.18518 • Published Jun 26, 2024 • 25
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks Paper • 2401.05507 • Published Jan 10, 2024 • 1
XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence Paper • 2206.08474 • Published Jun 16, 2022