FeatureBench: Benchmarking Agentic Coding for Complex Feature Development • Paper • 2602.10975 • Published Feb 11 • 19
WizardLMTeam/WizardLM_evol_instruct_V2_196k • Viewer • Updated Mar 10, 2024 • 143k • 2.57k • 246
LLM-Perf Leaderboard • Running • Featured • 585 • Explore LLM performance across hardware configurations
Open LLM Leaderboard • Running on CPU Upgrade • 13.9k • Track, rank and evaluate open LLMs and chatbots
Big Code Models Leaderboard • Running • 1.5k • Explore and submit code model evaluations on a leaderboard