---
title: PoCLeaderboard
emoji: 🏆
colorFrom: green
colorTo: pink
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
license: mit
short_description: Example Leaderboard
---

This Space provides an interactive leaderboard for comparing language model performance across standard benchmarks and custom tasks.

## Features

- Automated model evaluation using lm-evaluation-harness
- Support for standard and custom benchmarks
- Interactive visualization of results
- Daily automated evaluations
- Easy submission of new models and custom tasks

## Usage

1. Visit the Space to view the current leaderboard
2. Submit new models for evaluation
3. Create custom evaluation tasks
4. Track performance trends over time

## Custom Task Format

```json
{
  "examples": [
    {
      "input": "question or prompt",
      "ideal": "expected answer",
      "metrics": ["accuracy", "f1"]
    }
  ]
}
```
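
Before submitting a custom task, it can help to check the file locally against the format above. The following is a minimal sketch, not the Space's own validation code: the helper name `validate_custom_task` is hypothetical, and it assumes that every example needs `input` and `ideal` as strings and `metrics` as a list of strings. It does not check which metric names the Space accepts, since only `accuracy` and `f1` appear in the example.

```python
import json

# Field names taken from the custom task format shown above.
REQUIRED_FIELDS = ("input", "ideal", "metrics")


def validate_custom_task(raw: str) -> list[str]:
    """Return a list of problems found in a custom task JSON string.

    Hypothetical helper for illustration only. An empty list means the
    document matches the assumed shape.
    """
    try:
        task = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(task, dict):
        return ["top-level value must be an object"]

    examples = task.get("examples")
    if not isinstance(examples, list) or not examples:
        return ["'examples' must be a non-empty list"]

    problems = []
    for i, example in enumerate(examples):
        if not isinstance(example, dict):
            problems.append(f"examples[{i}] must be an object")
            continue
        # Report every missing field so the author can fix them in one pass.
        for field in REQUIRED_FIELDS:
            if field not in example:
                problems.append(f"examples[{i}] is missing '{field}'")
        # Assumption: prompts and expected answers are plain strings.
        for field in ("input", "ideal"):
            if field in example and not isinstance(example[field], str):
                problems.append(f"examples[{i}].{field} must be a string")
        # Assumption: metrics is a list of metric names.
        if "metrics" in example:
            metrics = example["metrics"]
            if not isinstance(metrics, list) or not all(
                isinstance(m, str) for m in metrics
            ):
                problems.append(f"examples[{i}].metrics must be a list of strings")
    return problems
```

Calling `validate_custom_task(open("my_task.json").read())` on a well-formed file returns an empty list; any other result lists what to fix before submission. The file name here is only a placeholder.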