Backend - Open LLM Leaderboard 🏆

FastAPI backend for the Open LLM Leaderboard. This service is part of a larger architecture that includes a React frontend. For complete project installation, see the main README.

✨ Features

  • 📊 REST API for LLM models leaderboard management
  • 🗳️ Voting and ranking system
  • 🔄 HuggingFace Hub integration
  • 🚀 Caching and performance optimizations

🏗 Architecture

flowchart TD
    Client(["**Frontend**<br><br>React Application"]) --> API["**API Server**<br><br>FastAPI REST Endpoints"]

    subgraph Backend
        API --> Core["**Core Layer**<br><br>• Middleware<br>• Cache<br>• Rate Limiting"]
        Core --> Services["**Services Layer**<br><br>• Business Logic<br>• Data Processing"]

        subgraph Services Layer
            Services --> Models["**Model Service**<br><br>• Model Submission<br>• Evaluation Pipeline"]
            Services --> Votes["**Vote Service**<br><br>• Vote Management<br>• Data Synchronization"]
            Services --> Board["**Leaderboard Service**<br><br>• Rankings<br>• Performance Metrics"]
        end

        Models --> Cache["**Cache Layer**<br><br>• In-Memory Store<br>• Auto Invalidation"]
        Votes --> Cache
        Board --> Cache

        Models --> HF["**HuggingFace Hub**<br><br>• Models Repository<br>• Datasets Access"]
        Votes --> HF
        Board --> HF
    end

    style Client fill:#f9f,stroke:#333,stroke-width:2px
    style Models fill:#bbf,stroke:#333,stroke-width:2px
    style Votes fill:#bbf,stroke:#333,stroke-width:2px
    style Board fill:#bbf,stroke:#333,stroke-width:2px
    style HF fill:#bfb,stroke:#333,stroke-width:2px

🛠️ HuggingFace Datasets

The application uses several datasets on the HuggingFace Hub:

1. Requests Dataset ({HF_ORGANIZATION}/requests)

  • Operations:
    • 📤 POST /api/models/submit: Adds a JSON file for each new model submission
    • 📥 GET /api/models/status: Reads files to get the status of each model
  • Format: One JSON file per model with submission details
  • Updates: On each new model submission

2. Votes Dataset ({HF_ORGANIZATION}/votes)

  • Operations:
    • 📤 POST /api/votes/{model_id}: Adds a new vote
    • 📥 GET /api/votes/model/{provider}/{model}: Reads model votes
    • 📥 GET /api/votes/user/{user_id}: Reads user votes
  • Format: JSONL with one vote per line
  • Sync: Bidirectional between local cache and Hub
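
The JSONL format described above stores one JSON object per line, so appending a vote means adding one line. A minimal sketch of that round trip, assuming vote records with `model_id`, `vote_type`, and `user_id` fields (names inferred from the API section of this README, not taken from the actual dataset schema); `append_vote` and `parse_votes` are hypothetical helpers:

```python
import json


def append_vote(jsonl_text: str, vote: dict) -> str:
    """Return the JSONL text with one vote appended as a single line."""
    if jsonl_text and not jsonl_text.endswith("\n"):
        jsonl_text += "\n"
    return jsonl_text + json.dumps(vote, ensure_ascii=False) + "\n"


def parse_votes(jsonl_text: str) -> list:
    """Parse JSONL text into a list of vote dicts, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


text = append_vote("", {"model_id": "org/model", "vote_type": "up", "user_id": "alice"})
text = append_vote(text, {"model_id": "org/model", "vote_type": "down", "user_id": "bob"})
votes = parse_votes(text)
```

Because each record is a complete line, a local copy and the Hub copy can be compared or merged line by line.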

3. Contents Dataset ({HF_ORGANIZATION}/contents)

  • Operations:
    • 📥 GET /api/leaderboard: Reads raw data
    • 📥 GET /api/leaderboard/formatted: Reads and formats data
  • Format: Main dataset containing all scores and metrics
  • Updates: Automatic after model evaluations

4. Official Providers Dataset ({HF_ORGANIZATION}/official-providers)

  • Operations:
    • 📥 Read-only access for highlighted models
  • Format: List of models selected by maintainers
  • Updates: Manual by maintainers

🛠 Local Development

Prerequisites

Standalone Installation (without Docker)

# Install dependencies
poetry install

# Setup configuration
cp .env.example .env

# Start development server
poetry run uvicorn app.asgi:app --host 0.0.0.0 --port 7860 --reload

The server will be available at http://localhost:7860

⚙️ Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| ENVIRONMENT | Environment (development/production) | development |
| HF_TOKEN | HuggingFace authentication token | - |
| PORT | Server port | 7860 |
| LOG_LEVEL | Logging level (INFO/DEBUG/WARNING) | INFO |
| CORS_ORIGINS | Allowed CORS origins | ["*"] |
| CACHE_TTL | Cache Time To Live in seconds | 300 |
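
The variables above could be read into a typed settings object at startup. A minimal sketch using only the standard library; the `Settings` class and `load_settings` function are hypothetical names, and parsing `CORS_ORIGINS` as a JSON list is an assumption based on the default shown in the table, not the backend's confirmed behavior:

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    environment: str
    hf_token: Optional[str]
    port: int
    log_level: str
    cors_origins: List[str]
    cache_ttl: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        environment=env.get("ENVIRONMENT", "development"),
        hf_token=env.get("HF_TOKEN"),  # no default: must be provided by the user
        port=int(env.get("PORT", "7860")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        cors_origins=json.loads(env.get("CORS_ORIGINS", '["*"]')),  # assumed JSON list
        cache_ttl=int(env.get("CACHE_TTL", "300")),
    )


settings = load_settings({"PORT": "8000", "LOG_LEVEL": "debug"})
```

Passing a plain dict instead of `os.environ` keeps the function easy to test.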

🔧 Middleware

The backend uses several middleware layers for optimal performance and security:

  • CORS Middleware: Handles Cross-Origin Resource Sharing
  • GZIP Middleware: Compresses responses larger than 500 bytes
  • Rate Limiting: Prevents API abuse
  • Caching: In-memory caching with automatic invalidation
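
This README does not say which rate-limiting algorithm the backend uses. As an illustration of the idea only, here is a sliding-window limiter sketch; the `SlidingWindowLimiter` class, its parameters, and the per-client key are all assumptions:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=60.0)
first, second, third = (limiter.allow("client-1") for _ in range(3))
```

In a middleware, a `False` result would typically be turned into an HTTP 429 response.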

📝 Logging

The application uses a structured logging system with:

  • Formatted console output
  • Different log levels per component
  • Request/Response logging
  • Performance metrics
  • Error tracking
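
A setup with formatted console output and per-component levels can be sketched with the standard `logging` module. The format string and the logger name `app.services.votes` are illustrative assumptions, not the backend's actual configuration:

```python
import logging

LOG_FORMAT = "%(asctime)s | %(levelname)-8s | %(name)s | %(message)s"


def configure_logging(default_level="INFO", component_levels=None):
    """Set the root log level and optional per-component overrides."""
    # force=True replaces any handlers installed by earlier basicConfig calls.
    logging.basicConfig(level=default_level, format=LOG_FORMAT, force=True)
    for name, level in (component_levels or {}).items():
        logging.getLogger(name).setLevel(level)


# Root stays at INFO while one (hypothetical) component logs at DEBUG.
configure_logging("INFO", {"app.services.votes": "DEBUG"})
logging.getLogger("app.services.votes").debug("vote cache refreshed")
```

The default level could come from the `LOG_LEVEL` variable listed in the configuration section.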

📁 File Structure

backend/
├── app/                  # Source code
│   ├── api/              # Routes and endpoints
│   │   └── endpoints/    # Endpoint handlers
│   ├── core/             # Configurations
│   ├── services/         # Business logic
│   └── utils/            # Utilities
└── tests/                # Tests

📚 API

Swagger documentation is available at http://localhost:7860/docs

Main Endpoints & Data Structures

Leaderboard

  • GET /api/leaderboard/formatted - Formatted data with computed fields and metadata

    Response {
      models: [{
        id: string,  // eval_name
        model: {
          name: string,  // fullname
          sha: string,  // Model sha
          precision: string,  // e.g. "fp16", "int8"
          type: string,  // e.g. "fine-tuned-on-domain-specific-dataset"
          weight_type: string,
          architecture: string,
          average_score: number,
          has_chat_template: boolean
        },
        evaluations: {
          ifeval: {
            name: "IFEval",
            value: number,  // Raw score
            normalized_score: number
          },
          bbh: {
            name: "BBH",
            value: number,
            normalized_score: number
          },
          math: {
            name: "MATH Level 5",
            value: number,
            normalized_score: number
          },
          gpqa: {
            name: "GPQA",
            value: number,
            normalized_score: number
          },
          musr: {
            name: "MUSR",
            value: number,
            normalized_score: number
          },
          mmlu_pro: {
            name: "MMLU-PRO",
            value: number,
            normalized_score: number
          }
        },
        features: {
          is_not_available_on_hub: boolean,
          is_merged: boolean,
          is_moe: boolean,
          is_flagged: boolean,
          is_official_provider: boolean
        },
        metadata: {
          upload_date: string,
          submission_date: string,
          generation: string,
          base_model: string,
          hub_license: string,
          hub_hearts: number,
          params_billions: number,
          co2_cost: number  // CO₂ cost in kg
        }
      }]
    }
    
  • GET /api/leaderboard - Raw data from the HuggingFace dataset

    Response {
      models: [{
        eval_name: string,
        Precision: string,
        Type: string,
        "Weight type": string,
        Architecture: string,
        Model: string,
        fullname: string,
        "Model sha": string,
        "Average ⬆️": number,
        "Hub License": string,
        "Hub ❤️": number,
        "#Params (B)": number,
        "Available on the hub": boolean,
        Merged: boolean,
        MoE: boolean,
        Flagged: boolean,
        "Chat Template": boolean,
        "CO₂ cost (kg)": number,
        "IFEval Raw": number,
        IFEval: number,
        "BBH Raw": number,
        BBH: number,
        "MATH Lvl 5 Raw": number,
        "MATH Lvl 5": number,
        "GPQA Raw": number,
        GPQA: number,
        "MUSR Raw": number,
        MUSR: number,
        "MMLU-PRO Raw": number,
        "MMLU-PRO": number,
        "Maintainer's Highlight": boolean,
        "Upload To Hub Date": string,
        "Submission Date": string,
        Generation: string,
        "Base Model": string
      }]
    }
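
The two response shapes above suggest a column-by-column mapping from a raw row to a formatted entry. A sketch of that mapping, with `format_row` as a hypothetical helper; it assumes the `... Raw` columns feed `value`, the plain benchmark columns feed `normalized_score`, and `"Maintainer's Highlight"` maps to `is_official_provider`. The `metadata` block is omitted for brevity:

```python
# Key in the formatted response -> (display name, column name in the raw row)
BENCHMARKS = {
    "ifeval": ("IFEval", "IFEval"),
    "bbh": ("BBH", "BBH"),
    "math": ("MATH Level 5", "MATH Lvl 5"),
    "gpqa": ("GPQA", "GPQA"),
    "musr": ("MUSR", "MUSR"),
    "mmlu_pro": ("MMLU-PRO", "MMLU-PRO"),
}


def format_row(raw: dict) -> dict:
    """Convert one raw leaderboard row into the formatted shape (no metadata)."""
    return {
        "id": raw.get("eval_name"),
        "model": {
            "name": raw.get("fullname"),
            "sha": raw.get("Model sha"),
            "precision": raw.get("Precision"),
            "type": raw.get("Type"),
            "weight_type": raw.get("Weight type"),
            "architecture": raw.get("Architecture"),
            "average_score": raw.get("Average ⬆️"),
            "has_chat_template": raw.get("Chat Template", False),
        },
        "evaluations": {
            key: {
                "name": display,
                "value": raw.get(f"{column} Raw"),
                "normalized_score": raw.get(column),
            }
            for key, (display, column) in BENCHMARKS.items()
        },
        "features": {
            # The raw flag is positive ("Available"), the formatted one is negated.
            "is_not_available_on_hub": not raw.get("Available on the hub", False),
            "is_merged": raw.get("Merged", False),
            "is_moe": raw.get("MoE", False),
            "is_flagged": raw.get("Flagged", False),
            "is_official_provider": raw.get("Maintainer's Highlight", False),
        },
    }


entry = format_row({
    "eval_name": "org_model_fp16",
    "fullname": "org/model",
    "IFEval Raw": 0.5,
    "IFEval": 42.0,
    "Available on the hub": True,
})
```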
    

Models

  • GET /api/models/status - Get all models grouped by status

    Response {
      pending: [{
        name: string,
        submitter: string,
        revision: string,
        wait_time: string,
        submission_time: string,
        status: "PENDING" | "EVALUATING" | "FINISHED",
        precision: string
      }],
      evaluating: Array<Model>,
      finished: Array<Model>
    }
    
  • GET /api/models/pending - Get pending models only

  • POST /api/models/submit - Submit model

    Request {
      user_id: string,
      model_id: string,
      base_model?: string,
      precision?: string,
      model_type: string
    }
    
    Response {
      status: string,
      message: string
    }
    
  • GET /api/models/{model_id}/status - Get model status
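
The grouped response of GET /api/models/status can be pictured as a simple bucketing of model records by their `status` field. A minimal sketch, with `group_by_status` as a hypothetical helper; how the backend actually derives these groups from the requests dataset is not described here:

```python
def group_by_status(models: list) -> dict:
    """Bucket model records into pending / evaluating / finished lists."""
    groups = {"pending": [], "evaluating": [], "finished": []}
    for model in models:
        bucket = model["status"].lower()  # "PENDING" -> "pending"
        if bucket in groups:  # ignore statuses this sketch does not know
            groups[bucket].append(model)
    return groups


grouped = group_by_status([
    {"name": "org/a", "status": "PENDING"},
    {"name": "org/b", "status": "FINISHED"},
    {"name": "org/c", "status": "EVALUATING"},
    {"name": "org/d", "status": "PENDING"},
])
```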

Votes

  • POST /api/votes/{model_id} - Vote

    Request {
      vote_type: "up" | "down",
      user_id: string  // HuggingFace username
    }
    
    Response {
      success: boolean,
      message: string
    }
    
  • GET /api/votes/model/{provider}/{model} - Get model votes

    Response {
      total_votes: number,
      up_votes: number,
      down_votes: number
    }
    
  • GET /api/votes/user/{user_id} - Get user votes

    Response Array<{
      model_id: string,
      vote_type: string,
      timestamp: string
    }>
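
The per-model counts returned by GET /api/votes/model/{provider}/{model} can be derived from individual vote records. A sketch assuming each record has the `model_id` and `vote_type` fields shown above; `tally_votes` is a hypothetical helper, not the backend's function:

```python
from collections import Counter


def tally_votes(votes: list, model_id: str) -> dict:
    """Count up and down votes for one model."""
    counts = Counter(v["vote_type"] for v in votes if v["model_id"] == model_id)
    return {
        "total_votes": counts["up"] + counts["down"],
        "up_votes": counts["up"],
        "down_votes": counts["down"],
    }


sample_votes = [
    {"model_id": "org/model", "vote_type": "up", "timestamp": "2024-01-01T00:00:00Z"},
    {"model_id": "org/model", "vote_type": "down", "timestamp": "2024-01-02T00:00:00Z"},
    {"model_id": "org/model", "vote_type": "up", "timestamp": "2024-01-03T00:00:00Z"},
    {"model_id": "org/other", "vote_type": "up", "timestamp": "2024-01-04T00:00:00Z"},
]
summary = tally_votes(sample_votes, "org/model")
```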
    

🔒 Authentication

The backend uses HuggingFace token-based authentication for secure API access. Make sure to:

  1. Set your HF_TOKEN in the .env file
  2. Include the token in API requests via Bearer authentication
  3. Keep your token secure and never commit it to version control
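
The steps above can be sketched on the client side as follows. The example builds a request with a Bearer `Authorization` header using only the standard library and does not send it; `build_request` is a hypothetical helper, and reading the token from the `HF_TOKEN` environment variable is an assumption about how a client would be configured:

```python
import os
import urllib.request


def build_request(url: str, token: str = None) -> urllib.request.Request:
    """Create a request carrying the token as a Bearer Authorization header."""
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# "hf_example" is a placeholder, not a real token.
request = build_request("http://localhost:7860/api/models/status", token="hf_example")
```

Reading the token from the environment rather than hard-coding it keeps it out of version control.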

🚀 Performance

The backend implements several optimizations:

  • In-memory caching with configurable TTL (Time To Live)
  • Batch processing for model evaluations
  • Rate limiting for API endpoints
  • Efficient database queries with proper indexing
  • Automatic cache invalidation for votes
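
To illustrate how a TTL-based in-memory cache with explicit invalidation can work, here is a minimal sketch. The `TTLCache` class and the cache key are hypothetical; the default of 300 seconds mirrors the documented `CACHE_TTL` default, and nothing here is taken from the backend's actual cache implementation:

```python
import time


class TTLCache:
    """In-memory cache whose entries expire `ttl` seconds after being set."""

    def __init__(self, ttl: float = 300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so tests can control time
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def invalidate(self, key):
        """Drop an entry immediately, e.g. after a new vote is recorded."""
        self._store.pop(key, None)


cache = TTLCache(ttl=300.0)
cache.set("votes:org/model", {"up_votes": 1})
cache.invalidate("votes:org/model")
after_invalidation = cache.get("votes:org/model")
```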