Chapter-Llama Models

This repository contains the model checkpoints used in the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs" (CVPR 2025).

Models Overview

Chapter-Llama is Llama-3.1-8B-Instruct fine-tuned with LoRA adapters. We provide three main model variants:

  1. asr-10k: Model trained with ASR from 10k videos of the VidChapters-7M dataset

    • Used for our Speech-based frame selector
    • Input: Only speech transcripts with timestamps
  2. captions_asr-10k: Model trained with Captions+ASR from 10k videos

    • Our primary model used for most experiments
    • Input: Both speech transcripts and visual captions with timestamps
  3. captions_asr-1k: Model trained with Captions+ASR from 1k videos

    • Smaller training set variant
    • Input: Both speech transcripts and visual captions with timestamps
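The three variants above differ in two ways: whether the input includes visual captions, and how many training videos were used. As a minimal sketch, that can be written as a small lookup table. The `Variant` class and `pick_variant` helper are hypothetical names introduced for illustration and are not part of the Chapter-Llama codebase; the sketch also assumes speech transcripts are always available, since every variant listed takes them as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str            # checkpoint name as listed above
    uses_captions: bool  # True if the input includes visual captions
    train_videos: int    # number of training videos


# The three variants described in the list above.
VARIANTS = (
    Variant("asr-10k", uses_captions=False, train_videos=10_000),
    Variant("captions_asr-10k", uses_captions=True, train_videos=10_000),
    Variant("captions_asr-1k", uses_captions=True, train_videos=1_000),
)


def pick_variant(have_captions: bool) -> Variant:
    """Return the matching variant trained on the most videos (hypothetical helper)."""
    candidates = [v for v in VARIANTS if v.uses_captions == have_captions]
    return max(candidates, key=lambda v: v.train_videos)
```

For example, `pick_variant(have_captions=True).name` selects `captions_asr-10k`, the primary model described above.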

Model Performance

Our best model achieves an F1 score of 45.3 on the VidChapters-7M benchmark, substantially outperforming previous state-of-the-art methods.

Usage

The models can be downloaded and used with the Chapter-Llama codebase:

# Download model LoRA adapters
python tools/download/models.py "asr-10k" --local_dir "."
python tools/download/models.py "captions_asr-10k" --local_dir "."
python tools/download/models.py "captions_asr-1k" --local_dir "."

# Inference on a single video
python inference.py /path/to/your/video.mp4
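To fetch all three adapters in one step, the download commands above can be wrapped in a short script. This is a sketch that assumes it is run from the root of the Chapter-Llama codebase, where `tools/download/models.py` lives; the `download_command` and `download_all` helpers are hypothetical names, and using `sys.executable` in place of `python` is a choice made here so that the current interpreter is reused.

```python
import subprocess
import sys

# Variant names taken from the download commands above.
VARIANTS = ["asr-10k", "captions_asr-10k", "captions_asr-1k"]


def download_command(variant: str, local_dir: str = ".") -> list[str]:
    """Build the argument list for the download script shown above."""
    return [sys.executable, "tools/download/models.py", variant, "--local_dir", local_dir]


def download_all(local_dir: str = ".") -> None:
    """Run the download script once per variant, stopping at the first failure."""
    for variant in VARIANTS:
        subprocess.run(download_command(variant, local_dir), check=True)
```

Calling `download_all()` from the repository root should be equivalent to running the three download commands by hand.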

Model Architecture

  • Base model: Llama-3.1-8B-Instruct
  • Adaptation: LoRA fine-tuning
  • Input format: Text tokens representing ASR and/or frame captions with timestamps
  • Output format: Timestamps for chapter boundaries and free-form chapter titles
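The input and output formats above can be illustrated with a small sketch. The exact prompt layout used by Chapter-Llama is not specified here, so the `HH:MM:SS ASR: ...` and `HH:MM:SS Caption: ...` line format, the accepted output line shape, and the helper names `to_hms`, `build_input`, and `parse_chapters` are all assumptions made for illustration.

```python
import re


def to_hms(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def build_input(asr, captions=()):
    """Interleave ASR segments and frame captions by timestamp.

    asr, captions: iterables of (start_seconds, text) pairs.
    The line layout is an assumed format, not the official prompt.
    """
    items = [(t, "ASR", text) for t, text in asr]
    items += [(t, "Caption", text) for t, text in captions]
    items.sort(key=lambda item: item[0])  # stable: ASR stays before Caption on ties
    return "\n".join(f"{to_hms(t)} {kind}: {text}" for t, kind, text in items)


# Assumed output line shape: a timestamp, a separator, then a free-form title.
CHAPTER_LINE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})[:\s-]+(.+)$")


def parse_chapters(output: str):
    """Parse lines such as '00:01:15 Setup' into (start_seconds, title) pairs."""
    chapters = []
    for line in output.splitlines():
        match = CHAPTER_LINE.match(line.strip())
        if match:
            hours, minutes, secs, title = match.groups()
            start = int(hours) * 3600 + int(minutes) * 60 + int(secs)
            chapters.append((start, title.strip()))
    return chapters
```

Lines that do not start with a timestamp are skipped by `parse_chapters`, so stray text in the model output does not produce spurious chapters.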

Citation

If you use these models in your work, please cite our paper:

@inproceedings{ventura25chapter,
    title     = {{Chapter-Llama}: Efficient Chaptering in Hour-Long Videos with {LLM}s},
    author    = {Lucas Ventura and Antoine Yang and Cordelia Schmid and G{\"u}l Varol},
    booktitle = {CVPR},
    year      = {2025}
}

License

These models are distributed under the MIT License. Please check the repository for more details.
