Clip Alignment With Language

This folder contains the CAL model described in the paper:

@article{Escorcia2019TemporalLO,
  title={Temporal Localization of Moments in Video Collections with Natural Language},
  author={Victor Escorcia and Mattia Soldan and Josef Sivic and Bernard Ghanem and Bryan Russell},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.12763}
}

It also resembles the MCN model described in:

@article{Hendricks2017LocalizingMI,
  title={Localizing Moments in Video with Natural Language},
  author={Lisa Anne Hendricks and Oliver Wang and Eli Shechtman and Josef Sivic and Trevor Darrell and Bryan C. Russell},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={5804-5813}
}

Disclaimer: This code was implemented by Jie Lei for the TVR dataset; it is not guaranteed to reproduce the original authors' results.