Clip Alignment With Language
This folder contains the CAL (Clip Alignment with Language) model described in the paper:
@article{Escorcia2019TemporalLO,
title={Temporal Localization of Moments in Video Collections with Natural Language},
author={Victor Escorcia and Mattia Soldan and Josef Sivic and Bernard Ghanem and Bryan Russell},
journal={ArXiv},
year={2019},
volume={abs/1907.12763}
}
It also resembles the MCN (Moment Context Network) model described in:
@article{Hendricks2017LocalizingMI,
title={Localizing Moments in Video with Natural Language},
author={Lisa Anne Hendricks and Oliver Wang and Eli Shechtman and Josef Sivic and Trevor Darrell and Bryan C. Russell},
journal={2017 IEEE International Conference on Computer Vision (ICCV)},
year={2017},
pages={5804-5813}
}
Disclaimer: This code was implemented by Jie Lei for the TVR dataset; it is not guaranteed to reproduce the original authors' results.