# Clip Alignment With Language
|
This folder contains the CAL model described in the paper:
|
```
@article{Escorcia2019TemporalLO,
  title={Temporal Localization of Moments in Video Collections with Natural Language},
  author={Victor Escorcia and Mattia Soldan and Josef Sivic and Bernard Ghanem and Bryan Russell},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.12763}
}
```
|
|
|
It also resembles the MCN model described in:
|
```
@article{Hendricks2017LocalizingMI,
  title={Localizing Moments in Video with Natural Language},
  author={Lisa Anne Hendricks and Oliver Wang and Eli Shechtman and Josef Sivic and Trevor Darrell and Bryan C. Russell},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={5804-5813}
}
```
|
|
|
Disclaimer: This code was implemented by [Jie Lei](http://www.cs.unc.edu/~jielei/) for the TVR dataset; it is not guaranteed to reproduce the original authors' results.