Liangrj5 committed on
Commit · 6dd9459
1 Parent(s): 23c6c0b
init
- LICENSE +21 -0
- README.md +38 -3
- __init__.py +0 -0
- run_top01.sh +25 -0
- run_top20.sh +24 -0
- run_top40.sh +24 -0
LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2020 Jie Lei
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
README.md
CHANGED
@@ -1,3 +1,38 @@
-
-
-
+# XML_RVMR
+
+This repository contains the XML baseline model for the Ranked Video Moment Retrieval (RVMR) task. The associated paper is titled "Video Moment Retrieval in Practical Setting: A Dataset of Ranked Moments for Imprecise Queries."
+
+The main repository of the paper is [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking), and this model is adapted from [TVRetrieval](https://github.com/jayleicn/TVRetrieval.git).
+
+Annotations and features can be downloaded from [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking). The environment setup is the same as for RelocNet_RVMR, as detailed in the [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking) repository.
+
+
+## Performance
+
+| **Model** | **Train Set Top N** | **IoU=0.3** | | **IoU=0.5** | | **IoU=0.7** | |
+|-----------|---------------------|-------------|----------|-------------|----------|-------------|----------|
+| | | **Val** | **Test** | **Val** | **Test** | **Val** | **Test** |
+| **NDCG@10** | | | | | | | |
+| XML | 1 | 0.1016 | 0.0917 | 0.0747 | 0.0660 | 0.0244 | 0.0268 |
+| XML | 20 | 0.2226 | 0.2135 | 0.1623 | 0.1567 | 0.0580 | 0.0627 |
+| XML | 40 | 0.2002 | 0.2044 | 0.1461 | 0.1502 | 0.0541 | 0.0589 |
+| **NDCG@20** | | | | | | | |
+| XML | 1 | 0.1010 | 0.0923 | 0.0737 | 0.0662 | 0.0258 | 0.0269 |
+| XML | 20 | 0.2331 | 0.2243 | 0.1700 | 0.1650 | 0.0627 | 0.0664 |
+| XML | 40 | 0.2114 | 0.2167 | 0.1530 | 0.1590 | 0.0583 | 0.0635 |
+| **NDCG@40** | | | | | | | |
+| XML | 1 | 0.1077 | 0.1016 | 0.0775 | 0.0727 | 0.0273 | 0.0294 |
+| XML | 20 | 0.2580 | 0.2512 | 0.1874 | 0.1853 | 0.0705 | 0.0753 |
+| XML | 40 | 0.2408 | 0.2432 | 0.1740 | 0.1791 | 0.0666 | 0.0720 |
+
+
+
+## Quick Start
+
+Update the data and feature paths in `run_top20.sh`, then execute the script:
+
+```sh
+sh run_top20.sh
+```
+
+Feel free to contribute or raise issues for any problems encountered.
__init__.py
ADDED
File without changes
run_top01.sh
ADDED
@@ -0,0 +1,25 @@
+python baselines/crossmodal_moment_localization/train.py \
+    --train_path data/TVR_Ranking/train_top01.json \
+    --val_path data/TVR_Ranking/val.json \
+    --test_path data/TVR_Ranking/test.json \
+    --corpus_path data/TVR_Ranking/video_corpus.json \
+    --desc_bert_path data/features/query_bert.h5 \
+    --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
+    --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
+    --dset_name=tvr \
+    --eval_split_name=val \
+    --nms_thd=-1 \
+    --results_root=results \
+    --clip_length=1.5 \
+    --vid_feat_size=1024 \
+    --ctx_mode=video_sub_tef \
+    --max_ctx_l=128 \
+    --max_pred_l=16 \
+    --eval_num_per_epoch=0.05 \
+    --n_epoch=4000 \
+    --exp_id=top01 \
+    --model_name=XML \
+    --lr=0.001
+
+# qsub -I -l select=1:ngpus=1 -P gs_slab -q gpu8
+# cd 11_TVR-Ranking/TVRetrieval/; conda activate py11; sh run_top01.sh
run_top20.sh
ADDED
@@ -0,0 +1,24 @@
+python baselines/crossmodal_moment_localization/train.py \
+    --train_path data/TVR_Ranking/train_top20.json \
+    --val_path data/TVR_Ranking/val.json \
+    --test_path data/TVR_Ranking/test.json \
+    --corpus_path data/TVR_Ranking/video_corpus.json \
+    --desc_bert_path data/features/query_bert.h5 \
+    --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
+    --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
+    --dset_name=tvr \
+    --eval_split_name=val \
+    --nms_thd=-1 \
+    --results_root=results \
+    --clip_length=1.5 \
+    --vid_feat_size=1024 \
+    --ctx_mode=video_sub_tef \
+    --max_ctx_l=128 \
+    --max_pred_l=16 \
+    --eval_num_per_epoch=1 \
+    --n_epoch=200 \
+    --exp_id=top20 \
+    --model_name=XML
+
+# qsub -I -l select=1:ngpus=1 -P gs_slab -q gpu8
+# cd 11_TVR-Ranking/TVRetrieval/; conda activate py11; sh run_top20.sh
run_top40.sh
ADDED
@@ -0,0 +1,24 @@
+python baselines/crossmodal_moment_localization/train.py \
+    --train_path data/TVR_Ranking/train_top40.json \
+    --val_path data/TVR_Ranking/val.json \
+    --test_path data/TVR_Ranking/test.json \
+    --corpus_path data/TVR_Ranking/video_corpus.json \
+    --desc_bert_path data/features/query_bert.h5 \
+    --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
+    --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
+    --dset_name=tvr \
+    --eval_split_name=val \
+    --nms_thd=-1 \
+    --results_root=results \
+    --clip_length=1.5 \
+    --vid_feat_size=1024 \
+    --ctx_mode=video_sub_tef \
+    --max_ctx_l=128 \
+    --max_pred_l=16 \
+    --eval_num_per_epoch=2 \
+    --n_epoch=100 \
+    --exp_id=top40 \
+    --model_name=XML
+
+# qsub -I -l select=1:ngpus=1 -P gs_slab -q gpu8
+# cd 11_TVR-Ranking/TVRetrieval/; conda activate py11; sh run_top40.sh
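The three launch scripts in this commit share every argument except the training file, `--eval_num_per_epoch`, `--n_epoch`, `--exp_id`, and the extra `--lr=0.001` used for the top-1 split. The sketch below shows one way those differences could be collected in a single helper. It is not part of the commit: the function names `split_settings` and `build_train_cmd` are hypothetical, and the helper only prints the command (a dry run) rather than starting training. All paths and flag values are copied from the scripts above.

```sh
#!/bin/sh
# Hypothetical helper, not part of this commit.

# Prints "<eval_num_per_epoch> <n_epoch> [extra flags]" for a split,
# using the values from run_top01.sh, run_top20.sh, and run_top40.sh.
split_settings() {
    case "$1" in
        01) echo "0.05 4000 --lr=0.001" ;;
        20) echo "1 200" ;;
        40) echo "2 100" ;;
        *)  echo "unknown split: $1" >&2; return 1 ;;
    esac
}

# Prints the full training command for a split on one line (dry run only).
build_train_cmd() {
    top=$1
    settings=$(split_settings "$top") || return 1
    # Intentional word splitting: load the settings into the positional parameters.
    set -- $settings
    eval_num=$1
    n_epoch=$2
    shift 2   # whatever remains in "$@" is the split-specific extra flags
    echo python baselines/crossmodal_moment_localization/train.py \
        --train_path "data/TVR_Ranking/train_top${top}.json" \
        --val_path data/TVR_Ranking/val.json \
        --test_path data/TVR_Ranking/test.json \
        --corpus_path data/TVR_Ranking/video_corpus.json \
        --desc_bert_path data/features/query_bert.h5 \
        --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
        --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
        --dset_name=tvr \
        --eval_split_name=val \
        --nms_thd=-1 \
        --results_root=results \
        --clip_length=1.5 \
        --vid_feat_size=1024 \
        --ctx_mode=video_sub_tef \
        --max_ctx_l=128 \
        --max_pred_l=16 \
        --eval_num_per_epoch="$eval_num" \
        --n_epoch="$n_epoch" \
        --exp_id="top${top}" \
        --model_name=XML \
        "$@"
}
```

Calling `build_train_cmd 20` prints the command for inspection. Under the assumption that the paths contain no spaces or shell metacharacters, `eval "$(build_train_cmd 20)"` would run it from the repository root.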