Liangrj5 committed on
Commit 6dd9459 · 1 Parent(s): 23c6c0b
Files changed (6)
  1. LICENSE +21 -0
  2. README.md +38 -3
  3. __init__.py +0 -0
  4. run_top01.sh +25 -0
  5. run_top20.sh +24 -0
  6. run_top40.sh +24 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2020 Jie Lei
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,3 +1,38 @@
- ---
- license: mit
- ---
+ # XML_RVMR
+
+ This repository contains the XML baseline model for the Ranked Video Moment Retrieval (RVMR) task. The associated paper is titled "Video Moment Retrieval in Practical Setting: A Dataset of Ranked Moments for Imprecise Queries."
+
+ The main repository of the paper is [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking), and this model is adapted from [TVRetrieval](https://github.com/jayleicn/TVRetrieval.git).
+
+ Annotations and features can be downloaded from [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking). The environment setup is the same as for RelocNet_RVMR, as detailed in the [TVR-Ranking](https://huggingface.co/axgroup/TVR-Ranking) repository.
+
+
+ ## Performance
+
+ | **Model** | **Train Set Top N** | **IoU=0.3 (Val)** | **IoU=0.3 (Test)** | **IoU=0.5 (Val)** | **IoU=0.5 (Test)** | **IoU=0.7 (Val)** | **IoU=0.7 (Test)** |
+ |-----------|---------------------|--------|--------|--------|--------|--------|--------|
+ | **NDCG@10** | | | | | | | |
+ | XML | 1 | 0.1016 | 0.0917 | 0.0747 | 0.0660 | 0.0244 | 0.0268 |
+ | XML | 20 | 0.2226 | 0.2135 | 0.1623 | 0.1567 | 0.0580 | 0.0627 |
+ | XML | 40 | 0.2002 | 0.2044 | 0.1461 | 0.1502 | 0.0541 | 0.0589 |
+ | **NDCG@20** | | | | | | | |
+ | XML | 1 | 0.1010 | 0.0923 | 0.0737 | 0.0662 | 0.0258 | 0.0269 |
+ | XML | 20 | 0.2331 | 0.2243 | 0.1700 | 0.1650 | 0.0627 | 0.0664 |
+ | XML | 40 | 0.2114 | 0.2167 | 0.1530 | 0.1590 | 0.0583 | 0.0635 |
+ | **NDCG@40** | | | | | | | |
+ | XML | 1 | 0.1077 | 0.1016 | 0.0775 | 0.0727 | 0.0273 | 0.0294 |
+ | XML | 20 | 0.2580 | 0.2512 | 0.1874 | 0.1853 | 0.0705 | 0.0753 |
+ | XML | 40 | 0.2408 | 0.2432 | 0.1740 | 0.1791 | 0.0666 | 0.0720 |
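Reviewer note: the table reports NDCG@K, so a generic sketch of that metric may help readers interpret the numbers. It assumes the textbook formulation (linear gain, log2 rank discount); the function names and toy relevance scores are hypothetical, and how the TVR-Ranking evaluation code applies the IoU threshold to decide a moment's gain is not reproduced here.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items (log2 rank discount)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(predicted_gains: Sequence[float],
              all_gains: Sequence[float],
              k: int) -> float:
    """NDCG@K: DCG of the predicted ranking divided by the DCG of the ideal one.

    predicted_gains: relevance of each returned moment, in ranked order.
    all_gains: relevance of every annotated moment for the query (any order).
    """
    ideal = dcg_at_k(sorted(all_gains, reverse=True), k)
    return dcg_at_k(predicted_gains, k) / ideal if ideal > 0 else 0.0


# Toy example with made-up relevance scores (not taken from the dataset).
print(ndcg_at_k([3, 0, 2], [3, 2, 1], k=3))
```

A perfect ranking scores 1.0, and a query with no relevant moments scores 0.0 under this sketch.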
+
+
+
+ ## Quick Start
+
+ Modify the data paths in `run_top20.sh` to match your setup, then run the script:
+
+ ```sh
+ sh run_top20.sh
+ ```
+
+ Feel free to contribute or raise issues for any problems encountered.
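Reviewer note: since the Quick Start depends on the data paths being correct, a small pre-flight check could catch a wrong path before training starts. The sketch below is hypothetical and not part of this commit; the path list is copied from the `--*_path` flags in `run_top20.sh`, and `missing_paths` is an illustrative helper name.

```python
from pathlib import Path
from typing import Iterable, List

# Paths copied from the --*_path flags of run_top20.sh; edit to match your setup.
REQUIRED_PATHS = [
    "data/TVR_Ranking/train_top20.json",
    "data/TVR_Ranking/val.json",
    "data/TVR_Ranking/test.json",
    "data/TVR_Ranking/video_corpus.json",
    "data/features/query_bert.h5",
    "data/features/tvr_i3d_rgb600_avg_cl-1.5.h5",
    "data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5",
]


def missing_paths(paths: Iterable[str], root: str = ".") -> List[str]:
    """Return the entries of `paths` that do not exist under `root`."""
    return [p for p in paths if not (Path(root) / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(REQUIRED_PATHS)
    for p in missing:
        print(f"missing: {p}")
    print("all inputs found" if not missing else f"{len(missing)} input(s) missing")
```

Running it from the repository root before `sh run_top20.sh` lists any annotation or feature file that still needs to be downloaded.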
__init__.py ADDED
File without changes
run_top01.sh ADDED
@@ -0,0 +1,25 @@
+ python baselines/crossmodal_moment_localization/train.py \
+ --train_path data/TVR_Ranking/train_top01.json \
+ --val_path data/TVR_Ranking/val.json \
+ --test_path data/TVR_Ranking/test.json \
+ --corpus_path data/TVR_Ranking/video_corpus.json \
+ --desc_bert_path data/features/query_bert.h5 \
+ --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
+ --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
+ --dset_name=tvr \
+ --eval_split_name=val \
+ --nms_thd=-1 \
+ --results_root=results \
+ --clip_length=1.5 \
+ --vid_feat_size=1024 \
+ --ctx_mode=video_sub_tef \
+ --max_ctx_l=128 \
+ --max_pred_l=16 \
+ --eval_num_per_epoch=0.05 \
+ --n_epoch=4000 \
+ --exp_id=top01 \
+ --model_name=XML \
+ --lr=0.001
+
+ # qsub -I -l select=1:ngpus=1 -P gs_slab -q gpu8
+ # cd 11_TVR-Ranking/TVRetrieval/; conda activate py11; sh run_top01.sh
run_top20.sh ADDED
@@ -0,0 +1,24 @@
+ python baselines/crossmodal_moment_localization/train.py \
+ --train_path data/TVR_Ranking/train_top20.json \
+ --val_path data/TVR_Ranking/val.json \
+ --test_path data/TVR_Ranking/test.json \
+ --corpus_path data/TVR_Ranking/video_corpus.json \
+ --desc_bert_path data/features/query_bert.h5 \
+ --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
+ --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
+ --dset_name=tvr \
+ --eval_split_name=val \
+ --nms_thd=-1 \
+ --results_root=results \
+ --clip_length=1.5 \
+ --vid_feat_size=1024 \
+ --ctx_mode=video_sub_tef \
+ --max_ctx_l=128 \
+ --max_pred_l=16 \
+ --eval_num_per_epoch=1 \
+ --n_epoch=200 \
+ --exp_id=top20 \
+ --model_name=XML
+
+ # qsub -I -l select=1:ngpus=1 -P gs_slab -q gpu8
+ # cd 11_TVR-Ranking/TVRetrieval/; conda activate py11; sh run_top20.sh
run_top40.sh ADDED
@@ -0,0 +1,24 @@
+ python baselines/crossmodal_moment_localization/train.py \
+ --train_path data/TVR_Ranking/train_top40.json \
+ --val_path data/TVR_Ranking/val.json \
+ --test_path data/TVR_Ranking/test.json \
+ --corpus_path data/TVR_Ranking/video_corpus.json \
+ --desc_bert_path data/features/query_bert.h5 \
+ --vid_feat_path data/features/tvr_i3d_rgb600_avg_cl-1.5.h5 \
+ --sub_bert_path data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5 \
+ --dset_name=tvr \
+ --eval_split_name=val \
+ --nms_thd=-1 \
+ --results_root=results \
+ --clip_length=1.5 \
+ --vid_feat_size=1024 \
+ --ctx_mode=video_sub_tef \
+ --max_ctx_l=128 \
+ --max_pred_l=16 \
+ --eval_num_per_epoch=2 \
+ --n_epoch=100 \
+ --exp_id=top40 \
+ --model_name=XML
+
+ # qsub -I -l select=1:ngpus=1 -P gs_slab -q gpu8
+ # cd 11_TVR-Ranking/TVRetrieval/; conda activate py11; sh run_top40.sh
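Reviewer note: the three launch scripts share every flag except the training file, `--eval_num_per_epoch`, `--n_epoch`, `--exp_id`, and (for top01 only) `--lr`. A possible way to keep them in sync is to generate the command from one table of per-run settings. The sketch below is hypothetical and not part of this commit; `build_command`, `COMMON_ARGS`, and `RUNS` are illustrative names, while the flag values are copied from the scripts above.

```python
from typing import Dict, List

# Flags shared verbatim by run_top01.sh, run_top20.sh and run_top40.sh.
COMMON_ARGS: List[str] = [
    "--val_path", "data/TVR_Ranking/val.json",
    "--test_path", "data/TVR_Ranking/test.json",
    "--corpus_path", "data/TVR_Ranking/video_corpus.json",
    "--desc_bert_path", "data/features/query_bert.h5",
    "--vid_feat_path", "data/features/tvr_i3d_rgb600_avg_cl-1.5.h5",
    "--sub_bert_path", "data/features/tvr_sub_pretrained_w_sub_query_max_cl-1.5.h5",
    "--dset_name=tvr",
    "--eval_split_name=val",
    "--nms_thd=-1",
    "--results_root=results",
    "--clip_length=1.5",
    "--vid_feat_size=1024",
    "--ctx_mode=video_sub_tef",
    "--max_ctx_l=128",
    "--max_pred_l=16",
]

# Per-run settings read off the three scripts; only top01 sets --lr explicitly.
RUNS: Dict[int, Dict[str, str]] = {
    1: {"eval_num_per_epoch": "0.05", "n_epoch": "4000", "lr": "0.001"},
    20: {"eval_num_per_epoch": "1", "n_epoch": "200"},
    40: {"eval_num_per_epoch": "2", "n_epoch": "100"},
}


def build_command(top_n: int) -> List[str]:
    """Assemble the train.py argument list for a given train-set top-N."""
    tag = f"top{top_n:02d}"  # 1 -> "top01", 20 -> "top20", 40 -> "top40"
    cmd = [
        "python", "baselines/crossmodal_moment_localization/train.py",
        "--train_path", f"data/TVR_Ranking/train_{tag}.json",
    ]
    cmd += COMMON_ARGS
    cmd += [f"--{key}={value}" for key, value in RUNS[top_n].items()]
    cmd += [f"--exp_id={tag}", "--model_name=XML"]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command(20)))
```

The returned list can be passed to `subprocess.run(build_command(20), check=True)` from the repository root; passing arguments as a list also sidesteps shell line-continuation mistakes.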