Homie0609 committed on
Commit
10ffea5
1 Parent(s): 3186d13

Update README.md

Files changed (1)
  1. README.md +137 -3
README.md CHANGED
@@ -1,3 +1,137 @@
1
- ---
2
- license: cc-by-sa-4.0
3
- ---
1
+ ---
2
+ license: cc-by-sa-4.0
3
+ datasets:
4
+ - Homie0609/MatchTime
5
+ language:
6
+ - en
7
+ tags:
8
+ - sports
9
+ - soccer
10
+ ---
11
+
12
+ ## Requirements
13
+ - Python >= 3.8 (we recommend using [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
14
+ - [PyTorch >= 2.0.0](https://pytorch.org/) (if you use an A100 GPU)
15
+ - transformers >= 4.42.3
16
+ - pycocoevalcap >= 1.2
17
+
18
+ A suitable [conda](https://conda.io/) environment named `matchtime` can be created and activated with:
19
+ ```
20
+ cd MatchTime
21
+ conda env create -f environment.yaml
22
+ conda activate matchtime
23
+ ```
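Once the environment is active, a quick look at the installed versions can catch mismatches early. The sketch below is not part of the repository: the helper name `installed_versions` is hypothetical, and it only reports what `importlib.metadata` finds, without enforcing the minimum versions listed above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names taken from the Requirements list above.
    report = installed_versions(["torch", "transformers", "pycocoevalcap"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Compare the printed versions against the minimums in the Requirements list by hand.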
24
+
25
+ ## Training
26
+ Before training, make sure you have prepared the [features](https://pypi.org/project/SoccerNet/) and caption [data](https://drive.google.com/drive/folders/14tb6lV2nlTxn3VygwAPdmtKm7v0Ss8wG), and put them into the corresponding folders. After collation, the directory structure should look like this:
27
+ ``````
28
+ └─ MatchTime
29
+ ├─ dataset
30
+ │ ├─ MatchTime
31
+ │ │ ├─ valid
32
+ │ │ └─ train
33
+ │ │ ├─ england_epl_2014-2015
34
+ │ │ ... ├─ 2015-02-21 - 18-00 Chelsea 1 - 1 Burnley
35
+ │ │ ... └─ Labels-caption.json
36
+ │ │
37
+ │ ├─ SN-Caption
38
+ │ └─ SN-Caption-test-align
39
+ │ ├─ england_epl_2015-2016
40
+ │ ... ├─ 2015-08-16 - 18-00 Manchester City 3 - 0 Chelsea
41
+ │ ... └─ Labels-caption_with_gt.json
42
+
43
+ ├─ features
44
+ │ ├─ baidu_soccer_embeddings
45
+ │ │ ├─ england_epl_2014-2015
46
+ ... │ ... ├─ 2015-02-21 - 18-00 Chelsea 1 - 1 Burnley
47
+ │ ... ├─ 1_baidu_soccer_embeddings.npy
48
+ │ └─ 2_baidu_soccer_embeddings.npy
49
+ ├─ C3D_PCA512
50
+ ...
51
+ ``````
52
+ The feature format can be adjusted with:
53
+ ```
54
+ python ./features/preprocess.py directory_path_of_feature
55
+ ```
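Before starting a long training run, it may help to confirm that every match folder contains both per-half feature files. The following is a minimal sketch, assuming the `<feature_root>/<league_season>/<match>/` layout suggested by the tree above; the function name `matches_missing_features` is hypothetical and not part of the repository.

```python
from pathlib import Path


def matches_missing_features(feature_root, feature_name="baidu_soccer_embeddings"):
    """Return (match_dir, missing_files) pairs for incomplete match folders."""
    expected = [f"{half}_{feature_name}.npy" for half in (1, 2)]
    missing = []
    # Assumed layout: <feature_root>/<league_season>/<match>/<half>_<feature_name>.npy
    for match_dir in sorted(Path(feature_root).glob("*/*")):
        if not match_dir.is_dir():
            continue
        absent = [name for name in expected if not (match_dir / name).is_file()]
        if absent:
            missing.append((match_dir, absent))
    return missing


if __name__ == "__main__":
    root = Path("./features/baidu_soccer_embeddings")
    for match_dir, absent in matches_missing_features(root):
        print(f"{match_dir}: missing {', '.join(absent)}")
```

The check only looks at file names; it does not open the arrays or validate their shapes.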
56
+ After preparing the data and features, you can pre-train (or fine-tune) with the following terminal command (check the hyper-parameters at the bottom of *train.py*):
57
+ ```
58
+ python train.py
59
+ ```
60
+ ## Inference
61
+
62
+ We provide two types of inference:
63
+
64
+ #### For the Whole Test Set
65
+
66
+ You can generate a *.csv* file to test the ***MatchVoice*** model with the following command (check the hyper-parameters at the bottom of *inference.py*):
67
+
68
+ ```
69
+ python inference.py
70
+ ```
71
+
72
+ There is a sample of this type of inference in *./inference_result/sample.csv*.
73
+
74
+ #### For Single Video
75
+
76
+ We also provide a script that predicts the commentary for a single video (for our checkpoints, use a 30-second video):
77
+ ```
78
+ python inference_single_video_CLIP.py single_video_path
79
+ ```
80
+ Here we only provide the CLIP-feature version (using ViT-B/32); to extract CLIP features, please check [here](https://github.com/openai/CLIP). CLIP features do not give the best performance, but they are the easiest to obtain for new videos.
81
+
82
+ ## Alignment
83
+
84
+ Before alignment, download the videos from [here](https://www.soccer-net.org/data) (224p is enough) and arrange them in the following layout:
85
+
86
+ ``````
87
+ └─ MatchTime
88
+ ├─ videos_224p
89
+ ... ├─ england_epl_2014-2015
90
+ ... ├─ 2015-02-21 - 18-00 Chelsea 1 - 1 Burnley
91
+ ... ├─ 1_224p.mkv
92
+ └─ 2_224p.mkv
93
+ ``````
94
+
95
+ ### Pre-process (Coarse Align)
96
+
97
+ Coarse alignment uses [WhisperX](https://github.com/m-bain/whisperX) and [LLaMA3](https://huggingface.co/docs/transformers/model_doc/llama3) (as an agent) and proceeds in the following steps:
98
+
99
+ *WhisperX ASR:*
100
+ ```
101
+ python ./alignment/soccer_whisperx.py --process_directory ./videos_224p/england_epl_2014-2015 --output_directory ./ASR_results/england_epl_2014-2015  # example paths; substitute your own video and output folders
102
+ ```
103
+ *Transform to Events:*
104
+ ```
105
+ python ./alignment/soccer_asr2events.py --base_path ./ASR_results/england_epl_2014-2015 --output_dir ./event_results/england_epl_2014-2015  # example paths; substitute your own ASR and event folders
106
+ ```
107
+
108
+ *Align from Events:*
109
+ ```
110
+ python ./alignment/soccer_align_from_event.py --event_path ./event_results/england_epl_2014-2015 --output_dir ./pre-processed/england_epl_2014-2015  # example paths; substitute your own event and output folders
111
+ ```
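The three steps above form a chain: each script reads the folder the previous one wrote. A minimal sketch of running them in order for one league folder follows. The script paths and flags are the ones shown above; the helper name `coarse_alignment_commands` and the folder naming scheme are assumptions based on the example paths.

```python
import subprocess
import sys


def coarse_alignment_commands(league):
    """Build the three coarse-alignment commands for one league/season folder."""
    # Folder names follow the example paths in this README (an assumption).
    videos = f"./videos_224p/{league}"
    asr = f"./ASR_results/{league}"
    events = f"./event_results/{league}"
    aligned = f"./pre-processed/{league}"
    return [
        [sys.executable, "./alignment/soccer_whisperx.py",
         "--process_directory", videos, "--output_directory", asr],
        [sys.executable, "./alignment/soccer_asr2events.py",
         "--base_path", asr, "--output_dir", events],
        [sys.executable, "./alignment/soccer_align_from_event.py",
         "--event_path", events, "--output_dir", aligned],
    ]


if __name__ == "__main__":
    for command in coarse_alignment_commands("england_epl_2014-2015"):
        # check=True stops the chain as soon as one step fails.
        subprocess.run(command, check=True)
```

Run it from the repository root so that the relative script paths resolve.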
112
+
113
+ More details can be found in the paper.
114
+
115
+ ### Contrastive Learning (Fine-grained Align)
116
+
117
+ After downloading the checkpoints from [here](https://huggingface.co/Homie0609/MatchTime/tree/main), use the following command to perform alignment with contrastive learning:
118
+ ```
119
+ python ./alignment/do_alignment.py
120
+ ```
121
+ By changing the hyper-parameter ***finding_words***, you can align from ASR, event, or original SN-Caption text.
122
+
123
+ You can also use the alignment model directly by importing it:
124
+ ```
125
+ from alignment.matchtime_model import ContrastiveLearningModel
126
+ ```
127
+
128
+ ## Evaluation
129
+ We provide code for evaluating the prediction results:
130
+ ```
131
+ # for single csv file
132
+ python ./evaluation/scoer_single.py --csv_path ./inference_result/sample.csv
133
+ # for many csv files to record scores in a new csv file
134
+ python ./evaluation/scoer_group.py
135
+ # for gpt score (need OpenAI API Key)
136
+ python ./evaluation/scoer_gpt.py ./inference_result/sample.csv
137
+ ```