jbilcke-hf HF Staff committed on
Commit
873d3e4
·
1 Parent(s): 5719070
Files changed (1)
  1. docs/torch/README_for_torchcodec.md +226 -0
docs/torch/README_for_torchcodec.md ADDED
@@ -0,0 +1,226 @@
1
+ [**Installation**](#installing-torchcodec) | [**Simple Example**](#using-torchcodec) | [**Detailed Example**](https://pytorch.org/torchcodec/stable/generated_examples/) | [**Documentation**](https://pytorch.org/torchcodec) | [**Contributing**](CONTRIBUTING.md) | [**License**](#license)
2
+
3
+ # TorchCodec
4
+
5
+ TorchCodec is a Python library for decoding video and audio data into PyTorch
6
+ tensors, on CPU and CUDA GPU. It also supports audio encoding, and video
7
+ encoding will come soon! It aims to be fast, easy to use, and well integrated
8
+ into the PyTorch ecosystem. If you want to use PyTorch to train ML models on
9
+ videos and audio, TorchCodec is how you turn these into data.
10
+
11
+ We achieve these capabilities through:
12
+
13
+ * Pythonic APIs that mirror Python and PyTorch conventions.
14
+ * Relying on [FFmpeg](https://www.ffmpeg.org/) to do the decoding and encoding.
15
+ TorchCodec uses the version of FFmpeg you already have installed. FFmpeg is a
16
+ mature library with broad coverage available on most systems. It is, however,
17
+ not easy to use. TorchCodec abstracts FFmpeg's complexity to ensure it is used
18
+ correctly and efficiently.
19
+ * Returning data as PyTorch tensors, ready to be fed into PyTorch transforms
20
+ or used directly to train models.
21
+
22
+ ## Using TorchCodec
23
+
24
+ Here's a condensed summary of what you can do with TorchCodec. For more detailed
25
+ examples, [check out our
26
+ documentation](https://pytorch.org/torchcodec/stable/generated_examples/)!
27
+
28
+ #### Decoding
29
+
30
+ ```python
31
+ from torchcodec.decoders import VideoDecoder
32
+
33
+ device = "cpu" # or e.g. "cuda" !
34
+ decoder = VideoDecoder("path/to/video.mp4", device=device)
35
+
36
+ decoder.metadata
37
+ # VideoStreamMetadata:
38
+ # num_frames: 250
39
+ # duration_seconds: 10.0
40
+ # bit_rate: 31315.0
41
+ # codec: h264
42
+ # average_fps: 25.0
43
+ # ... (truncated output)
44
+
45
+ # Simple Indexing API
46
+ decoder[0] # uint8 tensor of shape [C, H, W]
47
+ decoder[0 : -1 : 20] # uint8 stacked tensor of shape [N, C, H, W]
48
+
49
+ # Indexing, with PTS and duration info:
50
+ decoder.get_frames_at(indices=[2, 100])
51
+ # FrameBatch:
52
+ # data (shape): torch.Size([2, 3, 270, 480])
53
+ # pts_seconds: tensor([0.0667, 3.3367], dtype=torch.float64)
54
+ # duration_seconds: tensor([0.0334, 0.0334], dtype=torch.float64)
55
+
56
+ # Time-based indexing with PTS and duration info
57
+ decoder.get_frames_played_at(seconds=[0.5, 10.4])
58
+ # FrameBatch:
59
+ # data (shape): torch.Size([2, 3, 270, 480])
60
+ # pts_seconds: tensor([ 0.4671, 10.3770], dtype=torch.float64)
61
+ # duration_seconds: tensor([0.0334, 0.0334], dtype=torch.float64)
62
+ ```
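The time-based lookup above can be illustrated without decoding anything. The following is a minimal sketch, not TorchCodec's implementation: it assumes a constant frame duration of 1001/30000 s (an assumption, chosen because it is consistent with the rounded `pts_seconds` and `duration_seconds` values printed above), and `frame_played_at` is a hypothetical helper name. It returns the index of the frame whose display interval contains a given timestamp.

```python
from bisect import bisect_right

# Assumption: constant frame duration of 1001/30000 s (about 29.97 fps).
FRAME_DURATION = 1001 / 30000

# Hypothetical presentation timestamps (PTS), in seconds, for 400 frames.
pts_seconds = [i * FRAME_DURATION for i in range(400)]


def frame_played_at(pts, seconds):
    """Return the index of the frame on screen at `seconds`.

    A frame is "played" at time t when pts[i] <= t < pts[i + 1].
    `pts` must be sorted in increasing order. This sketch does not
    check whether `seconds` is past the end of the last frame.
    """
    index = bisect_right(pts, seconds) - 1
    if index < 0:
        raise ValueError(f"{seconds} s is before the first frame")
    return index


for t in (0.5, 10.4):
    i = frame_played_at(pts_seconds, t)
    print(f"t={t} s -> frame {i}, pts={pts_seconds[i]:.4f}")
```

Under these assumptions, 0.5 s and 10.4 s map to frames 14 and 311, whose timestamps round to 0.4671 and 10.3770, in line with the `pts_seconds` shown for `get_frames_played_at` above.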
63
+
64
+ #### Clip sampling
65
+
66
+ ```python
67
+
68
+ from torchcodec.samplers import clips_at_regular_timestamps
69
+
70
+ clips_at_regular_timestamps(
71
+ decoder,
72
+ seconds_between_clip_starts=1.5,
73
+ num_frames_per_clip=4,
74
+ seconds_between_frames=0.1
75
+ )
76
+ # FrameBatch:
77
+ # data (shape): torch.Size([9, 4, 3, 270, 480])
78
+ # pts_seconds: tensor([[ 0.0000, 0.0667, 0.1668, 0.2669],
79
+ # [ 1.4681, 1.5682, 1.6683, 1.7684],
80
+ # [ 2.9696, 3.0697, 3.1698, 3.2699],
81
+ # ... (truncated), dtype=torch.float64)
82
+ # duration_seconds: tensor([[0.0334, 0.0334, 0.0334, 0.0334],
83
+ # [0.0334, 0.0334, 0.0334, 0.0334],
84
+ # [0.0334, 0.0334, 0.0334, 0.0334],
85
+ # ... (truncated), dtype=torch.float64)
86
+ ```
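To make the three sampling parameters above concrete, here is a minimal pure-Python sketch of the timestamps such a sampler would request. It is an illustration under stated assumptions, not TorchCodec's implementation: `regular_clip_timestamps` is a hypothetical helper, clips are assumed to start at 0 s, and a clip is kept only if all of its frames fall before the end of the video. A sampler that returns the frame actually played at each requested time would explain why the `pts_seconds` above (for example 1.4681) are close to, but not exactly, the requested values (1.5).

```python
def regular_clip_timestamps(
    duration_seconds,
    seconds_between_clip_starts,
    num_frames_per_clip,
    seconds_between_frames,
):
    """Return one list of requested timestamps (in seconds) per clip."""
    if seconds_between_clip_starts <= 0:
        raise ValueError("seconds_between_clip_starts must be positive")
    # Time between the first and last frame of a clip.
    clip_span = (num_frames_per_clip - 1) * seconds_between_frames
    clips = []
    clip_index = 0
    while True:
        start = clip_index * seconds_between_clip_starts
        # Assumption: drop clips whose last frame would fall past the end.
        if start + clip_span >= duration_seconds:
            break
        clips.append(
            [start + i * seconds_between_frames for i in range(num_frames_per_clip)]
        )
        clip_index += 1
    return clips


# Hypothetical 10-second video, with the same sampling parameters as above.
clips = regular_clip_timestamps(
    duration_seconds=10.0,
    seconds_between_clip_starts=1.5,
    num_frames_per_clip=4,
    seconds_between_frames=0.1,
)
for clip in clips[:3]:
    print([round(t, 2) for t in clip])
```

The number of clips depends on the video duration and on how clips that run past the end are handled, so it will not necessarily match the shape shown above.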
87
+
88
+ You can use the following snippet to generate a video with FFmpeg and try out
89
+ TorchCodec:
90
+
91
+ ```bash
92
+ fontfile=/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf
93
+ output_video_file=/tmp/output_video.mp4
94
+
95
+ ffmpeg -f lavfi -i \
96
+ color=size=640x400:duration=10:rate=25:color=blue \
97
+ -vf "drawtext=fontfile=${fontfile}:fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'" \
98
+ ${output_video_file}
99
+ ```
100
+
101
+ ## Installing TorchCodec
102
+ ### Installing CPU-only TorchCodec
103
+
104
+ 1. Install the latest stable version of PyTorch following the
105
+ [official instructions](https://pytorch.org/get-started/locally/). For other
106
+ versions, refer to the table below for compatibility between versions of
107
+ `torch` and `torchcodec`.
108
+
109
+ 2. Install FFmpeg, if it's not already installed. Linux distributions usually
110
+ come with FFmpeg pre-installed. TorchCodec supports all major FFmpeg versions
111
+ in [4, 7].
112
+
113
+ If FFmpeg is not already installed, or you need a more recent version, an
114
+ easy way to install it is to use `conda`:
115
+
116
+ ```bash
117
+ conda install "ffmpeg<8"
118
+ # or
119
+ conda install "ffmpeg<8" -c conda-forge
120
+ ```
121
+
122
+ 3. Install TorchCodec:
123
+
124
+ ```bash
125
+ pip install torchcodec
126
+ ```
127
+
128
+ The following table indicates the compatibility between versions of
129
+ `torchcodec`, `torch` and Python.
130
+
131
+ | `torchcodec` | `torch` | Python |
132
+ | ------------------ | ------------------ | ------------------- |
133
+ | `main` / `nightly` | `main` / `nightly` | `>=3.10`, `<=3.13` |
134
+ | `0.6` | `2.8` | `>=3.9`, `<=3.13` |
135
+ | `0.5` | `2.7` | `>=3.9`, `<=3.13` |
136
+ | `0.4` | `2.7` | `>=3.9`, `<=3.13` |
137
+ | `0.3` | `2.7` | `>=3.9`, `<=3.13` |
138
+ | `0.2` | `2.6` | `>=3.9`, `<=3.13` |
139
+ | `0.1` | `2.5` | `>=3.9`, `<=3.12` |
140
+ | `0.0.3` | `2.4` | `>=3.8`, `<=3.12` |
141
+
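If you want to check a pairing programmatically, the table above can be transcribed into a small lookup. This is only a sketch: the dictionary copies the released-version rows as they appear in this README, and `expected_torch` and `python_supported` are hypothetical helpers, not part of TorchCodec.

```python
# Rows copied from the compatibility table above (released versions only).
# Maps a torchcodec release to (torch release, min Python, max Python).
COMPATIBILITY = {
    "0.6": ("2.8", (3, 9), (3, 13)),
    "0.5": ("2.7", (3, 9), (3, 13)),
    "0.4": ("2.7", (3, 9), (3, 13)),
    "0.3": ("2.7", (3, 9), (3, 13)),
    "0.2": ("2.6", (3, 9), (3, 13)),
    "0.1": ("2.5", (3, 9), (3, 12)),
    "0.0.3": ("2.4", (3, 8), (3, 12)),
}


def expected_torch(torchcodec_version):
    """Return the torch release paired with a torchcodec release, or None."""
    entry = COMPATIBILITY.get(torchcodec_version)
    return entry[0] if entry else None


def python_supported(torchcodec_version, python_version):
    """Check a (major, minor, ...) Python version against the table."""
    _, lowest, highest = COMPATIBILITY[torchcodec_version]
    return lowest <= tuple(python_version[:2]) <= highest


print(expected_torch("0.6"))
print(python_supported("0.1", (3, 13)))
```

Passing `sys.version_info` as `python_version` checks the interpreter you are currently running.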
142
+ ### Installing CUDA-enabled TorchCodec
143
+
144
+ First, make sure you have a GPU that has NVDEC hardware that can decode the
145
+ format you want. Refer to Nvidia's GPU support matrix for more details
146
+ [here](https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new).
147
+
148
+ 1. Install PyTorch corresponding to your CUDA Toolkit using the
149
+ [official instructions](https://pytorch.org/get-started/locally/). You'll
150
+ need the `libnpp` and `libnvrtc` CUDA libraries, which are usually part of
151
+ the CUDA Toolkit.
152
+
153
+ 2. Install or compile FFmpeg with NVDEC support.
154
+ TorchCodec with CUDA should work with FFmpeg versions in [4, 7].
155
+
156
+ If FFmpeg is not already installed, or you need a more recent version, an
157
+ easy way to install it is to use `conda`:
158
+
159
+ ```bash
160
+ conda install "ffmpeg<8"
161
+ # or
162
+ conda install "ffmpeg<8" -c conda-forge
163
+ ```
164
+
165
+ If you are building FFmpeg from source you can follow Nvidia's guide to
166
+ configuring and installing FFmpeg with NVDEC support
167
+ [here](https://docs.nvidia.com/video-technologies/video-codec-sdk/12.0/ffmpeg-with-nvidia-gpu/index.html).
168
+
169
+ After installing FFmpeg make sure it has NVDEC support when you list the supported
170
+ decoders:
171
+
172
+ ```bash
173
+ ffmpeg -decoders | grep -i nvidia
174
+ # This should show a line like this:
175
+ # V..... h264_cuvid Nvidia CUVID H264 decoder (codec h264)
176
+ ```
177
+
178
+ To check that FFmpeg libraries work with NVDEC correctly you can decode a sample video:
179
+
180
+ ```bash
181
+ ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i test/resources/nasa_13013.mp4 -f null -
182
+ ```
183
+
184
+ 3. Install TorchCodec by passing in an `--index-url` parameter that corresponds
185
+ to your CUDA Toolkit version, for example:
186
+
187
+ ```bash
188
+ # This corresponds to CUDA Toolkit version 12.6. It should be the same one
189
+ # you used when you installed PyTorch (if you installed PyTorch with pip).
190
+ pip install torchcodec --index-url=https://download.pytorch.org/whl/cu126
191
+ ```
192
+
193
+ Note that without passing in the `--index-url` parameter, `pip` installs
194
+ the CPU-only version of TorchCodec.
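The decoder check from step 2 can also be scripted. The sketch below is a convenience under stated assumptions, not part of TorchCodec: `list_cuvid_decoders` and `ffmpeg_cuvid_decoders` are hypothetical helper names, and the parsing assumes that NVDEC-backed decoder names end in `_cuvid`, as in the `h264_cuvid` line shown above.

```python
import shutil
import subprocess


def list_cuvid_decoders(decoders_output):
    """Extract decoder names ending in `_cuvid` from `ffmpeg -decoders` text."""
    names = []
    for line in decoders_output.splitlines():
        fields = line.split()
        # Decoder lines look like: "V..... h264_cuvid Nvidia CUVID H264 decoder ..."
        if len(fields) >= 2 and fields[1].endswith("_cuvid"):
            names.append(fields[1])
    return names


def ffmpeg_cuvid_decoders():
    """Run `ffmpeg -decoders`; return CUVID decoder names, or None if FFmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-decoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return list_cuvid_decoders(result.stdout)


# Parse the sample line from step 2 rather than requiring a local FFmpeg.
sample = " V..... h264_cuvid           Nvidia CUVID H264 decoder (codec h264)"
print(list_cuvid_decoders(sample))
```

Calling `ffmpeg_cuvid_decoders()` on your machine and getting an empty list (or `None`) suggests that the FFmpeg on your `PATH` is missing or was built without NVDEC support.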
195
+
196
+ ## Benchmark Results
197
+
198
+ The following was generated by running [our benchmark script](./benchmarks/decoders/generate_readme_data.py) on a lightly loaded 22-core machine with an Nvidia A100 with
199
+ 5 [NVDEC decoders](https://docs.nvidia.com/video-technologies/video-codec-sdk/12.1/nvdec-application-note/index.html#).
200
+
201
+ ![benchmark_results](./benchmarks/decoders/benchmark_readme_chart.png)
202
+
203
+ The top row is a [Mandelbrot](https://ffmpeg.org/ffmpeg-filters.html#mandelbrot) video
204
+ generated from FFmpeg that has a resolution of 1280x720 at 60 fps and is 120 seconds long.
205
+ The bottom row is a [promotional video from NASA](https://download.pytorch.org/torchaudio/tutorial-assets/stream-api/NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4)
206
+ that has a resolution of 960x540 at 29.7 fps and is 206 seconds long. Both videos were
207
+ encoded with libx264 and yuv420p pixel format. All decoders, except for TorchVision, used FFmpeg 6.1.2. TorchVision used FFmpeg 4.2.2.
208
+
209
+ For TorchCodec, the "approx" label means that it was using [approximate mode](https://pytorch.org/torchcodec/stable/generated_examples/approximate_mode.html)
210
+ for seeking.
211
+
212
+ ## Contributing
213
+
214
+ We welcome contributions to TorchCodec! Please see our [contributing
215
+ guide](CONTRIBUTING.md) for more details.
216
+
217
+ ## License
218
+
219
+ TorchCodec is released under the [BSD 3-Clause license](./LICENSE).
220
+
221
+ However, TorchCodec may be used with code not written by Meta which may be
222
+ distributed under different licenses.
223
+
224
+ For example, if you build TorchCodec with ENABLE_CUDA=1 or use the CUDA-enabled
225
+ release of torchcodec, please review CUDA's license here:
226
+ [Nvidia licenses](https://docs.nvidia.com/cuda/eula/index.html).