Updated README.md (#3)
Updated README.md (b748ec5e5cd4b9f525a94cc544f0be34bbcc22e2)
Co-authored-by: Iurii Makarov <[email protected]>

README.md CHANGED
---
license: cc-by-nc-4.0
tags:
- CoTracker
- vision
- cotracker
---

# Point tracking with CoTracker3

**CoTracker3** is a fast transformer-based model that was introduced in [CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos](https://arxiv.org/abs/2410.11831).
It can track any point in a video and brings some of the benefits of Optical Flow to point tracking.
You can read more about the paper on our [webpage](https://cotracker3.github.io/). The code is available [here](https://github.com/facebookresearch/co-tracker).

CoTracker can track:

- **Any pixel** in a video
- A **quasi-dense** set of pixels together
- Points that are manually selected or sampled on a grid in any video frame (see the query-point example below)

## How to use

Here is how to use this model in **offline mode**:

First install a video reader: `pip install imageio[ffmpeg]`, then:
```python
import torch
import imageio.v3 as iio

# Download the demo video
url = 'https://github.com/facebookresearch/co-tracker/raw/refs/heads/main/assets/apple.mp4'
frames = iio.imread(url, plugin="FFMPEG")  # plugin="pyav" also works

device = 'cuda'
grid_size = 10  # track points sampled on a grid_size x grid_size grid

# (T, H, W, C) uint8 frames -> (B, T, C, H, W) float tensor
video = torch.tensor(frames).permute(0, 3, 1, 2)[None].float().to(device)  # B T C H W

# Run Offline CoTracker:
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline").to(device)
pred_tracks, pred_visibility = cotracker(video, grid_size=grid_size)  # B T N 2, B T N 1
```
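
By default the tracked points are sampled on a regular grid, but, as noted in the list above, you can also select them manually. Here is a minimal sketch following the `queries` interface from the co-tracker repo; the frame indices and pixel coordinates below are arbitrary example values:

```python
# Track manually selected points instead of a grid.
# Each query is (frame_index, x, y): the frame where tracking of the
# point starts, followed by its pixel coordinates in that frame.
# The values below are arbitrary examples, not meaningful locations.
queries = torch.tensor([
    [0., 400., 350.],   # track this point starting from the first frame
    [10., 600., 500.],  # and this one starting from frame 10
]).to(device)

pred_tracks, pred_visibility = cotracker(video, queries=queries[None])  # B T N 2, B T N 1
```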

And here is how to use it in **online mode**:
```python
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_online").to(device)

# Run Online CoTracker, the same model with a different API:
# Initialize online processing on the first chunk
cotracker(video_chunk=video, is_first_step=True, grid_size=grid_size)

# Process the video in sliding windows of 2 * cotracker.step frames,
# advancing by cotracker.step frames at each iteration
for ind in range(0, video.shape[1] - cotracker.step, cotracker.step):
    pred_tracks, pred_visibility = cotracker(
        video_chunk=video[:, ind : ind + cotracker.step * 2]
    )  # B T N 2, B T N 1
```
Online processing is more memory-efficient, since only the current window of frames needs to be kept in memory, which makes it possible to process longer videos or to track in real time as new frames arrive.
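
In both modes you get back per-frame point coordinates and visibility flags. To render them on top of the video, the co-tracker repo provides a `Visualizer` utility; here is a minimal sketch, assuming you have installed the package itself (for example with `pip install git+https://github.com/facebookresearch/co-tracker.git`) rather than only loading the model through `torch.hub`:

```python
# Overlay the predicted tracks on the input video and save the result
# as a video file; the save directory below is an arbitrary example.
from cotracker.utils.visualizer import Visualizer

vis = Visualizer(save_dir="./saved_videos", pad_value=120, linewidth=3)
vis.visualize(video, pred_tracks, pred_visibility)
```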

## BibTeX entry and citation info

```bibtex
@inproceedings{karaev24cotracker3,
  title     = {CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos},
  author    = {Nikita Karaev and Iurii Makarov and Jianyuan Wang and Natalia Neverova and Andrea Vedaldi and Christian Rupprecht},
  booktitle = {Proc. {arXiv:2410.11831}},
  year      = {2024}
}
```