---
license: mit
pipeline_tag: depth-estimation
---

## Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

This software project accompanies the research paper:
**[Depth Pro: Sharp Monocular Metric Depth in Less Than a Second](https://arxiv.org/abs/2410.02073)**,
*Aleksei Bochkovskii, Amaël Delaunoy, Hugo Germain, Marcel Santos, Yichao Zhou, Stephan R. Richter, and Vladlen Koltun*.

![](data/depth-pro-teaser.jpg)

We present a foundation model for zero-shot metric monocular depth estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high-frequency details. The predictions are metric, with absolute scale, without relying on the availability of metadata such as camera intrinsics. And the model is fast, producing a 2.25-megapixel depth map in 0.3 seconds on a standard GPU. These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision transformer for dense prediction, a training protocol that combines real and synthetic datasets to achieve high metric accuracy alongside fine boundary tracing, dedicated evaluation metrics for boundary accuracy in estimated depth maps, and state-of-the-art focal length estimation from a single image.

The model in this repository is a reference implementation, which has been re-trained. Its performance is close to the model reported in the paper but does not match it exactly.

## Getting Started

We recommend setting up a virtual environment. Using e.g. miniconda, the `depth_pro` package can be installed via:

```bash
conda create -n depth-pro -y python=3.9
conda activate depth-pro

pip install -e .
```

To download pretrained checkpoints, follow the code snippet below:
```bash
source get_pretrained_models.sh   # Files will be downloaded to the `checkpoints` directory.
```

### Running from the command line

We provide a helper script to run the model directly on a single image:
```bash
# Run prediction on a single image:
depth-pro-run -i ./data/example.jpg
# Run `depth-pro-run -h` for available options.
```
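
If you want to process several images, one option is to call the same CLI from a short script. The sketch below is illustrative rather than part of the package: it assumes your images live in `./data` and relies only on the `-i` flag shown above (check `depth-pro-run -h` for batching or output options).

```python
# Illustrative batch driver (not part of `depth_pro`): invoke the CLI once per image file.
import pathlib
import subprocess

for image_path in sorted(pathlib.Path("./data").glob("*.jpg")):
    subprocess.run(["depth-pro-run", "-i", str(image_path)], check=True)
```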

### Running from Python

```python
from PIL import Image
import depth_pro

# Load model and preprocessing transform.
model, transform = depth_pro.create_model_and_transforms()
model.eval()

# Load and preprocess an image.
image_path = "./data/example.jpg"
image, _, f_px = depth_pro.load_rgb(image_path)
image = transform(image)

# Run inference.
prediction = model.infer(image, f_px=f_px)
depth = prediction["depth"]  # Depth in [m].
focallength_px = prediction["focallength_px"]  # Focal length in pixels.
```
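
The returned `depth` map can be saved or visualized with standard tooling. The following is a minimal sketch, not part of the `depth_pro` API; it assumes the prediction is either a torch tensor or a NumPy-compatible array and writes a normalized inverse-depth image (nearer objects appear brighter):

```python
import numpy as np
from PIL import Image

# Convert to a NumPy array; `.detach().cpu()` covers the case where `depth` is a torch tensor.
depth_np = depth.detach().cpu().numpy() if hasattr(depth, "detach") else np.asarray(depth)

# Normalize inverse depth to [0, 1] and save as an 8-bit grayscale PNG.
inv_depth = 1.0 / np.clip(depth_np, 1e-3, None)
inv_norm = (inv_depth - inv_depth.min()) / max(inv_depth.max() - inv_depth.min(), 1e-6)
Image.fromarray((inv_norm * 255).astype(np.uint8)).save("depth_map.png")
```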

### Evaluation (boundary metrics)

Our boundary metrics can be found under `eval/boundary_metrics.py` and used as follows:

```python
from eval.boundary_metrics import SI_boundary_F1, SI_boundary_Recall

# For a depth-based dataset:
boundary_f1 = SI_boundary_F1(predicted_depth, target_depth)

# For a mask-based dataset (image matting / segmentation):
boundary_recall = SI_boundary_Recall(predicted_depth, target_mask)
```
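
For dataset-level numbers, one straightforward (hypothetical) approach is to average the per-image scores; the loop below assumes `predicted_depths` and `target_depths` are placeholder names for matched lists of `HxW` arrays:

```python
import numpy as np

from eval.boundary_metrics import SI_boundary_F1

# Hypothetical aggregation (`predicted_depths` / `target_depths` are placeholders):
# mean boundary F1 over matched prediction / ground-truth pairs.
scores = [SI_boundary_F1(pred, gt) for pred, gt in zip(predicted_depths, target_depths)]
print(f"Mean SI-boundary F1: {np.mean(scores):.3f}")
```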

## Citation

If you find our work useful, please cite the following paper:

```bibtex
@article{Bochkovskii2024:arxiv,
  author  = {Aleksei Bochkovskii and Ama\"{e}l Delaunoy and Hugo Germain and Marcel Santos and
             Yichao Zhou and Stephan R. Richter and Vladlen Koltun},
  title   = {Depth Pro: Sharp Monocular Metric Depth in Less Than a Second},
  journal = {arXiv},
  year    = {2024},
  url     = {https://arxiv.org/abs/2410.02073},
}
```

## License
This sample code is released under the [LICENSE](LICENSE) terms.

The model weights are released under the [LICENSE](LICENSE) terms.

## Acknowledgements

Our codebase is built using multiple open-source contributions; please see [Acknowledgements](ACKNOWLEDGEMENTS.md) for more details.

Please check the paper for a complete list of references and datasets used in this work.