Depth Pro checkpoint
- README.md +85 -0
- depth_pro.pt +3 -0
README.md
ADDED
@@ -0,0 +1,85 @@
---
license: apple-ascl
---

# Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

We present a foundation model for zero-shot metric monocular depth estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high-frequency details. The predictions are metric, with absolute scale, without relying on the availability of metadata such as camera intrinsics. And the model is fast, producing a 2.25-megapixel depth map in 0.3 seconds on a standard GPU. These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision transformer for dense prediction, a training protocol that combines real and synthetic datasets to achieve high metric accuracy alongside fine boundary tracing, dedicated evaluation metrics for boundary accuracy in estimated depth maps, and state-of-the-art focal length estimation from a single image.

Depth Pro was introduced in **Depth Pro: Sharp Monocular Metric Depth in Less Than a Second**, by *Aleksei Bochkovskii, Amaël Delaunoy, Hugo Germain, Marcel Santos, Yichao Zhou, Stephan R. Richter, and Vladlen Koltun*.

The checkpoint in this repository is a reference implementation, which has been re-trained. Its performance is close to the model reported in the paper but does not match it exactly.

## How to Use

Please follow the steps in the [code repository](https://github.com/apple/ml-depth-pro) to set up your environment. Then download the checkpoint from the _Files and versions_ tab above, or use the `huggingface-hub` CLI:

```bash
pip install huggingface-hub
huggingface-cli download --local-dir checkpoints pcuenq/Depth-Pro
```

### Running from the command line

The code repo provides a helper script to run the model on a single image:

```bash
# Run prediction on a single image:
depth-pro-run -i ./data/example.jpg
# Run `depth-pro-run -h` for available options.
```

### Running from Python

```python
from PIL import Image
import depth_pro

# Load model and preprocessing transform
model, transform = depth_pro.create_model_and_transforms()
model.eval()

# Load and preprocess an image.
image_path = "./data/example.jpg"  # Replace with the path to your own image.
image, _, f_px = depth_pro.load_rgb(image_path)
image = transform(image)

# Run inference.
prediction = model.infer(image, f_px=f_px)
depth = prediction["depth"]  # Depth in [m].
focallength_px = prediction["focallength_px"]  # Focal length in pixels.
```
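
The two outputs above, metric depth and a focal length in pixels, are what a pinhole camera model needs to place a pixel in 3D. The following is a minimal sketch of that relationship, not code from the Depth Pro repository: the helper names `fov_deg_from_focal` and `backproject` are hypothetical, and the sketch assumes square pixels and a principal point at the image center.

```python
import math


def fov_deg_from_focal(focal_px, width_px):
    """Horizontal field of view in degrees for a pinhole camera."""
    return math.degrees(2.0 * math.atan(0.5 * width_px / focal_px))


def backproject(u, v, depth_m, focal_px, width_px, height_px):
    """Map pixel (u, v) with metric depth to a 3D point (x, y, z) in meters.

    Assumption: square pixels and a principal point at the image center.
    """
    cx, cy = 0.5 * width_px, 0.5 * height_px
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)
```

With real predictions, `depth_m` would be one value read from `depth` and `focal_px` the value of `focallength_px`, both converted to plain Python floats first.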

### Evaluation (boundary metrics)

Boundary metrics are implemented in `eval/boundary_metrics.py` and can be used as follows:

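```python
# Assumed import path, inferred from the file location; adjust it to your checkout.
from depth_pro.eval.boundary_metrics import SI_boundary_F1, SI_boundary_Recall

# `predicted_depth`, `target_depth`, and `target_mask` are assumed to be loaded already.

# for a depth-based dataset
boundary_f1 = SI_boundary_F1(predicted_depth, target_depth)

# for a mask-based dataset (image matting / segmentation)
boundary_recall = SI_boundary_Recall(predicted_depth, target_mask)
```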
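
To build intuition for why a boundary score can be scale-invariant, here is a deliberately simplified toy, not the implementation in `eval/boundary_metrics.py`. It assumes strictly positive depths, looks only at horizontally adjacent pixels, and uses a single ratio threshold; the names `ratio_edges` and `toy_boundary_f1` and the default threshold are made up for this illustration.

```python
def ratio_edges(depth_rows, threshold=1.25):
    """Return {(row, col)} where a pixel and its right neighbor differ by more
    than `threshold` as a depth ratio. Ratios ignore any global scale factor."""
    edges = set()
    for r, row in enumerate(depth_rows):
        for c in range(len(row) - 1):
            near, far = sorted((row[c], row[c + 1]))
            if far / near > threshold:
                edges.add((r, c))
    return edges


def toy_boundary_f1(pred_rows, target_rows, threshold=1.25):
    """F1 overlap between edges found in the prediction and in the target."""
    pred = ratio_edges(pred_rows, threshold)
    target = ratio_edges(target_rows, threshold)
    hits = len(pred & target)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(target)
    return 2 * precision * recall / (precision + recall)
```

Because edges are defined by ratios, multiplying every predicted depth by a constant leaves the score unchanged, while an over-smoothed prediction that blurs a depth step loses the edge and scores lower.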


## Citation

If you find our work useful, please cite the following paper:

```bibtex
@article{Bochkovskii2024:arxiv,
  author     = {Aleksei Bochkovskii and Ama\"{e}l Delaunoy and Hugo Germain and Marcel Santos and
               Yichao Zhou and Stephan R. Richter and Vladlen Koltun},
  title      = {Depth Pro: Sharp Monocular Metric Depth in Less Than a Second},
  journal    = {arXiv},
  year       = {2024},
}
```

## Acknowledgements

Our codebase builds on multiple open-source contributions; please see [Acknowledgements](https://github.com/apple/ml-depth-pro/blob/main/ACKNOWLEDGEMENTS.md) for more details.

Please check the paper for a complete list of references and datasets used in this work.
depth_pro.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3eb35ca68168ad3d14cb150f8947a4edf85589941661fdb2686259c80685c0ce
size 1904446787
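
The Git LFS pointer above records the SHA-256 digest and byte size of `depth_pro.pt`, so a downloaded checkpoint can be checked against it. Here is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_pointer` are hypothetical, and the expected values are copied from the pointer.

```python
import hashlib
from pathlib import Path

# Values copied from the Git LFS pointer for depth_pro.pt.
EXPECTED_SHA256 = "3eb35ca68168ad3d14cb150f8947a4edf85589941661fdb2686259c80685c0ce"
EXPECTED_SIZE = 1904446787  # bytes


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a multi-gigabyte checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Return True if the file has the expected size and SHA-256 digest."""
    path = Path(path)
    # Compare the size first: it is cheap and short-circuits the hash on a mismatch.
    return path.stat().st_size == expected_size and sha256_of(path) == expected_sha256
```

Assuming the checkpoint was downloaded with `--local-dir checkpoints`, a call such as `matches_pointer("checkpoints/depth_pro.pt")` would confirm that the file is complete and uncorrupted.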