---
license: cc-by-4.0
language:
- en
pipeline_tag: image-segmentation
library_name: zim-anything
tags:
- matting
- segmentation
- segment anything
- zero-shot matting
---

# ZIM-Anything-ViTL

## Introduction

🚀 Introducing ZIM: Zero-Shot Image Matting, a Step Beyond SAM! 🚀

While SAM (Segment Anything Model) has redefined zero-shot segmentation with broad applications across multiple fields, it often falls short in delivering high-precision, fine-grained masks. That's where ZIM comes in.

🌟 What is ZIM? 🌟

ZIM (Zero-Shot Image Matting) is a groundbreaking model developed to set a new standard in precision matting while maintaining strong zero-shot capabilities. Like SAM, ZIM can generalize across diverse datasets and objects in a zero-shot paradigm. But ZIM goes beyond, delivering highly accurate, fine-grained masks that capture intricate details.

πŸ” Get Started with ZIM πŸ”

Ready to elevate your AI projects with unmatched matting quality? Access ZIM on our [project page](https://naver-ai.github.io/ZIM/), [arXiv](https://huggingface.co/papers/2411.00626), and [GitHub](https://github.com/naver-ai/ZIM).

## Installation

```bash
pip install zim_anything
```

Alternatively, install from source:

```bash
git clone https://github.com/naver-ai/ZIM.git
cd ZIM; pip install -e .
```


## Usage

1. Make the directory `zim_vit_l_2092`.
2. Download the [encoder](https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx?download=true) weight and [decoder](https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx?download=true) weight.
3. Put them under the `zim_vit_l_2092` directory.
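
The three steps above can also be scripted. The following is a minimal sketch that uses only the Python standard library and the two download URLs listed in step 2; the helper names `weight_urls` and `download_weights` are hypothetical and not part of `zim_anything`.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and directory name are taken from the download links above.
BASE_URL = "https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main"
CKPT_DIR = "zim_vit_l_2092"
WEIGHT_FILES = ("encoder.onnx", "decoder.onnx")


def weight_urls(base_url=BASE_URL, ckpt_dir=CKPT_DIR):
    """Map each weight file name to its download URL (hypothetical helper)."""
    return {
        name: f"{base_url}/{ckpt_dir}/{name}?download=true"
        for name in WEIGHT_FILES
    }


def download_weights(target_dir=CKPT_DIR):
    """Create the checkpoint directory and fetch any weight file not yet present."""
    out_dir = Path(target_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, url in weight_urls().items():
        dest = out_dir / name
        if not dest.exists():
            # Network access is required here; files already on disk are skipped.
            urlretrieve(url, str(dest))
    return out_dir
```

Calling `download_weights()` once from the working directory should leave `encoder.onnx` and `decoder.onnx` under `zim_vit_l_2092`, which is the path passed as `checkpoint` below.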

```python
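import torch
from zim_anything import zim_model_registry, ZimPredictor

backbone = "vit_l"
# Directory that holds encoder.onnx and decoder.onnx (see the steps above)
ckpt_p = "zim_vit_l_2092"

# Build the model from the downloaded weights; use the GPU when available
model = zim_model_registry[backbone](checkpoint=ckpt_p)
if torch.cuda.is_available():
    model.cuda()

predictor = ZimPredictor(model)
# <image> and <input_prompts> are placeholders: substitute your own image and prompts
predictor.set_image(<image>)
masks, _, _ = predictor.predict(<input_prompts>)
```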
```
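
A matte is typically used to composite the extracted object onto a new background. The sketch below shows standard alpha compositing with NumPy. It is not part of `zim_anything`: the function name `composite` is hypothetical, and it assumes a single-object matte of shape `(H, W)` with values in `[0, 1]` and `uint8` RGB images of shape `(H, W, 3)`. The exact shape and value range returned by `predictor.predict` are not stated here, so check them and rescale or index the result first if needed.

```python
import numpy as np


def composite(image, matte, background):
    """Blend `image` over `background` using a soft matte (hypothetical helper).

    Assumes `matte` is (H, W) or (H, W, 1) with values in [0, 1], and that
    `image` and `background` are (H, W, 3) arrays with values in [0, 255].
    """
    alpha = np.clip(np.asarray(matte, dtype=np.float32), 0.0, 1.0)
    if alpha.ndim == 2:
        alpha = alpha[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over RGB
    fg = np.asarray(image, dtype=np.float32)
    bg = np.asarray(background, dtype=np.float32)
    # Per-pixel blend: alpha * foreground + (1 - alpha) * background
    out = alpha * fg + (1.0 - alpha) * bg
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Because the matte is soft rather than binary, partially transparent regions such as hair or fur blend with the new background instead of being cut off at a hard edge.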

## Citation

If you find this project useful, please consider citing:

```bibtex
@article{kim2024zim,
  title={ZIM: Zero-Shot Image Matting for Anything},
  author={Kim, Beomyoung and Shin, Chanyong and Jeong, Joonhyun and Jung, Hyungsik and Lee, Se-Yun and Chun, Sewhan and Hwang, Dong-Hyun and Yu, Joonsang},
  journal={arXiv preprint arXiv:2411.00626},
  year={2024}
}
```