Improve model card and add metadata
#1 opened by nielsr

README.md CHANGED
---
datasets:
- reasonseg
language: en
license: other
pipeline_tag: image-segmentation
library_name: transformers
tags:
- vision
- segmentation
---

# Seg-Zero-7B

This model accompanies the paper [Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement](https://huggingface.co/papers/2503.06520). Seg-Zero is trained via reinforcement learning with GRPO, without explicit reasoning data, leading to robust zero-shot generalization and emergent test-time reasoning.

Code: https://github.com/dvlab-research/Seg-Zero

## Description

Seg-Zero-7B uses a decoupled architecture consisting of a reasoning model and a segmentation model. The reasoning model interprets the user's intention, generates an explicit reasoning chain, and produces positional prompts, which the segmentation model then uses to generate pixel-level masks.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Ricky06662/Seg-Zero-7B")
tokenizer = AutoTokenizer.from_pretrained("Ricky06662/Seg-Zero-7B")
```
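The reasoning model emits its chain of thought and a positional prompt as plain text, which the segmentation stage then consumes. Below is a minimal sketch of parsing such an output, assuming a hypothetical format in which the final answer is a JSON object wrapped in `<answer>` tags; the actual output format is defined by the Seg-Zero code, not by this card:

```python
import json
import re

def parse_positional_prompt(output: str) -> dict:
    """Extract the positional prompt (bbox, points) from the reasoning
    model's text output. Assumes a hypothetical <think>...</think>
    <answer>{...}</answer> format for illustration only."""
    match = re.search(r"<answer>\s*(\{.*?\})\s*</answer>", output, re.DOTALL)
    if match is None:
        raise ValueError("no <answer> block found in model output")
    return json.loads(match.group(1))

raw = (
    "<think>The unusual object is the red umbrella on the left.</think>"
    '<answer>{"bbox": [12, 40, 180, 220], "points": [[90, 130], [60, 100]]}</answer>'
)
prompt = parse_positional_prompt(raw)
print(prompt["bbox"])  # [12, 40, 180, 220]
```

A parsed dictionary like this is what a SAM-style segmentation model would take as its box and point prompts.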

## Installation

```bash
git clone https://github.com/dvlab-research/Seg-Zero.git
cd Seg-Zero
conda create -n seg_zero python=3.11
conda activate seg_zero
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
pip install -e .
pip install sam2
pip install matplotlib
```

## Inference

```bash
python inference_scripts/infer.py
```

The default question is:

> "the unusual object in the image."

The thinking process is printed to the command line, and the mask is saved in the **inference_scripts** folder. You can also provide your own `image_path` and `text`:

```bash
python inference_scripts/infer.py --image_path "your_image_path" --text "your question text"
```
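The positional prompt is expressed at the resolution the reasoning model saw, so before it is handed to the segmentation model it generally has to be mapped back to the original image resolution. A generic, self-contained helper for that step (an illustrative sketch; the actual Seg-Zero inference scripts handle this internally, and the resolutions below are made up):

```python
def rescale_prompt(bbox, points, model_size, orig_size):
    """Rescale a bbox [x1, y1, x2, y2] and click points [[x, y], ...]
    from the reasoning model's input resolution to the original image size."""
    sx = orig_size[0] / model_size[0]  # horizontal scale factor
    sy = orig_size[1] / model_size[1]  # vertical scale factor
    x1, y1, x2, y2 = bbox
    scaled_bbox = [x1 * sx, y1 * sy, x2 * sx, y2 * sy]
    scaled_points = [[x * sx, y * sy] for x, y in points]
    return scaled_bbox, scaled_points

# Example: prompt predicted on a hypothetical 840x840 model input,
# original image is 1680x1260.
bbox, points = rescale_prompt([100, 200, 400, 600], [[250, 400]],
                              model_size=(840, 840), orig_size=(1680, 1260))
print(bbox)  # [200.0, 300.0, 800.0, 900.0]
```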
|