Improve model card and add metadata
#1 opened by nielsr

README.md CHANGED
---
datasets:
- reasonseg
language: en
license: other
pipeline_tag: image-segmentation
library_name: transformers
tags:
- vision
- segmentation
---

# Seg-Zero-7B

This model accompanies the paper [Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement](https://huggingface.co/papers/2503.06520). Seg-Zero is trained via reinforcement learning with GRPO, without explicit reasoning data, leading to robust zero-shot generalization and emergent test-time reasoning.

Code: https://github.com/dvlab-research/Seg-Zero

## Description

Seg-Zero-7B uses a decoupled architecture consisting of a reasoning model and a segmentation model. The reasoning model interprets the user's intention, generates an explicit reasoning chain, and produces positional prompts, which the segmentation model then uses to generate pixel-level masks.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Ricky06662/Seg-Zero-7B")
tokenizer = AutoTokenizer.from_pretrained("Ricky06662/Seg-Zero-7B")
```
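The reasoning model emits its chain of thought and a positional prompt as plain text, which the segmentation stage then consumes. Below is a minimal sketch of parsing such an output, assuming a hypothetical format in which the final answer is a JSON object wrapped in `<answer>` tags; the actual output format is defined by the Seg-Zero code, not by this card:

```python
import json
import re

def parse_positional_prompt(output: str) -> dict:
    """Extract the positional prompt (bbox, points) from the reasoning
    model's text output. Assumes a hypothetical <think>...</think>
    <answer>{...}</answer> format for illustration only."""
    match = re.search(r"<answer>\s*(\{.*?\})\s*</answer>", output, re.DOTALL)
    if match is None:
        raise ValueError("no <answer> block found in model output")
    return json.loads(match.group(1))

raw = (
    "<think>The unusual object is the red umbrella on the left.</think>"
    '<answer>{"bbox": [12, 40, 180, 220], "points": [[90, 130], [60, 100]]}</answer>'
)
prompt = parse_positional_prompt(raw)
print(prompt["bbox"])  # [12, 40, 180, 220]
```

A parsed dictionary like this is what a SAM-style segmentation model would take as its box and point prompts.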

## Installation

```bash
git clone https://github.com/dvlab-research/Seg-Zero.git
cd Seg-Zero
conda create -n seg_zero python=3.11
conda activate seg_zero
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
pip install -e .
pip install sam2
pip install matplotlib
```

## Inference

```bash
python inference_scripts/infer.py
```

The default question is:

> "the unusual object in the image."

The thinking process is printed to the command line, and the mask is saved in the **inference_scripts** folder. You can also provide your own `image_path` and `text`:

```bash
python inference_scripts/infer.py --image_path "your_image_path" --text "your question text"
```
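The positional prompt is expressed at the resolution the reasoning model saw, so before it is handed to the segmentation model it generally has to be mapped back to the original image resolution. A generic, self-contained helper for that step (an illustrative sketch; the actual Seg-Zero inference scripts handle this internally, and the resolutions below are made up):

```python
def rescale_prompt(bbox, points, model_size, orig_size):
    """Rescale a bbox [x1, y1, x2, y2] and click points [[x, y], ...]
    from the reasoning model's input resolution to the original image size."""
    sx = orig_size[0] / model_size[0]  # horizontal scale factor
    sy = orig_size[1] / model_size[1]  # vertical scale factor
    x1, y1, x2, y2 = bbox
    scaled_bbox = [x1 * sx, y1 * sy, x2 * sx, y2 * sy]
    scaled_points = [[x * sx, y * sy] for x, y in points]
    return scaled_bbox, scaled_points

# Example: prompt predicted on a hypothetical 840x840 model input,
# original image is 1680x1260.
bbox, points = rescale_prompt([100, 200, 400, 600], [[250, 400]],
                              model_size=(840, 840), orig_size=(1680, 1260))
print(bbox)  # [200.0, 300.0, 800.0, 900.0]
```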
|