---
license: apache-2.0
datasets:
- 2toINF/IVM-Mix-1M
base_model:
- llava-hf/llava-1.5-7b-hf
---

## Quick Start

### Install

1. Clone this repository and navigate to the IVM folder

```bash
git clone https://github.com/2toinf/IVM.git
cd IVM
```

2. Install the package

```bash
conda create -n IVM python=3.10 -y
conda activate IVM
pip install -e .
```

### Usage

```python
import numpy as np
from matplotlib import pyplot as plt
from PIL import Image  # required for Image.open below

from IVM import load, forward_batch

ckpt_path = "IVM-V1.0.bin"  # your model path here
# Set `low_gpu_memory=True` if you don't have enough GPU memory
model = load(ckpt_path, low_gpu_memory=False)

image = Image.open("image/demo/robot.jpg")  # your image path
instruction = "pick up the red cup and place it on the green pan"

# forward_batch takes a list of images and a matching list of instructions
result = forward_batch(model, [image], [instruction], threshold=0.99)

# Display the first result
plt.imshow(result[0].astype(np.uint8))
plt.show()
```
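Because `forward_batch` accepts lists, many image/instruction pairs can be processed in fixed-size chunks to keep GPU memory bounded. The helper below is a minimal sketch, not part of the IVM package: `run_in_chunks`, `infer_fn`, and `chunk_size` are hypothetical names, and it assumes the inference call returns one result per input, in order, as the `result[0]` indexing above suggests.

```python
from typing import Any, Callable, List, Sequence


def run_in_chunks(
    infer_fn: Callable[[Sequence[Any], Sequence[str]], Sequence[Any]],
    images: Sequence[Any],
    instructions: Sequence[str],
    chunk_size: int = 8,
) -> List[Any]:
    """Call infer_fn on consecutive chunks and concatenate the results.

    Hypothetical helper: infer_fn is assumed to return one result per
    (image, instruction) pair, in the same order as its inputs.
    """
    if len(images) != len(instructions):
        raise ValueError("images and instructions must have the same length")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    results: List[Any] = []
    for start in range(0, len(images), chunk_size):
        stop = start + chunk_size
        # Slice both sequences with the same bounds so pairs stay aligned.
        results.extend(infer_fn(images[start:stop], instructions[start:stop]))
    return results


# Example wiring (assumes `model`, `images`, and `instructions` already exist):
# results = run_in_chunks(
#     lambda imgs, texts: forward_batch(model, list(imgs), list(texts), threshold=0.99),
#     images,
#     instructions,
#     chunk_size=4,
# )
```

Passing the inference call in as a function keeps the helper independent of the model, so it can be tried with a stub before loading a checkpoint.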

## Citation

```bibtex
@misc{zheng2024instructionguided,
  title={Instruction-Guided Visual Masking},
  author={Jinliang Zheng and Jianxiong Li and Sijie Cheng and Yinan Zheng and Jiaming Li and Jihao Liu and Yu Liu and Jingjing Liu and Xianyuan Zhan},
  year={2024},
  eprint={2405.19783},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```