Commit 360a464 (verified), committed by nielsr (HF staff) · 1 parent: 6cb8870

Add/improve model card


This PR improves the model card by adding the `library_name` and `pipeline_tag` metadata. It also links the model card to the paper.
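As a rough way to check the metadata described above, the sketch below reads the YAML front matter from a README string and returns its keys. It is only an illustration: `parse_front_matter` and `README_AFTER` are hypothetical names, the README body is a placeholder, and the hand-rolled parser handles flat `key: value` pairs only. A real YAML parser would be the safer choice for anything more complex.

```python
# Front matter as it appears on the "+" side of this commit's README.md diff.
# The text after the closing delimiter is a placeholder, not the real body.
README_AFTER = """---
license: apache-2.0
library_name: transformers
pipeline_tag: image-to-image
---

(model card body)
"""


def parse_front_matter(text):
    """Return flat key/value pairs from a leading '---' delimited block.

    Hypothetical helper: it does not handle nested YAML, lists, or quoting.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # opening delimiter without a matching close


if __name__ == "__main__":
    for key, value in parse_front_matter(README_AFTER).items():
        print(f"{key} = {value}")
```

Running the same helper on the README before and after the commit would show which keys this PR introduces.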

Files changed (1)
  1. README.md +14 -3
README.md CHANGED
@@ -1,3 +1,14 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: image-to-image
+ ---
+
+ This repository contains the model presented in the paper [UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface](https://hf.co/papers/2503.01342).
+
+ UFO unifies object-level detection, pixel-level segmentation, and image-level vision-language tasks into a single model by transforming all perception targets into the language space. It introduces a novel embedding retrieval approach that relies solely on the language interface to support segmentation tasks.
+
+ For more details, please refer to the original paper and the GitHub repository:
+
+ - Paper: [UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface](https://hf.co/papers/2503.01342)
+ - GitHub: [https://github.com/nnnth/UFO](https://github.com/nnnth/UFO)