Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ LLaVA-v1.5 mixed trained with SoM style data (QA+listing).

The model can understand tag-style visual prompts on the image (e.g., "what is the object tagged with id 9?") and also shows improved performance on MLLM benchmarks (POPE, MME, SEED, MM-Vet, LLaVA-Wild), even when the input test images have no tags.

-**
+**For more information about SoM-LLaVA, check out our [GitHub page](https://github.com/zzxslp/SoM-LLaVA) and [paper](https://arxiv.org/abs/2404.16375)!**

## Getting Started

If you would like to load our model with Hugging Face, here is an example script:
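Below is a minimal sketch of such a script, assuming the checkpoint is published in the Hugging Face `transformers` LLaVA-1.5 format; the `model_id` and the image URL are placeholders, not values taken from this repo.

```python
# Minimal sketch: load a SoM-LLaVA checkpoint via the standard transformers LLaVA-1.5 interface.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "zzxslp/som-llava-v1.5-13b-hf"  # placeholder: replace with this repo's id on the Hub

model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id)

# LLaVA-1.5 style prompt; the tag id refers to a SoM tag drawn on the image.
prompt = "USER: <image>\nWhat is the object tagged with id 9? ASSISTANT:"
url = "https://example.com/som_tagged_image.jpg"  # hypothetical URL of an image with SoM tags
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt").to("cuda", torch.float16)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```

The same script also works on images without tags, since the model retains standard LLaVA-1.5 behavior in that setting.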