Manli committed
Commit 4781bd1 (verified) · 1 Parent(s): 1195d8a

Update README.md

Files changed (1): README.md (+12 -10)
README.md CHANGED
@@ -10,7 +10,7 @@ pipeline_tag: image-text-to-text
 `xGen-MM` is a series of the latest foundational Large Multimodal Models (LMMs) developed by Salesforce AI Research. This series advances upon the successful designs of the `BLIP` series, incorporating fundamental enhancements that ensure a more robust and superior foundation. These models have been trained at scale on high-quality image caption datasets and interleaved image-text data.
 
 In the v1.5 (08/2024) release, we present a series of XGen-MM models including:
- - [🤗 xGen-MM-instruct-interleave (our main instruct model)](https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-multi-r-v1.5): `xgen-mm-phi3-mini-instruct-interleave-r-v1.5`
+ - [🤗 xGen-MM-instruct-interleave (our main instruct model)](https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5): `xgen-mm-phi3-mini-instruct-interleave-r-v1.5`
   - This model has higher overall scores than [xGen-MM-instruct](https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-singleimg-r-v1.5) on both single-image and multi-image benchmarks.
 - [🤗 xGen-MM-base](https://huggingface.co/Salesforce/xgen-mm-phi3-mini-base-r-v1.5): `xgen-mm-phi3-mini-base-r-v1.5`
 - [🤗 xGen-MM-instruct](https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-singleimg-r-v1.5): `xgen-mm-phi3-mini-instruct-singleimg-r-v1.5`
@@ -65,14 +65,14 @@ The instruct model is fine-tuned on a mixture of around 1 million samples from m
 
 <p>
 <figure class="half">
- <a href="./examples/example-1.png"><img src="./examples/example-1.png"></a>
- <a href="./examples/example-2.png"><img src="./examples/example-2.png"></a>
+ <a href="https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5/blob/main/examples/example-1.png"><img src="./examples/example-1.png"></a>
+ <a href="https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5/blob/main/examples/example-2.png"><img src="./examples/example-2.png"></a>
 </figure>
 </p>
 
 <p>
 <figure>
- <a href="./examples/sft-examples.png"><img src="./examples/sft-examples.png"></a>
+ <a href="https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5/blob/main/examples/sft-examples.png"><img src="./examples/sft-examples.png"></a>
 </figure>
 </p>
 
@@ -105,12 +105,14 @@ We thank the authors for their open-source implementations.
 
 # Citation
 ```
- @article{blip3-xgenmm,
-   author = {Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, Shrikant Kendre, Jieyu Zhang, Can Qin, Shu Zhang, Chia-Chih Chen, Ning Yu, Juntao Tan, Tulika Manoj Awalgaonkar, Shelby Heinecke, Huan Wang, Yejin Choi, Ludwig Schmidt, Zeyuan Chen, Silvio Savarese, Juan Carlos Niebles, Caiming Xiong, Ran Xu},
-   title = {xGen-MM(BLIP-3): A Family of Open Large Multimodal Models},
-   journal = {arXiv preprint},
-   month = {August},
-   year = {2024},
+ @misc{blip3-xgenmm,
+   author = {Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, Shrikant Kendre, Jieyu Zhang, Can Qin, Shu Zhang, Chia-Chih Chen, Ning Yu, Juntao Tan, Tulika Manoj Awalgaonkar, Shelby Heinecke, Huan Wang, Yejin Choi, Ludwig Schmidt, Zeyuan Chen, Silvio Savarese, Juan Carlos Niebles, Caiming Xiong, Ran Xu},
+   title = {BLIP-3: A Family of Open Large Multimodal Models},
+   year = {2024},
+   eprint = {2408.08872},
+   archivePrefix = {arXiv},
+   primaryClass = {cs.CV},
+   url = {https://arxiv.org/abs/2408.08872},
 }
 ```
 
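
For context on the checkpoint this commit links to, below is a minimal, hypothetical sketch of loading `Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5` with Hugging Face `transformers`. It is not taken from the commit or the README; the specific Auto* classes and the need for `trust_remote_code=True` are assumptions based on the repository shipping custom modeling code, so the usage section of the model card itself is authoritative.

```python
# Hypothetical loading sketch (not part of this commit). Assumes the checkpoint
# exposes its custom architecture through the transformers Auto* classes, which
# requires trust_remote_code=True to run code downloaded from the repository.
from transformers import AutoImageProcessor, AutoModelForVision2Seq, AutoTokenizer

MODEL_ID = "Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5"

# Download config, weights, and the repo's custom modeling/processing code.
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
image_processor = AutoImageProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

model.eval()  # inference mode; move to GPU with model.to("cuda") if available
```

Prompt formatting and interleaved-image preprocessing are model-specific and are documented in the README that this commit updates.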