VictorSanh
committed on
Update README.md
README.md
CHANGED
@@ -51,7 +51,8 @@ We release under the Apache 2.0 license 2 checkpoints:
 - **Resources for more information:**
 - Description of [OBELICS](https://huggingface.co/datasets/HuggingFaceM4/OBELICS): [OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
 ](https://huggingface.co/papers/2306.16527)
-- Paper:
+- Paper: [What matters when building vision-language models?
+](https://huggingface.co/papers/2405.02246)
 
 
 # Uses
@@ -439,6 +440,15 @@ The model is built on top of two pre-trained models: [google/siglip-so400m-patch
 archivePrefix={arXiv},
 primaryClass={cs.IR}
 }
+
+@misc{laurençon2024matters,
+title={What matters when building vision-language models?},
+author={Hugo Laurençon and Léo Tronchon and Matthieu Cord and Victor Sanh},
+year={2024},
+eprint={2405.02246},
+archivePrefix={arXiv},
+primaryClass={cs.CV}
+}
 ```
 
 # Acknowledgements