Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ pipeline_tag: feature-extraction
 **VisRAG** is a novel vision-language model (VLM)-based RAG pipeline. In this pipeline, instead of first parsing the document to obtain text, the document is directly embedded using a VLM as an image and then retrieved to enhance the generation of a VLM. Compared to traditional text-based RAG, **VisRAG** maximizes the retention and utilization of the data information in the original documents, eliminating the information loss introduced during the parsing process.
 <p align="center"><img width=800 src="https://github.com/openbmb/VisRAG/blob/master/assets/main_figure.png?raw=true"/></p>
 
-## VisRAG
+## VisRAG Pipeline
 
 ### VisRAG-Ret
 **VisRAG-Ret** is a document embedding model built on [MiniCPM-V 2.0](https://huggingface.co/openbmb/MiniCPM-V-2), a vision-language model that integrates [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384) as the vision encoder and [MiniCPM-2B](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16) as the language model.
@@ -118,8 +118,4 @@ print(scores.tolist())
 ## Contact
 
 - Shi Yu: [email protected]
-- Chaoyue Tang: [email protected]
-
-## Citation
-
-If you use any datasets or models from this organization in your research, please cite the original dataset as follows:
+- Chaoyue Tang: [email protected]
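The README text in this diff describes retrieval as embedding each document page as an image and then retrieving pages for a query. The ranking step of that idea can be sketched as below. This is only an illustration under stated assumptions: the vectors are toy stand-ins for embeddings that a model such as VisRAG-Ret would produce, cosine similarity is an assumed scoring function, and `cosine` and `rank_pages` are hypothetical helper names, not part of the VisRAG code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_pages(query_emb, page_embs, top_k=1):
    """Return (page_index, score) pairs for the top_k best-matching pages."""
    scores = [cosine(query_emb, p) for p in page_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]


# Toy 2-d vectors standing in for a query embedding and page-image embeddings.
query = [1.0, 0.0]
pages = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]

top = rank_pages(query, pages, top_k=2)
print(top)  # page 1 ranks first, page 2 second
```

In a full pipeline of the kind the README describes, the top-ranked page images would then be passed to a generation VLM together with the query; that step is omitted here.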