Update README.md
README.md
CHANGED
@@ -41,7 +41,8 @@ In the paper, We use MiniCPM-V 2.0, MiniCPM-V 2.6 and GPT-4o as the generators.
 ## Training

 ### VisRAG-Ret
-Our training dataset of 362,110 Query-Document (Q-D) Pairs for **VisRAG-Ret** is comprised of train sets of openly available academic datasets (34%) and a synthetic dataset made up of pages from web-crawled PDF documents and augmented with VLM-generated (GPT-4o) pseudo-queries (66%).
+Our training dataset of 362,110 Query-Document (Q-D) Pairs for **VisRAG-Ret** is comprised of train sets of openly available academic datasets (34%) and a synthetic dataset made up of pages from web-crawled PDF documents and augmented with VLM-generated (GPT-4o) pseudo-queries (66%). It can be found in the `VisRAG` Collection on Hugging Face, which is referenced at the beginning of this page.
+

 ### VisRAG-Gen
 The generation part does not use any fine-tuning; we directly use off-the-shelf LLMs/VLMs for generation.