Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ tags:
 ---
 # ColQwen2: Visual Retriever based on Qwen2-VL-2B-Instruct with ColBERT strategy
 
-### This is the base version trained with batch_size 256 instead of 32 for
+### This is the base version trained with batch_size 256 instead of 32 for 5 epoch and with the updated pad token
 
 ColQwen is a model based on a novel model architecture and training strategy based on Vision Language Models (VLMs) to efficiently index documents from their visual features.
 It is a [Qwen2-VL-2B](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) extension that generates [ColBERT](https://arxiv.org/abs/2004.12832)- style multi-vector representations of text and images.
@@ -63,11 +63,11 @@ from PIL import Image
 from colpali_engine.models import ColQwen2, ColQwen2Processor
 
 model = ColQwen2.from_pretrained(
-    "manu/colqwen2-
+    "manu/colqwen2-v1.0-alpha",
     torch_dtype=torch.bfloat16,
     device_map="cuda:0",  # or "mps" if on Apple Silicon
 ).eval()
-processor = ColQwen2Processor.from_pretrained("manu/colqwen2-
+processor = ColQwen2Processor.from_pretrained("manu/colqwen2-v1.0-alpha")
 
 # Your inputs
 images = [
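The README text touched by this diff says the model produces ColBERT-style multi-vector representations of text and images. As a rough illustration of how such representations are usually compared, the sketch below implements the late-interaction (MaxSim) score described in the ColBERT paper, in plain Python. The function names and the toy 2-d vectors are hypothetical and chosen only for illustration; this is not the `colpali_engine` API, and the real model works on tensors with a much larger embedding dimension.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score (hypothetical helper, not a library function).

    For each query vector, take the largest dot product against any
    document vector, then sum those maxima over all query vectors.
    """
    if not query_vecs or not doc_vecs:
        raise ValueError("both inputs need at least one vector")
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy data: made-up embeddings, one vector per query token or image patch.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5]]

# Score the query against each page and pick the best-matching page index.
scores = [maxsim_score(query, page) for page in (page_a, page_b)]
best_page = max(range(len(scores)), key=scores.__getitem__)
```

Here `page_a` contains an exact match for each query vector, so it scores higher than `page_b`, which only partially matches both. Keeping one vector per token or patch, instead of pooling into a single vector, is what allows this fine-grained matching.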