Update README.md
Complete working Colab Notebook is [here](https://colab.research.google.com/drive/1-5WGEYPSBNBg-Z0bGFysyvckFuM8imrg)

### Reranking Using ColBERT

```python
import numpy as np
import torch

from colbert.infra import ColBERTConfig
from colbert.modeling.checkpoint import Checkpoint
from colbert.modeling.colbert import colbert_score

query = ["How to use ColBERT for indexing long documents?"]
documents = [
    "ColBERT is an efficient and effective passage retrieval model.",
    "Jina-ColBERT is a ColBERT-style model but based on JinaBERT so it can support 8k context length.",
    "JinaBERT is a BERT architecture that supports the symmetric bidirectional variant of ALiBi to allow longer sequence length.",
    "Jina-ColBERT model is trained on MSMARCO passage ranking dataset, following a very similar training procedure with ColBERTv2.",
]

# Budget 32 tokens per query and 512 tokens per document
config = ColBERTConfig(query_maxlen=32, doc_maxlen=512)
# Checkpoint id assumed to be this model's Hugging Face repository
ckpt = Checkpoint("jinaai/jina-colbert-v1-en", colbert_config=config)

# Encode the query and the candidate documents into token-level embeddings
Q = ckpt.queryFromText(query)
D = ckpt.docFromText(documents, bsize=32)[0]

# Score every document against the query with late-interaction (MaxSim) scoring
D_mask = torch.ones(D.shape[:2], dtype=torch.long)
scores = colbert_score(Q, D, D_mask).flatten().cpu().numpy().tolist()

# Rank the documents from most to least relevant
ranking = np.argsort(scores)[::-1]
print(ranking)
```

## Evaluation Results

**TL;DR:** Our Jina-ColBERT achieves retrieval performance competitive with [ColBERTv2](https://huggingface.co/colbert-ir/colbertv2.0) on all benchmarks, and outperforms ColBERTv2 on datasets where documents have longer context lengths.

We also evaluate the zero-shot performance on datasets where documents have longer context lengths.

| Jina-ColBERT-v1 | 8192 | 8192 | 83.7 |
| Jina-embeddings-v2-base-en | 8192 | 8192 | **85.4** |

\* denotes that we truncate the context length to 512 for documents. The context length of queries is 512 in all settings.

**To summarize, Jina-ColBERT achieves retrieval performance comparable to ColBERTv2 on all benchmarks, and outperforms ColBERTv2 on datasets where documents have longer context lengths.**

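The long-document advantage comes from the larger document-length budget. As a minimal illustrative sketch (not from the original README; the checkpoint id is an assumption), raising `doc_maxlen` in `ColBERTConfig` lets the model encode documents of up to 8192 tokens instead of truncating them at 512:

```python
# Illustrative sketch: encode a long document without truncating it to 512 tokens.
# The checkpoint id is an assumption; adjust it to the checkpoint you are using.
from colbert.infra import ColBERTConfig
from colbert.modeling.checkpoint import Checkpoint

# Allow documents of up to 8192 tokens instead of the usual 512
config = ColBERTConfig(query_maxlen=32, doc_maxlen=8192)
ckpt = Checkpoint("jinaai/jina-colbert-v1-en", colbert_config=config)

long_document = " ".join(["A very long passage about late-interaction retrieval."] * 400)
D = ckpt.docFromText([long_document], bsize=1)[0]
print(D.shape)  # (number of documents, tokens per document, embedding dimension)
```
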
### Reranking Performance

We evaluate the reranking performance of ColBERTv2 and Jina-ColBERT on BEIR. We use BM25 as the first-stage retrieval model. The full evaluation code can be found in [this repo](https://github.com/liuqi6777/eval_reranker).

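As a rough sketch of this two-stage setup (not the actual BEIR evaluation code): BM25 retrieves lexical candidates, then Jina-ColBERT rescores them with the same late-interaction scoring used in the reranking example above. The `rank_bm25` package, the toy corpus, and the checkpoint id are illustrative assumptions.

```python
# Illustrative two-stage pipeline: BM25 candidate retrieval + Jina-ColBERT reranking.
import numpy as np
import torch
from rank_bm25 import BM25Okapi

from colbert.infra import ColBERTConfig
from colbert.modeling.checkpoint import Checkpoint
from colbert.modeling.colbert import colbert_score

corpus = [
    "ColBERT is an efficient and effective passage retrieval model.",
    "BM25 is a classic lexical ranking function often used for first-stage retrieval.",
    "Jina-ColBERT supports a document context length of up to 8192 tokens.",
]
query = "late interaction retrieval for long documents"

# Stage 1: BM25 retrieves the top-k candidates by lexical overlap
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
bm25_scores = bm25.get_scores(query.lower().split())
candidate_ids = np.argsort(bm25_scores)[::-1][:2]
candidates = [corpus[i] for i in candidate_ids]

# Stage 2: Jina-ColBERT rescores the candidates with late-interaction (MaxSim) scoring
config = ColBERTConfig(query_maxlen=32, doc_maxlen=512)
ckpt = Checkpoint("jinaai/jina-colbert-v1-en", colbert_config=config)  # assumed checkpoint id
Q = ckpt.queryFromText([query])
D = ckpt.docFromText(candidates, bsize=32)[0]
D_mask = torch.ones(D.shape[:2], dtype=torch.long)
rerank_scores = colbert_score(Q, D, D_mask).flatten().cpu().numpy()

# Final ranking: candidate indices ordered by the reranker score
print([int(candidate_ids[i]) for i in np.argsort(rerank_scores)[::-1]])
```
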
In summary, Jina-ColBERT outperforms ColBERTv2, even achieving performance comparable to some cross-encoders.

The best model, jina-reranker, will be open-sourced soon!

| Dataset | BM25 | ColBERTv2 | Jina-ColBERT | MiniLM-L-6-v2 | BGE-reranker-base-v1 | BGE-reranker-large-v1 | Jina-reranker-base-v1 |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Arguana | 29.99 | 33.42 | 33.95 | 30.67 | 23.26 | 25.42 | 42.59 |
| Climate-Fever | 16.51 | 20.66 | 21.87 | 24.70 | 31.60 | 31.98 | 25.49 |
| DBPedia | 31.80 | 42.16 | 41.43 | 43.90 | 41.56 | 43.79 | 43.68 |
| FEVER | 65.13 | 81.07 | 83.49 | 80.77 | 87.07 | 89.11 | 86.10 |
| FiQA | 23.61 | 35.60 | 36.68 | 34.87 | 33.17 | 37.70 | 41.38 |
| HotpotQA | 63.30 | 68.84 | 68.62 | 72.65 | 79.04 | 79.98 | 75.61 |
| NFCorpus | 33.75 | 36.69 | 36.38 | 36.48 | 32.71 | 36.57 | 37.73 |
| NQ | 30.55 | 51.27 | 51.01 | 52.01 | 53.55 | 56.81 | 56.82 |
| Quora | 78.86 | 85.18 | 82.75 | 82.45 | 78.44 | 81.06 | 87.31 |
| SCIDOCS | 14.90 | 15.39 | 16.67 | 16.28 | 15.06 | 16.84 | 19.56 |
| SciFact | 67.89 | 70.23 | 70.95 | 69.53 | 70.62 | 74.14 | 75.01 |
| TREC-COVID | 59.47 | 75.00 | 76.89 | 74.45 | 67.46 | 74.32 | 82.09 |
| Webis-touche2020 | 44.22 | 32.12 | 32.56 | 28.40 | 34.37 | 35.66 | 31.62 |
| Average | 43.08 | 49.82 | 50.25 | 49.78 | 49.84 | 52.57 | **54.23** |

## Plans

We are planning to improve the performance of Jina-ColBERT by fine-tuning on more datasets in the future.

## Other Models