Tags: Sentence Similarity · Safetensors · Japanese · RAGatouille · bert · ColBERT
bclavie committed
Commit 0e56cca
1 Parent(s): dd07f12

Update README.md

Files changed (1):
  1. README.md +5 -5
README.md CHANGED
@@ -1,4 +1,4 @@
- ---
+ ---
  inference: false
  datasets:
  - bclavie/mmarco-japanese-hard-negatives
@@ -8,7 +8,7 @@ language:
  pipeline_tag: sentence-similarity
  tags:
  - ColBERT
- ---
+ ---
  Under Construction, please come back in a few days!
  工事中です。数日後にまたお越しください。(Under construction. Please come back in a few days!)
 
@@ -23,9 +23,9 @@ Under Construction, please come back in a few days!
  (refer to the technical report for exact evaluation method + code)
 
  | | JSQuAD | | | MIRACL | | | MrTyDi | | | Average | | |
- | ------------------------------------------------------------------------ | ----------------------- | -------------------- | ------ | ----------------------- | -------------------- | ------ | ----------------------- | -------------------- | ------ | ----------------------- | -------------------- | ------ |
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
  | | R@1 | R@5 | R@10 | R@3 | R@5 | R@10 | R@3 | R@5 | R@10 | R@\{1\|3\} | R@5 | R@10 |
- | ------------------------------------------------------------------------ | ----------------------- | -------------------- | ------ | ----------------------- | -------------------- | ------ | ----------------------- | -------------------- | ------ | ----------------------- | -------------------- | ------ |
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
  | JaColBERT | **0.906** | **0.968** | 0.978 | 0.464 | 0.546 | 0.645 | 0.744 | 0.781 | 0.821 | **0.705** | 0.765 | 0.813 |
  | m-e5-large (in-domain) | | | | | | | | | | | | |
  | m-e5-base (in-domain) | *0.838* | *0.955* | 0.973 | **0.482** | **0.553** | 0.632 | **0.777** | **0.815** | 0.857 | 0.699 | **0.775** | 0.820 |
@@ -72,7 +72,7 @@ ColBERT looks slightly more unfriendly than a usual `transformers` model, but a
 
  In order for the late-interaction retrieval approach used by ColBERT to work, you must first build your index.
  Think of it like using an embedding model, like e5, to embed all your documents and storing them in a vector database.
- Indexing is the slowest step -- retrieval is extremely quick. There are some tricks to speed it up, but the default settings work fairly well:
+ Indexing is the slowest step; retrieval is extremely quick. There are some tricks to speed it up, but the default settings work fairly well:
 
  ```python
  from colbert import Indexer
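The snippet above is cut off mid-example, but the idea it sets up, ColBERT's late-interaction (MaxSim) scoring, can be sketched in a few lines of plain Python. This is an illustrative toy only: the 2-dimensional "token embeddings" below are made up for the example, whereas real ColBERT compares learned per-token vectors produced by the model.

```python
def maxsim_score(query_embs, doc_embs):
    """ColBERT-style MaxSim: for each query token embedding, take its best
    dot product against any document token embedding, then sum over the
    query tokens."""
    score = 0.0
    for q in query_embs:
        score += max(sum(qi * di for qi, di in zip(q, d)) for d in doc_embs)
    return score

query = [[1.0, 0.0], [0.0, 1.0]]  # two toy query-token embeddings
doc_a = [[0.9, 0.1], [0.1, 0.9]]  # covers both query tokens well
doc_b = [[0.5, 0.5], [0.5, 0.5]]  # diffuse, matches neither token strongly

print(maxsim_score(query, doc_a))  # → 1.8
print(maxsim_score(query, doc_b))  # → 1.0
```

Because each query token only needs its single best match per document, the document-side token embeddings can be computed and stored ahead of time, which is why indexing is the slow step and retrieval is fast.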