jncraton committed on
Commit afa1e20 · 1 Parent(s): f25eaa4

Upload 9 files

README.md ADDED
@@ -0,0 +1,2868 @@
+ ---
+ tags:
+ - mteb
+ - sentence transformers
+ model-index:
+ - name: bge-small-en
+   results:
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_counterfactual
+       name: MTEB AmazonCounterfactualClassification (en)
+       config: en
+       split: test
+       revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+     metrics:
+     - type: accuracy
+       value: 74.34328358208955
+     - type: ap
+       value: 37.59947775195661
+     - type: f1
+       value: 68.548415491933
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_polarity
+       name: MTEB AmazonPolarityClassification
+       config: default
+       split: test
+       revision: e2d317d38cd51312af73b3d32a06d1a08b442046
+     metrics:
+     - type: accuracy
+       value: 93.04527499999999
+     - type: ap
+       value: 89.60696356772135
+     - type: f1
+       value: 93.03361469382438
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (en)
+       config: en
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 46.08
+     - type: f1
+       value: 45.66249835363254
+   - task:
+       type: Retrieval
+     dataset:
+       type: arguana
+       name: MTEB ArguAna
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 35.205999999999996
+     - type: map_at_10
+       value: 50.782000000000004
+     - type: map_at_100
+       value: 51.547
+     - type: map_at_1000
+       value: 51.554
+     - type: map_at_3
+       value: 46.515
+     - type: map_at_5
+       value: 49.296
+     - type: mrr_at_1
+       value: 35.632999999999996
+     - type: mrr_at_10
+       value: 50.958999999999996
+     - type: mrr_at_100
+       value: 51.724000000000004
+     - type: mrr_at_1000
+       value: 51.731
+     - type: mrr_at_3
+       value: 46.669
+     - type: mrr_at_5
+       value: 49.439
+     - type: ndcg_at_1
+       value: 35.205999999999996
+     - type: ndcg_at_10
+       value: 58.835
+     - type: ndcg_at_100
+       value: 62.095
+     - type: ndcg_at_1000
+       value: 62.255
+     - type: ndcg_at_3
+       value: 50.255
+     - type: ndcg_at_5
+       value: 55.296
+     - type: precision_at_1
+       value: 35.205999999999996
+     - type: precision_at_10
+       value: 8.421
+     - type: precision_at_100
+       value: 0.984
+     - type: precision_at_1000
+       value: 0.1
+     - type: precision_at_3
+       value: 20.365
+     - type: precision_at_5
+       value: 14.680000000000001
+     - type: recall_at_1
+       value: 35.205999999999996
+     - type: recall_at_10
+       value: 84.211
+     - type: recall_at_100
+       value: 98.43499999999999
+     - type: recall_at_1000
+       value: 99.644
+     - type: recall_at_3
+       value: 61.095
+     - type: recall_at_5
+       value: 73.4
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/arxiv-clustering-p2p
+       name: MTEB ArxivClusteringP2P
+       config: default
+       split: test
+       revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
+     metrics:
+     - type: v_measure
+       value: 47.52644476278646
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/arxiv-clustering-s2s
+       name: MTEB ArxivClusteringS2S
+       config: default
+       split: test
+       revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
+     metrics:
+     - type: v_measure
+       value: 39.973045724188964
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/askubuntudupquestions-reranking
+       name: MTEB AskUbuntuDupQuestions
+       config: default
+       split: test
+       revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
+     metrics:
+     - type: map
+       value: 62.28285314871488
+     - type: mrr
+       value: 74.52743701358659
+   - task:
+       type: STS
+     dataset:
+       type: mteb/biosses-sts
+       name: MTEB BIOSSES
+       config: default
+       split: test
+       revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
+     metrics:
+     - type: cos_sim_pearson
+       value: 80.09041909160327
+     - type: cos_sim_spearman
+       value: 79.96266537706944
+     - type: euclidean_pearson
+       value: 79.50774978162241
+     - type: euclidean_spearman
+       value: 79.9144715078551
+     - type: manhattan_pearson
+       value: 79.2062139879302
+     - type: manhattan_spearman
+       value: 79.35000081468212
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/banking77
+       name: MTEB Banking77Classification
+       config: default
+       split: test
+       revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
+     metrics:
+     - type: accuracy
+       value: 85.31493506493506
+     - type: f1
+       value: 85.2704557977762
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/biorxiv-clustering-p2p
+       name: MTEB BiorxivClusteringP2P
+       config: default
+       split: test
+       revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
+     metrics:
+     - type: v_measure
+       value: 39.6837242810816
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/biorxiv-clustering-s2s
+       name: MTEB BiorxivClusteringS2S
+       config: default
+       split: test
+       revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
+     metrics:
+     - type: v_measure
+       value: 35.38881249555897
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackAndroidRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 27.884999999999998
+     - type: map_at_10
+       value: 39.574
+     - type: map_at_100
+       value: 40.993
+     - type: map_at_1000
+       value: 41.129
+     - type: map_at_3
+       value: 36.089
+     - type: map_at_5
+       value: 38.191
+     - type: mrr_at_1
+       value: 34.477999999999994
+     - type: mrr_at_10
+       value: 45.411
+     - type: mrr_at_100
+       value: 46.089999999999996
+     - type: mrr_at_1000
+       value: 46.147
+     - type: mrr_at_3
+       value: 42.346000000000004
+     - type: mrr_at_5
+       value: 44.292
+     - type: ndcg_at_1
+       value: 34.477999999999994
+     - type: ndcg_at_10
+       value: 46.123999999999995
+     - type: ndcg_at_100
+       value: 51.349999999999994
+     - type: ndcg_at_1000
+       value: 53.578
+     - type: ndcg_at_3
+       value: 40.824
+     - type: ndcg_at_5
+       value: 43.571
+     - type: precision_at_1
+       value: 34.477999999999994
+     - type: precision_at_10
+       value: 8.841000000000001
+     - type: precision_at_100
+       value: 1.4460000000000002
+     - type: precision_at_1000
+       value: 0.192
+     - type: precision_at_3
+       value: 19.742
+     - type: precision_at_5
+       value: 14.421000000000001
+     - type: recall_at_1
+       value: 27.884999999999998
+     - type: recall_at_10
+       value: 59.087
+     - type: recall_at_100
+       value: 80.609
+     - type: recall_at_1000
+       value: 95.054
+     - type: recall_at_3
+       value: 44.082
+     - type: recall_at_5
+       value: 51.593999999999994
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackEnglishRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 30.639
+     - type: map_at_10
+       value: 40.047
+     - type: map_at_100
+       value: 41.302
+     - type: map_at_1000
+       value: 41.425
+     - type: map_at_3
+       value: 37.406
+     - type: map_at_5
+       value: 38.934000000000005
+     - type: mrr_at_1
+       value: 37.707
+     - type: mrr_at_10
+       value: 46.082
+     - type: mrr_at_100
+       value: 46.745
+     - type: mrr_at_1000
+       value: 46.786
+     - type: mrr_at_3
+       value: 43.980999999999995
+     - type: mrr_at_5
+       value: 45.287
+     - type: ndcg_at_1
+       value: 37.707
+     - type: ndcg_at_10
+       value: 45.525
+     - type: ndcg_at_100
+       value: 49.976
+     - type: ndcg_at_1000
+       value: 51.94499999999999
+     - type: ndcg_at_3
+       value: 41.704
+     - type: ndcg_at_5
+       value: 43.596000000000004
+     - type: precision_at_1
+       value: 37.707
+     - type: precision_at_10
+       value: 8.465
+     - type: precision_at_100
+       value: 1.375
+     - type: precision_at_1000
+       value: 0.183
+     - type: precision_at_3
+       value: 19.979
+     - type: precision_at_5
+       value: 14.115
+     - type: recall_at_1
+       value: 30.639
+     - type: recall_at_10
+       value: 54.775
+     - type: recall_at_100
+       value: 73.678
+     - type: recall_at_1000
+       value: 86.142
+     - type: recall_at_3
+       value: 43.230000000000004
+     - type: recall_at_5
+       value: 48.622
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackGamingRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 38.038
+     - type: map_at_10
+       value: 49.922
+     - type: map_at_100
+       value: 51.032
+     - type: map_at_1000
+       value: 51.085
+     - type: map_at_3
+       value: 46.664
+     - type: map_at_5
+       value: 48.588
+     - type: mrr_at_1
+       value: 43.95
+     - type: mrr_at_10
+       value: 53.566
+     - type: mrr_at_100
+       value: 54.318999999999996
+     - type: mrr_at_1000
+       value: 54.348
+     - type: mrr_at_3
+       value: 51.066
+     - type: mrr_at_5
+       value: 52.649
+     - type: ndcg_at_1
+       value: 43.95
+     - type: ndcg_at_10
+       value: 55.676
+     - type: ndcg_at_100
+       value: 60.126000000000005
+     - type: ndcg_at_1000
+       value: 61.208
+     - type: ndcg_at_3
+       value: 50.20400000000001
+     - type: ndcg_at_5
+       value: 53.038
+     - type: precision_at_1
+       value: 43.95
+     - type: precision_at_10
+       value: 8.953
+     - type: precision_at_100
+       value: 1.2109999999999999
+     - type: precision_at_1000
+       value: 0.135
+     - type: precision_at_3
+       value: 22.256999999999998
+     - type: precision_at_5
+       value: 15.524
+     - type: recall_at_1
+       value: 38.038
+     - type: recall_at_10
+       value: 69.15
+     - type: recall_at_100
+       value: 88.31599999999999
+     - type: recall_at_1000
+       value: 95.993
+     - type: recall_at_3
+       value: 54.663
+     - type: recall_at_5
+       value: 61.373
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackGisRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 24.872
+     - type: map_at_10
+       value: 32.912
+     - type: map_at_100
+       value: 33.972
+     - type: map_at_1000
+       value: 34.046
+     - type: map_at_3
+       value: 30.361
+     - type: map_at_5
+       value: 31.704
+     - type: mrr_at_1
+       value: 26.779999999999998
+     - type: mrr_at_10
+       value: 34.812
+     - type: mrr_at_100
+       value: 35.754999999999995
+     - type: mrr_at_1000
+       value: 35.809000000000005
+     - type: mrr_at_3
+       value: 32.335
+     - type: mrr_at_5
+       value: 33.64
+     - type: ndcg_at_1
+       value: 26.779999999999998
+     - type: ndcg_at_10
+       value: 37.623
+     - type: ndcg_at_100
+       value: 42.924
+     - type: ndcg_at_1000
+       value: 44.856
+     - type: ndcg_at_3
+       value: 32.574
+     - type: ndcg_at_5
+       value: 34.842
+     - type: precision_at_1
+       value: 26.779999999999998
+     - type: precision_at_10
+       value: 5.729
+     - type: precision_at_100
+       value: 0.886
+     - type: precision_at_1000
+       value: 0.109
+     - type: precision_at_3
+       value: 13.559
+     - type: precision_at_5
+       value: 9.469
+     - type: recall_at_1
+       value: 24.872
+     - type: recall_at_10
+       value: 50.400999999999996
+     - type: recall_at_100
+       value: 74.954
+     - type: recall_at_1000
+       value: 89.56
+     - type: recall_at_3
+       value: 36.726
+     - type: recall_at_5
+       value: 42.138999999999996
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackMathematicaRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 16.803
+     - type: map_at_10
+       value: 24.348
+     - type: map_at_100
+       value: 25.56
+     - type: map_at_1000
+       value: 25.668000000000003
+     - type: map_at_3
+       value: 21.811
+     - type: map_at_5
+       value: 23.287
+     - type: mrr_at_1
+       value: 20.771
+     - type: mrr_at_10
+       value: 28.961
+     - type: mrr_at_100
+       value: 29.979
+     - type: mrr_at_1000
+       value: 30.046
+     - type: mrr_at_3
+       value: 26.555
+     - type: mrr_at_5
+       value: 28.060000000000002
+     - type: ndcg_at_1
+       value: 20.771
+     - type: ndcg_at_10
+       value: 29.335
+     - type: ndcg_at_100
+       value: 35.188
+     - type: ndcg_at_1000
+       value: 37.812
+     - type: ndcg_at_3
+       value: 24.83
+     - type: ndcg_at_5
+       value: 27.119
+     - type: precision_at_1
+       value: 20.771
+     - type: precision_at_10
+       value: 5.4350000000000005
+     - type: precision_at_100
+       value: 0.9480000000000001
+     - type: precision_at_1000
+       value: 0.13
+     - type: precision_at_3
+       value: 11.982
+     - type: precision_at_5
+       value: 8.831
+     - type: recall_at_1
+       value: 16.803
+     - type: recall_at_10
+       value: 40.039
+     - type: recall_at_100
+       value: 65.83200000000001
+     - type: recall_at_1000
+       value: 84.478
+     - type: recall_at_3
+       value: 27.682000000000002
+     - type: recall_at_5
+       value: 33.535
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackPhysicsRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 28.345
+     - type: map_at_10
+       value: 37.757000000000005
+     - type: map_at_100
+       value: 39.141
+     - type: map_at_1000
+       value: 39.262
+     - type: map_at_3
+       value: 35.183
+     - type: map_at_5
+       value: 36.592
+     - type: mrr_at_1
+       value: 34.649
+     - type: mrr_at_10
+       value: 43.586999999999996
+     - type: mrr_at_100
+       value: 44.481
+     - type: mrr_at_1000
+       value: 44.542
+     - type: mrr_at_3
+       value: 41.29
+     - type: mrr_at_5
+       value: 42.642
+     - type: ndcg_at_1
+       value: 34.649
+     - type: ndcg_at_10
+       value: 43.161
+     - type: ndcg_at_100
+       value: 48.734
+     - type: ndcg_at_1000
+       value: 51.046
+     - type: ndcg_at_3
+       value: 39.118
+     - type: ndcg_at_5
+       value: 41.022
+     - type: precision_at_1
+       value: 34.649
+     - type: precision_at_10
+       value: 7.603
+     - type: precision_at_100
+       value: 1.209
+     - type: precision_at_1000
+       value: 0.157
+     - type: precision_at_3
+       value: 18.319
+     - type: precision_at_5
+       value: 12.839
+     - type: recall_at_1
+       value: 28.345
+     - type: recall_at_10
+       value: 53.367
+     - type: recall_at_100
+       value: 76.453
+     - type: recall_at_1000
+       value: 91.82000000000001
+     - type: recall_at_3
+       value: 41.636
+     - type: recall_at_5
+       value: 46.760000000000005
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackProgrammersRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 22.419
+     - type: map_at_10
+       value: 31.716
+     - type: map_at_100
+       value: 33.152
+     - type: map_at_1000
+       value: 33.267
+     - type: map_at_3
+       value: 28.74
+     - type: map_at_5
+       value: 30.48
+     - type: mrr_at_1
+       value: 28.310999999999996
+     - type: mrr_at_10
+       value: 37.039
+     - type: mrr_at_100
+       value: 38.09
+     - type: mrr_at_1000
+       value: 38.145
+     - type: mrr_at_3
+       value: 34.437
+     - type: mrr_at_5
+       value: 36.024
+     - type: ndcg_at_1
+       value: 28.310999999999996
+     - type: ndcg_at_10
+       value: 37.41
+     - type: ndcg_at_100
+       value: 43.647999999999996
+     - type: ndcg_at_1000
+       value: 46.007
+     - type: ndcg_at_3
+       value: 32.509
+     - type: ndcg_at_5
+       value: 34.943999999999996
+     - type: precision_at_1
+       value: 28.310999999999996
+     - type: precision_at_10
+       value: 6.963
+     - type: precision_at_100
+       value: 1.1860000000000002
+     - type: precision_at_1000
+       value: 0.154
+     - type: precision_at_3
+       value: 15.867999999999999
+     - type: precision_at_5
+       value: 11.507000000000001
+     - type: recall_at_1
+       value: 22.419
+     - type: recall_at_10
+       value: 49.28
+     - type: recall_at_100
+       value: 75.802
+     - type: recall_at_1000
+       value: 92.032
+     - type: recall_at_3
+       value: 35.399
+     - type: recall_at_5
+       value: 42.027
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 24.669249999999998
+     - type: map_at_10
+       value: 33.332583333333325
+     - type: map_at_100
+       value: 34.557833333333335
+     - type: map_at_1000
+       value: 34.67141666666666
+     - type: map_at_3
+       value: 30.663166666666662
+     - type: map_at_5
+       value: 32.14883333333333
+     - type: mrr_at_1
+       value: 29.193833333333334
+     - type: mrr_at_10
+       value: 37.47625
+     - type: mrr_at_100
+       value: 38.3545
+     - type: mrr_at_1000
+       value: 38.413166666666676
+     - type: mrr_at_3
+       value: 35.06741666666667
+     - type: mrr_at_5
+       value: 36.450666666666656
+     - type: ndcg_at_1
+       value: 29.193833333333334
+     - type: ndcg_at_10
+       value: 38.505416666666676
+     - type: ndcg_at_100
+       value: 43.81125
+     - type: ndcg_at_1000
+       value: 46.09558333333333
+     - type: ndcg_at_3
+       value: 33.90916666666667
+     - type: ndcg_at_5
+       value: 36.07666666666666
+     - type: precision_at_1
+       value: 29.193833333333334
+     - type: precision_at_10
+       value: 6.7251666666666665
+     - type: precision_at_100
+       value: 1.1058333333333332
+     - type: precision_at_1000
+       value: 0.14833333333333332
+     - type: precision_at_3
+       value: 15.554166666666665
+     - type: precision_at_5
+       value: 11.079250000000002
+     - type: recall_at_1
+       value: 24.669249999999998
+     - type: recall_at_10
+       value: 49.75583333333332
+     - type: recall_at_100
+       value: 73.06908333333332
+     - type: recall_at_1000
+       value: 88.91316666666667
+     - type: recall_at_3
+       value: 36.913250000000005
+     - type: recall_at_5
+       value: 42.48641666666666
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackStatsRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 24.044999999999998
+     - type: map_at_10
+       value: 30.349999999999998
+     - type: map_at_100
+       value: 31.273
+     - type: map_at_1000
+       value: 31.362000000000002
+     - type: map_at_3
+       value: 28.508
+     - type: map_at_5
+       value: 29.369
+     - type: mrr_at_1
+       value: 26.994
+     - type: mrr_at_10
+       value: 33.12
+     - type: mrr_at_100
+       value: 33.904
+     - type: mrr_at_1000
+       value: 33.967000000000006
+     - type: mrr_at_3
+       value: 31.365
+     - type: mrr_at_5
+       value: 32.124
+     - type: ndcg_at_1
+       value: 26.994
+     - type: ndcg_at_10
+       value: 34.214
+     - type: ndcg_at_100
+       value: 38.681
+     - type: ndcg_at_1000
+       value: 40.926
+     - type: ndcg_at_3
+       value: 30.725
+     - type: ndcg_at_5
+       value: 31.967000000000002
+     - type: precision_at_1
+       value: 26.994
+     - type: precision_at_10
+       value: 5.215
+     - type: precision_at_100
+       value: 0.807
+     - type: precision_at_1000
+       value: 0.108
+     - type: precision_at_3
+       value: 12.986
+     - type: precision_at_5
+       value: 8.712
+     - type: recall_at_1
+       value: 24.044999999999998
+     - type: recall_at_10
+       value: 43.456
+     - type: recall_at_100
+       value: 63.675000000000004
+     - type: recall_at_1000
+       value: 80.05499999999999
+     - type: recall_at_3
+       value: 33.561
+     - type: recall_at_5
+       value: 36.767
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackTexRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 15.672
+     - type: map_at_10
+       value: 22.641
+     - type: map_at_100
+       value: 23.75
+     - type: map_at_1000
+       value: 23.877000000000002
+     - type: map_at_3
+       value: 20.219
+     - type: map_at_5
+       value: 21.648
+     - type: mrr_at_1
+       value: 18.823
+     - type: mrr_at_10
+       value: 26.101999999999997
+     - type: mrr_at_100
+       value: 27.038
+     - type: mrr_at_1000
+       value: 27.118
+     - type: mrr_at_3
+       value: 23.669
+     - type: mrr_at_5
+       value: 25.173000000000002
+     - type: ndcg_at_1
+       value: 18.823
+     - type: ndcg_at_10
+       value: 27.176000000000002
+     - type: ndcg_at_100
+       value: 32.42
+     - type: ndcg_at_1000
+       value: 35.413
+     - type: ndcg_at_3
+       value: 22.756999999999998
+     - type: ndcg_at_5
+       value: 25.032
+     - type: precision_at_1
+       value: 18.823
+     - type: precision_at_10
+       value: 5.034000000000001
+     - type: precision_at_100
+       value: 0.895
+     - type: precision_at_1000
+       value: 0.132
+     - type: precision_at_3
+       value: 10.771
+     - type: precision_at_5
+       value: 8.1
+     - type: recall_at_1
+       value: 15.672
+     - type: recall_at_10
+       value: 37.296
+     - type: recall_at_100
+       value: 60.863
+     - type: recall_at_1000
+       value: 82.234
+     - type: recall_at_3
+       value: 25.330000000000002
+     - type: recall_at_5
+       value: 30.964000000000002
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackUnixRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 24.633
+     - type: map_at_10
+       value: 32.858
+     - type: map_at_100
+       value: 34.038000000000004
+     - type: map_at_1000
+       value: 34.141
+     - type: map_at_3
+       value: 30.209000000000003
+     - type: map_at_5
+       value: 31.567
+     - type: mrr_at_1
+       value: 28.358
+     - type: mrr_at_10
+       value: 36.433
+     - type: mrr_at_100
+       value: 37.352000000000004
+     - type: mrr_at_1000
+       value: 37.41
+     - type: mrr_at_3
+       value: 34.033
+     - type: mrr_at_5
+       value: 35.246
+     - type: ndcg_at_1
+       value: 28.358
+     - type: ndcg_at_10
+       value: 37.973
+     - type: ndcg_at_100
+       value: 43.411
+     - type: ndcg_at_1000
+       value: 45.747
+     - type: ndcg_at_3
+       value: 32.934999999999995
+     - type: ndcg_at_5
+       value: 35.013
+     - type: precision_at_1
+       value: 28.358
+     - type: precision_at_10
+       value: 6.418
+     - type: precision_at_100
+       value: 1.02
+     - type: precision_at_1000
+       value: 0.133
+     - type: precision_at_3
+       value: 14.677000000000001
+     - type: precision_at_5
+       value: 10.335999999999999
+     - type: recall_at_1
+       value: 24.633
+     - type: recall_at_10
+       value: 50.048
+     - type: recall_at_100
+       value: 73.821
+     - type: recall_at_1000
+       value: 90.046
+     - type: recall_at_3
+       value: 36.284
+     - type: recall_at_5
+       value: 41.370000000000005
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackWebmastersRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 23.133
+     - type: map_at_10
+       value: 31.491999999999997
+     - type: map_at_100
+       value: 33.062000000000005
+     - type: map_at_1000
+       value: 33.256
+     - type: map_at_3
+       value: 28.886
+     - type: map_at_5
+       value: 30.262
+     - type: mrr_at_1
+       value: 28.063
+     - type: mrr_at_10
+       value: 36.144
+     - type: mrr_at_100
+       value: 37.14
+     - type: mrr_at_1000
+       value: 37.191
+     - type: mrr_at_3
+       value: 33.762
+     - type: mrr_at_5
+       value: 34.997
+     - type: ndcg_at_1
+       value: 28.063
+     - type: ndcg_at_10
+       value: 36.951
+     - type: ndcg_at_100
+       value: 43.287
+     - type: ndcg_at_1000
+       value: 45.777
+     - type: ndcg_at_3
+       value: 32.786
+     - type: ndcg_at_5
+       value: 34.65
+     - type: precision_at_1
+       value: 28.063
+     - type: precision_at_10
+       value: 7.055
+     - type: precision_at_100
+       value: 1.476
+     - type: precision_at_1000
+       value: 0.22899999999999998
+     - type: precision_at_3
+       value: 15.481
+     - type: precision_at_5
+       value: 11.186
+     - type: recall_at_1
+       value: 23.133
+     - type: recall_at_10
+       value: 47.285
+     - type: recall_at_100
+       value: 76.176
+     - type: recall_at_1000
+       value: 92.176
+     - type: recall_at_3
+       value: 35.223
+     - type: recall_at_5
+       value: 40.142
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackWordpressRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 19.547
+     - type: map_at_10
+       value: 26.374
+     - type: map_at_100
+       value: 27.419
+     - type: map_at_1000
+       value: 27.539
+     - type: map_at_3
+       value: 23.882
+     - type: map_at_5
+       value: 25.163999999999998
+     - type: mrr_at_1
+       value: 21.442
+     - type: mrr_at_10
+       value: 28.458
+     - type: mrr_at_100
+       value: 29.360999999999997
+     - type: mrr_at_1000
+       value: 29.448999999999998
+     - type: mrr_at_3
+       value: 25.97
+     - type: mrr_at_5
+       value: 27.273999999999997
+     - type: ndcg_at_1
+       value: 21.442
+     - type: ndcg_at_10
+       value: 30.897000000000002
+     - type: ndcg_at_100
+       value: 35.99
+     - type: ndcg_at_1000
+       value: 38.832
+     - type: ndcg_at_3
+       value: 25.944
+     - type: ndcg_at_5
+       value: 28.126
+     - type: precision_at_1
+       value: 21.442
+     - type: precision_at_10
+       value: 4.9910000000000005
+     - type: precision_at_100
+       value: 0.8109999999999999
+     - type: precision_at_1000
+       value: 0.11800000000000001
+     - type: precision_at_3
+       value: 11.029
+     - type: precision_at_5
+       value: 7.911
+     - type: recall_at_1
+       value: 19.547
+     - type: recall_at_10
+       value: 42.886
+     - type: recall_at_100
+       value: 66.64999999999999
+     - type: recall_at_1000
+       value: 87.368
+     - type: recall_at_3
+       value: 29.143
+     - type: recall_at_5
+       value: 34.544000000000004
1108
+ - task:
+ type: Retrieval
+ dataset:
+ type: climate-fever
+ name: MTEB ClimateFEVER
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 15.572
+ - type: map_at_10
+ value: 25.312
+ - type: map_at_100
+ value: 27.062
+ - type: map_at_1000
+ value: 27.253
+ - type: map_at_3
+ value: 21.601
+ - type: map_at_5
+ value: 23.473
+ - type: mrr_at_1
+ value: 34.984
+ - type: mrr_at_10
+ value: 46.406
+ - type: mrr_at_100
+ value: 47.179
+ - type: mrr_at_1000
+ value: 47.21
+ - type: mrr_at_3
+ value: 43.485
+ - type: mrr_at_5
+ value: 45.322
+ - type: ndcg_at_1
+ value: 34.984
+ - type: ndcg_at_10
+ value: 34.344
+ - type: ndcg_at_100
+ value: 41.015
+ - type: ndcg_at_1000
+ value: 44.366
+ - type: ndcg_at_3
+ value: 29.119
+ - type: ndcg_at_5
+ value: 30.825999999999997
+ - type: precision_at_1
+ value: 34.984
+ - type: precision_at_10
+ value: 10.358
+ - type: precision_at_100
+ value: 1.762
+ - type: precision_at_1000
+ value: 0.23900000000000002
+ - type: precision_at_3
+ value: 21.368000000000002
+ - type: precision_at_5
+ value: 15.948
+ - type: recall_at_1
+ value: 15.572
+ - type: recall_at_10
+ value: 39.367999999999995
+ - type: recall_at_100
+ value: 62.183
+ - type: recall_at_1000
+ value: 80.92200000000001
+ - type: recall_at_3
+ value: 26.131999999999998
+ - type: recall_at_5
+ value: 31.635999999999996
+ - task:
+ type: Retrieval
+ dataset:
+ type: dbpedia-entity
+ name: MTEB DBPedia
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 8.848
+ - type: map_at_10
+ value: 19.25
+ - type: map_at_100
+ value: 27.193
+ - type: map_at_1000
+ value: 28.721999999999998
+ - type: map_at_3
+ value: 13.968
+ - type: map_at_5
+ value: 16.283
+ - type: mrr_at_1
+ value: 68.75
+ - type: mrr_at_10
+ value: 76.25
+ - type: mrr_at_100
+ value: 76.534
+ - type: mrr_at_1000
+ value: 76.53999999999999
+ - type: mrr_at_3
+ value: 74.667
+ - type: mrr_at_5
+ value: 75.86699999999999
+ - type: ndcg_at_1
+ value: 56.00000000000001
+ - type: ndcg_at_10
+ value: 41.426
+ - type: ndcg_at_100
+ value: 45.660000000000004
+ - type: ndcg_at_1000
+ value: 53.02
+ - type: ndcg_at_3
+ value: 46.581
+ - type: ndcg_at_5
+ value: 43.836999999999996
+ - type: precision_at_1
+ value: 68.75
+ - type: precision_at_10
+ value: 32.800000000000004
+ - type: precision_at_100
+ value: 10.440000000000001
+ - type: precision_at_1000
+ value: 1.9980000000000002
+ - type: precision_at_3
+ value: 49.667
+ - type: precision_at_5
+ value: 42.25
+ - type: recall_at_1
+ value: 8.848
+ - type: recall_at_10
+ value: 24.467
+ - type: recall_at_100
+ value: 51.344
+ - type: recall_at_1000
+ value: 75.235
+ - type: recall_at_3
+ value: 15.329
+ - type: recall_at_5
+ value: 18.892999999999997
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/emotion
+ name: MTEB EmotionClassification
+ config: default
+ split: test
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
+ metrics:
+ - type: accuracy
+ value: 48.95
+ - type: f1
+ value: 43.44563593360779
+ - task:
+ type: Retrieval
+ dataset:
+ type: fever
+ name: MTEB FEVER
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 78.036
+ - type: map_at_10
+ value: 85.639
+ - type: map_at_100
+ value: 85.815
+ - type: map_at_1000
+ value: 85.829
+ - type: map_at_3
+ value: 84.795
+ - type: map_at_5
+ value: 85.336
+ - type: mrr_at_1
+ value: 84.353
+ - type: mrr_at_10
+ value: 90.582
+ - type: mrr_at_100
+ value: 90.617
+ - type: mrr_at_1000
+ value: 90.617
+ - type: mrr_at_3
+ value: 90.132
+ - type: mrr_at_5
+ value: 90.447
+ - type: ndcg_at_1
+ value: 84.353
+ - type: ndcg_at_10
+ value: 89.003
+ - type: ndcg_at_100
+ value: 89.60000000000001
+ - type: ndcg_at_1000
+ value: 89.836
+ - type: ndcg_at_3
+ value: 87.81400000000001
+ - type: ndcg_at_5
+ value: 88.478
+ - type: precision_at_1
+ value: 84.353
+ - type: precision_at_10
+ value: 10.482
+ - type: precision_at_100
+ value: 1.099
+ - type: precision_at_1000
+ value: 0.11399999999999999
+ - type: precision_at_3
+ value: 33.257999999999996
+ - type: precision_at_5
+ value: 20.465
+ - type: recall_at_1
+ value: 78.036
+ - type: recall_at_10
+ value: 94.517
+ - type: recall_at_100
+ value: 96.828
+ - type: recall_at_1000
+ value: 98.261
+ - type: recall_at_3
+ value: 91.12
+ - type: recall_at_5
+ value: 92.946
+ - task:
+ type: Retrieval
+ dataset:
+ type: fiqa
+ name: MTEB FiQA2018
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 20.191
+ - type: map_at_10
+ value: 32.369
+ - type: map_at_100
+ value: 34.123999999999995
+ - type: map_at_1000
+ value: 34.317
+ - type: map_at_3
+ value: 28.71
+ - type: map_at_5
+ value: 30.607
+ - type: mrr_at_1
+ value: 40.894999999999996
+ - type: mrr_at_10
+ value: 48.842
+ - type: mrr_at_100
+ value: 49.599
+ - type: mrr_at_1000
+ value: 49.647000000000006
+ - type: mrr_at_3
+ value: 46.785
+ - type: mrr_at_5
+ value: 47.672
+ - type: ndcg_at_1
+ value: 40.894999999999996
+ - type: ndcg_at_10
+ value: 39.872
+ - type: ndcg_at_100
+ value: 46.126
+ - type: ndcg_at_1000
+ value: 49.476
+ - type: ndcg_at_3
+ value: 37.153000000000006
+ - type: ndcg_at_5
+ value: 37.433
+ - type: precision_at_1
+ value: 40.894999999999996
+ - type: precision_at_10
+ value: 10.818
+ - type: precision_at_100
+ value: 1.73
+ - type: precision_at_1000
+ value: 0.231
+ - type: precision_at_3
+ value: 25.051000000000002
+ - type: precision_at_5
+ value: 17.531
+ - type: recall_at_1
+ value: 20.191
+ - type: recall_at_10
+ value: 45.768
+ - type: recall_at_100
+ value: 68.82000000000001
+ - type: recall_at_1000
+ value: 89.133
+ - type: recall_at_3
+ value: 33.296
+ - type: recall_at_5
+ value: 38.022
+ - task:
+ type: Retrieval
+ dataset:
+ type: hotpotqa
+ name: MTEB HotpotQA
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 39.257
+ - type: map_at_10
+ value: 61.467000000000006
+ - type: map_at_100
+ value: 62.364
+ - type: map_at_1000
+ value: 62.424
+ - type: map_at_3
+ value: 58.228
+ - type: map_at_5
+ value: 60.283
+ - type: mrr_at_1
+ value: 78.515
+ - type: mrr_at_10
+ value: 84.191
+ - type: mrr_at_100
+ value: 84.378
+ - type: mrr_at_1000
+ value: 84.385
+ - type: mrr_at_3
+ value: 83.284
+ - type: mrr_at_5
+ value: 83.856
+ - type: ndcg_at_1
+ value: 78.515
+ - type: ndcg_at_10
+ value: 69.78999999999999
+ - type: ndcg_at_100
+ value: 72.886
+ - type: ndcg_at_1000
+ value: 74.015
+ - type: ndcg_at_3
+ value: 65.23
+ - type: ndcg_at_5
+ value: 67.80199999999999
+ - type: precision_at_1
+ value: 78.515
+ - type: precision_at_10
+ value: 14.519000000000002
+ - type: precision_at_100
+ value: 1.694
+ - type: precision_at_1000
+ value: 0.184
+ - type: precision_at_3
+ value: 41.702
+ - type: precision_at_5
+ value: 27.046999999999997
+ - type: recall_at_1
+ value: 39.257
+ - type: recall_at_10
+ value: 72.59299999999999
+ - type: recall_at_100
+ value: 84.679
+ - type: recall_at_1000
+ value: 92.12
+ - type: recall_at_3
+ value: 62.552
+ - type: recall_at_5
+ value: 67.616
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/imdb
+ name: MTEB ImdbClassification
+ config: default
+ split: test
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
+ metrics:
+ - type: accuracy
+ value: 91.5152
+ - type: ap
+ value: 87.64584669595709
+ - type: f1
+ value: 91.50605576428437
+ - task:
+ type: Retrieval
+ dataset:
+ type: msmarco
+ name: MTEB MSMARCO
+ config: default
+ split: dev
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 21.926000000000002
+ - type: map_at_10
+ value: 34.049
+ - type: map_at_100
+ value: 35.213
+ - type: map_at_1000
+ value: 35.265
+ - type: map_at_3
+ value: 30.309
+ - type: map_at_5
+ value: 32.407000000000004
+ - type: mrr_at_1
+ value: 22.55
+ - type: mrr_at_10
+ value: 34.657
+ - type: mrr_at_100
+ value: 35.760999999999996
+ - type: mrr_at_1000
+ value: 35.807
+ - type: mrr_at_3
+ value: 30.989
+ - type: mrr_at_5
+ value: 33.039
+ - type: ndcg_at_1
+ value: 22.55
+ - type: ndcg_at_10
+ value: 40.842
+ - type: ndcg_at_100
+ value: 46.436
+ - type: ndcg_at_1000
+ value: 47.721999999999994
+ - type: ndcg_at_3
+ value: 33.209
+ - type: ndcg_at_5
+ value: 36.943
+ - type: precision_at_1
+ value: 22.55
+ - type: precision_at_10
+ value: 6.447
+ - type: precision_at_100
+ value: 0.9249999999999999
+ - type: precision_at_1000
+ value: 0.104
+ - type: precision_at_3
+ value: 14.136000000000001
+ - type: precision_at_5
+ value: 10.381
+ - type: recall_at_1
+ value: 21.926000000000002
+ - type: recall_at_10
+ value: 61.724999999999994
+ - type: recall_at_100
+ value: 87.604
+ - type: recall_at_1000
+ value: 97.421
+ - type: recall_at_3
+ value: 40.944
+ - type: recall_at_5
+ value: 49.915
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/mtop_domain
+ name: MTEB MTOPDomainClassification (en)
+ config: en
+ split: test
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+ metrics:
+ - type: accuracy
+ value: 93.54765161878704
+ - type: f1
+ value: 93.3298945415573
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/mtop_intent
+ name: MTEB MTOPIntentClassification (en)
+ config: en
+ split: test
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+ metrics:
+ - type: accuracy
+ value: 75.71591427268582
+ - type: f1
+ value: 59.32113870474471
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_massive_intent
+ name: MTEB MassiveIntentClassification (en)
+ config: en
+ split: test
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+ metrics:
+ - type: accuracy
+ value: 75.83053127101547
+ - type: f1
+ value: 73.60757944876475
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_massive_scenario
+ name: MTEB MassiveScenarioClassification (en)
+ config: en
+ split: test
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
+ metrics:
+ - type: accuracy
+ value: 78.72562205783457
+ - type: f1
+ value: 78.63761662505502
+ - task:
+ type: Clustering
+ dataset:
+ type: mteb/medrxiv-clustering-p2p
+ name: MTEB MedrxivClusteringP2P
+ config: default
+ split: test
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
+ metrics:
+ - type: v_measure
+ value: 33.37935633767996
+ - task:
+ type: Clustering
+ dataset:
+ type: mteb/medrxiv-clustering-s2s
+ name: MTEB MedrxivClusteringS2S
+ config: default
+ split: test
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
+ metrics:
+ - type: v_measure
+ value: 31.55270546130387
+ - task:
+ type: Reranking
+ dataset:
+ type: mteb/mind_small
+ name: MTEB MindSmallReranking
+ config: default
+ split: test
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
+ metrics:
+ - type: map
+ value: 30.462692753143834
+ - type: mrr
+ value: 31.497569753511563
+ - task:
+ type: Retrieval
+ dataset:
+ type: nfcorpus
+ name: MTEB NFCorpus
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 5.646
+ - type: map_at_10
+ value: 12.498
+ - type: map_at_100
+ value: 15.486
+ - type: map_at_1000
+ value: 16.805999999999997
+ - type: map_at_3
+ value: 9.325
+ - type: map_at_5
+ value: 10.751
+ - type: mrr_at_1
+ value: 43.034
+ - type: mrr_at_10
+ value: 52.662
+ - type: mrr_at_100
+ value: 53.189
+ - type: mrr_at_1000
+ value: 53.25
+ - type: mrr_at_3
+ value: 50.929
+ - type: mrr_at_5
+ value: 51.92
+ - type: ndcg_at_1
+ value: 41.796
+ - type: ndcg_at_10
+ value: 33.477000000000004
+ - type: ndcg_at_100
+ value: 29.996000000000002
+ - type: ndcg_at_1000
+ value: 38.864
+ - type: ndcg_at_3
+ value: 38.940000000000005
+ - type: ndcg_at_5
+ value: 36.689
+ - type: precision_at_1
+ value: 43.034
+ - type: precision_at_10
+ value: 24.799
+ - type: precision_at_100
+ value: 7.432999999999999
+ - type: precision_at_1000
+ value: 1.9929999999999999
+ - type: precision_at_3
+ value: 36.842000000000006
+ - type: precision_at_5
+ value: 32.135999999999996
+ - type: recall_at_1
+ value: 5.646
+ - type: recall_at_10
+ value: 15.963
+ - type: recall_at_100
+ value: 29.492
+ - type: recall_at_1000
+ value: 61.711000000000006
+ - type: recall_at_3
+ value: 10.585
+ - type: recall_at_5
+ value: 12.753999999999998
+ - task:
+ type: Retrieval
+ dataset:
+ type: nq
+ name: MTEB NQ
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 27.602
+ - type: map_at_10
+ value: 41.545
+ - type: map_at_100
+ value: 42.644999999999996
+ - type: map_at_1000
+ value: 42.685
+ - type: map_at_3
+ value: 37.261
+ - type: map_at_5
+ value: 39.706
+ - type: mrr_at_1
+ value: 31.141000000000002
+ - type: mrr_at_10
+ value: 44.139
+ - type: mrr_at_100
+ value: 44.997
+ - type: mrr_at_1000
+ value: 45.025999999999996
+ - type: mrr_at_3
+ value: 40.503
+ - type: mrr_at_5
+ value: 42.64
+ - type: ndcg_at_1
+ value: 31.141000000000002
+ - type: ndcg_at_10
+ value: 48.995
+ - type: ndcg_at_100
+ value: 53.788000000000004
+ - type: ndcg_at_1000
+ value: 54.730000000000004
+ - type: ndcg_at_3
+ value: 40.844
+ - type: ndcg_at_5
+ value: 44.955
+ - type: precision_at_1
+ value: 31.141000000000002
+ - type: precision_at_10
+ value: 8.233
+ - type: precision_at_100
+ value: 1.093
+ - type: precision_at_1000
+ value: 0.11800000000000001
+ - type: precision_at_3
+ value: 18.579
+ - type: precision_at_5
+ value: 13.533999999999999
+ - type: recall_at_1
+ value: 27.602
+ - type: recall_at_10
+ value: 69.216
+ - type: recall_at_100
+ value: 90.252
+ - type: recall_at_1000
+ value: 97.27
+ - type: recall_at_3
+ value: 47.987
+ - type: recall_at_5
+ value: 57.438
+ - task:
+ type: Retrieval
+ dataset:
+ type: quora
+ name: MTEB QuoraRetrieval
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 70.949
+ - type: map_at_10
+ value: 84.89999999999999
+ - type: map_at_100
+ value: 85.531
+ - type: map_at_1000
+ value: 85.548
+ - type: map_at_3
+ value: 82.027
+ - type: map_at_5
+ value: 83.853
+ - type: mrr_at_1
+ value: 81.69999999999999
+ - type: mrr_at_10
+ value: 87.813
+ - type: mrr_at_100
+ value: 87.917
+ - type: mrr_at_1000
+ value: 87.91799999999999
+ - type: mrr_at_3
+ value: 86.938
+ - type: mrr_at_5
+ value: 87.53999999999999
+ - type: ndcg_at_1
+ value: 81.75
+ - type: ndcg_at_10
+ value: 88.55499999999999
+ - type: ndcg_at_100
+ value: 89.765
+ - type: ndcg_at_1000
+ value: 89.871
+ - type: ndcg_at_3
+ value: 85.905
+ - type: ndcg_at_5
+ value: 87.41
+ - type: precision_at_1
+ value: 81.75
+ - type: precision_at_10
+ value: 13.403
+ - type: precision_at_100
+ value: 1.528
+ - type: precision_at_1000
+ value: 0.157
+ - type: precision_at_3
+ value: 37.597
+ - type: precision_at_5
+ value: 24.69
+ - type: recall_at_1
+ value: 70.949
+ - type: recall_at_10
+ value: 95.423
+ - type: recall_at_100
+ value: 99.509
+ - type: recall_at_1000
+ value: 99.982
+ - type: recall_at_3
+ value: 87.717
+ - type: recall_at_5
+ value: 92.032
+ - task:
+ type: Clustering
+ dataset:
+ type: mteb/reddit-clustering
+ name: MTEB RedditClustering
+ config: default
+ split: test
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
+ metrics:
+ - type: v_measure
+ value: 51.76962893449579
+ - task:
+ type: Clustering
+ dataset:
+ type: mteb/reddit-clustering-p2p
+ name: MTEB RedditClusteringP2P
+ config: default
+ split: test
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
+ metrics:
+ - type: v_measure
+ value: 62.32897690686379
+ - task:
+ type: Retrieval
+ dataset:
+ type: scidocs
+ name: MTEB SCIDOCS
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 4.478
+ - type: map_at_10
+ value: 11.994
+ - type: map_at_100
+ value: 13.977
+ - type: map_at_1000
+ value: 14.295
+ - type: map_at_3
+ value: 8.408999999999999
+ - type: map_at_5
+ value: 10.024
+ - type: mrr_at_1
+ value: 22.1
+ - type: mrr_at_10
+ value: 33.526
+ - type: mrr_at_100
+ value: 34.577000000000005
+ - type: mrr_at_1000
+ value: 34.632000000000005
+ - type: mrr_at_3
+ value: 30.217
+ - type: mrr_at_5
+ value: 31.962000000000003
+ - type: ndcg_at_1
+ value: 22.1
+ - type: ndcg_at_10
+ value: 20.191
+ - type: ndcg_at_100
+ value: 27.954
+ - type: ndcg_at_1000
+ value: 33.491
+ - type: ndcg_at_3
+ value: 18.787000000000003
+ - type: ndcg_at_5
+ value: 16.378999999999998
+ - type: precision_at_1
+ value: 22.1
+ - type: precision_at_10
+ value: 10.69
+ - type: precision_at_100
+ value: 2.1919999999999997
+ - type: precision_at_1000
+ value: 0.35200000000000004
+ - type: precision_at_3
+ value: 17.732999999999997
+ - type: precision_at_5
+ value: 14.499999999999998
+ - type: recall_at_1
+ value: 4.478
+ - type: recall_at_10
+ value: 21.657
+ - type: recall_at_100
+ value: 44.54
+ - type: recall_at_1000
+ value: 71.542
+ - type: recall_at_3
+ value: 10.778
+ - type: recall_at_5
+ value: 14.687
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sickr-sts
+ name: MTEB SICK-R
+ config: default
+ split: test
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
+ metrics:
+ - type: cos_sim_pearson
+ value: 82.82325259156718
+ - type: cos_sim_spearman
+ value: 79.2463589100662
+ - type: euclidean_pearson
+ value: 80.48318380496771
+ - type: euclidean_spearman
+ value: 79.34451935199979
+ - type: manhattan_pearson
+ value: 80.39041824178759
+ - type: manhattan_spearman
+ value: 79.23002892700211
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts12-sts
+ name: MTEB STS12
+ config: default
+ split: test
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
+ metrics:
+ - type: cos_sim_pearson
+ value: 85.74130231431258
+ - type: cos_sim_spearman
+ value: 78.36856568042397
+ - type: euclidean_pearson
+ value: 82.48301631890303
+ - type: euclidean_spearman
+ value: 78.28376980722732
+ - type: manhattan_pearson
+ value: 82.43552075450525
+ - type: manhattan_spearman
+ value: 78.22702443947126
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts13-sts
+ name: MTEB STS13
+ config: default
+ split: test
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
+ metrics:
+ - type: cos_sim_pearson
+ value: 79.96138619461459
+ - type: cos_sim_spearman
+ value: 81.85436343502379
+ - type: euclidean_pearson
+ value: 81.82895226665367
+ - type: euclidean_spearman
+ value: 82.22707349602916
+ - type: manhattan_pearson
+ value: 81.66303369445873
+ - type: manhattan_spearman
+ value: 82.05030197179455
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts14-sts
+ name: MTEB STS14
+ config: default
+ split: test
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
+ metrics:
+ - type: cos_sim_pearson
+ value: 80.05481244198648
+ - type: cos_sim_spearman
+ value: 80.85052504637808
+ - type: euclidean_pearson
+ value: 80.86728419744497
+ - type: euclidean_spearman
+ value: 81.033786401512
+ - type: manhattan_pearson
+ value: 80.90107531061103
+ - type: manhattan_spearman
+ value: 81.11374116827795
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts15-sts
+ name: MTEB STS15
+ config: default
+ split: test
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
+ metrics:
+ - type: cos_sim_pearson
+ value: 84.615220756399
+ - type: cos_sim_spearman
+ value: 86.46858500002092
+ - type: euclidean_pearson
+ value: 86.08307800247586
+ - type: euclidean_spearman
+ value: 86.72691443870013
+ - type: manhattan_pearson
+ value: 85.96155594487269
+ - type: manhattan_spearman
+ value: 86.605909505275
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts16-sts
+ name: MTEB STS16
+ config: default
+ split: test
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
+ metrics:
+ - type: cos_sim_pearson
+ value: 82.14363913634436
+ - type: cos_sim_spearman
+ value: 84.48430226487102
+ - type: euclidean_pearson
+ value: 83.75303424801902
+ - type: euclidean_spearman
+ value: 84.56762380734538
+ - type: manhattan_pearson
+ value: 83.6135447165928
+ - type: manhattan_spearman
+ value: 84.39898212616731
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts17-crosslingual-sts
+ name: MTEB STS17 (en-en)
+ config: en-en
+ split: test
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
+ metrics:
+ - type: cos_sim_pearson
+ value: 85.09909252554525
+ - type: cos_sim_spearman
+ value: 85.70951402743276
+ - type: euclidean_pearson
+ value: 87.1991936239908
+ - type: euclidean_spearman
+ value: 86.07745840612071
+ - type: manhattan_pearson
+ value: 87.25039137549952
+ - type: manhattan_spearman
+ value: 85.99938746659761
+ - task:
+ type: STS
+ dataset:
+ type: mteb/sts22-crosslingual-sts
+ name: MTEB STS22 (en)
+ config: en
+ split: test
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
+ metrics:
+ - type: cos_sim_pearson
+ value: 63.529332093413615
+ - type: cos_sim_spearman
+ value: 65.38177340147439
+ - type: euclidean_pearson
+ value: 66.35278011412136
+ - type: euclidean_spearman
+ value: 65.47147267032997
+ - type: manhattan_pearson
+ value: 66.71804682408693
+ - type: manhattan_spearman
+ value: 65.67406521423597
+ - task:
+ type: STS
+ dataset:
+ type: mteb/stsbenchmark-sts
+ name: MTEB STSBenchmark
+ config: default
+ split: test
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
+ metrics:
+ - type: cos_sim_pearson
+ value: 82.45802942885662
+ - type: cos_sim_spearman
+ value: 84.8853341842566
+ - type: euclidean_pearson
+ value: 84.60915021096707
+ - type: euclidean_spearman
+ value: 85.11181242913666
+ - type: manhattan_pearson
+ value: 84.38600521210364
+ - type: manhattan_spearman
+ value: 84.89045417981723
+ - task:
+ type: Reranking
+ dataset:
+ type: mteb/scidocs-reranking
+ name: MTEB SciDocsRR
+ config: default
+ split: test
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
+ metrics:
+ - type: map
+ value: 85.92793380635129
+ - type: mrr
+ value: 95.85834191226348
+ - task:
+ type: Retrieval
+ dataset:
+ type: scifact
+ name: MTEB SciFact
+ config: default
+ split: test
+ revision: None
+ metrics:
+ - type: map_at_1
+ value: 55.74400000000001
+ - type: map_at_10
+ value: 65.455
+ - type: map_at_100
+ value: 66.106
+ - type: map_at_1000
+ value: 66.129
+ - type: map_at_3
+ value: 62.719
+ - type: map_at_5
+ value: 64.441
+ - type: mrr_at_1
+ value: 58.667
+ - type: mrr_at_10
+ value: 66.776
+ - type: mrr_at_100
+ value: 67.363
+ - type: mrr_at_1000
+ value: 67.384
+ - type: mrr_at_3
+ value: 64.889
+ - type: mrr_at_5
+ value: 66.122
+ - type: ndcg_at_1
+ value: 58.667
+ - type: ndcg_at_10
+ value: 69.904
+ - type: ndcg_at_100
+ value: 72.807
+ - type: ndcg_at_1000
+ value: 73.423
+ - type: ndcg_at_3
+ value: 65.405
+ - type: ndcg_at_5
+ value: 67.86999999999999
+ - type: precision_at_1
+ value: 58.667
+ - type: precision_at_10
+ value: 9.3
+ - type: precision_at_100
+ value: 1.08
+ - type: precision_at_1000
+ value: 0.11299999999999999
+ - type: precision_at_3
+ value: 25.444
+ - type: precision_at_5
+ value: 17
+ - type: recall_at_1
+ value: 55.74400000000001
+ - type: recall_at_10
+ value: 82.122
+ - type: recall_at_100
+ value: 95.167
+ - type: recall_at_1000
+ value: 100
+ - type: recall_at_3
+ value: 70.14399999999999
+ - type: recall_at_5
+ value: 76.417
+ - task:
+ type: PairClassification
+ dataset:
+ type: mteb/sprintduplicatequestions-pairclassification
+ name: MTEB SprintDuplicateQuestions
+ config: default
+ split: test
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
+ metrics:
+ - type: cos_sim_accuracy
+ value: 99.86534653465347
+ - type: cos_sim_ap
+ value: 96.54142419791388
+ - type: cos_sim_f1
+ value: 93.07535641547861
+ - type: cos_sim_precision
+ value: 94.81327800829875
+ - type: cos_sim_recall
+ value: 91.4
+ - type: dot_accuracy
+ value: 99.86435643564356
+ - type: dot_ap
+ value: 96.53682260449868
+ - type: dot_f1
+ value: 92.98515104966718
+ - type: dot_precision
+ value: 95.27806925498426
+ - type: dot_recall
+ value: 90.8
+ - type: euclidean_accuracy
+ value: 99.86336633663366
+ - type: euclidean_ap
+ value: 96.5228676185697
+ - type: euclidean_f1
+ value: 92.9735234215886
+ - type: euclidean_precision
+ value: 94.70954356846472
+ - type: euclidean_recall
+ value: 91.3
+ - type: manhattan_accuracy
+ value: 99.85841584158416
+ - type: manhattan_ap
+ value: 96.50392760934032
+ - type: manhattan_f1
+ value: 92.84642321160581
+ - type: manhattan_precision
+ value: 92.8928928928929
+ - type: manhattan_recall
+ value: 92.80000000000001
+ - type: max_accuracy
+ value: 99.86534653465347
+ - type: max_ap
+ value: 96.54142419791388
+ - type: max_f1
+ value: 93.07535641547861
+ - task:
+ type: Clustering
+ dataset:
+ type: mteb/stackexchange-clustering
+ name: MTEB StackExchangeClustering
+ config: default
+ split: test
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
+ metrics:
+ - type: v_measure
+ value: 61.08285408766616
+ - task:
+ type: Clustering
+ dataset:
+ type: mteb/stackexchange-clustering-p2p
+ name: MTEB StackExchangeClusteringP2P
+ config: default
+ split: test
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
+ metrics:
+ - type: v_measure
+ value: 35.640675309010604
+ - task:
+ type: Reranking
+ dataset:
+ type: mteb/stackoverflowdupquestions-reranking
+ name: MTEB StackOverflowDupQuestions
+ config: default
+ split: test
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
+ metrics:
+ - type: map
+ value: 53.20333913710715
+ - type: mrr
+ value: 54.088813555725324
+ - task:
+ type: Summarization
+ dataset:
+ type: mteb/summeval
+ name: MTEB SummEval
+ config: default
+ split: test
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
+ metrics:
+ - type: cos_sim_pearson
+ value: 30.79465221925075
+ - type: cos_sim_spearman
+ value: 30.530816059163634
+ - type: dot_pearson
+ value: 31.364837244718043
+ - type: dot_spearman
+ value: 30.79726823684003
+ - task:
2314
+ type: Retrieval
2315
+ dataset:
2316
+ type: trec-covid
2317
+ name: MTEB TRECCOVID
2318
+ config: default
2319
+ split: test
2320
+ revision: None
2321
+ metrics:
2322
+ - type: map_at_1
2323
+ value: 0.22599999999999998
2324
+ - type: map_at_10
2325
+ value: 1.735
2326
+ - type: map_at_100
2327
+ value: 8.978
2328
+ - type: map_at_1000
2329
+ value: 20.851
2330
+ - type: map_at_3
2331
+ value: 0.613
2332
+ - type: map_at_5
2333
+ value: 0.964
2334
+ - type: mrr_at_1
2335
+ value: 88
2336
+ - type: mrr_at_10
2337
+ value: 92.867
2338
+ - type: mrr_at_100
2339
+ value: 92.867
2340
+ - type: mrr_at_1000
2341
+ value: 92.867
2342
+ - type: mrr_at_3
2343
+ value: 92.667
2344
+ - type: mrr_at_5
2345
+ value: 92.667
2346
+ - type: ndcg_at_1
2347
+ value: 82
2348
+ - type: ndcg_at_10
2349
+ value: 73.164
2350
+ - type: ndcg_at_100
2351
+ value: 51.878
2352
+ - type: ndcg_at_1000
2353
+ value: 44.864
2354
+ - type: ndcg_at_3
2355
+ value: 79.184
2356
+ - type: ndcg_at_5
2357
+ value: 76.39
2358
+ - type: precision_at_1
2359
+ value: 88
2360
+ - type: precision_at_10
2361
+ value: 76.2
2362
+ - type: precision_at_100
2363
+ value: 52.459999999999994
2364
+ - type: precision_at_1000
2365
+ value: 19.692
2366
+ - type: precision_at_3
2367
+ value: 82.667
2368
+ - type: precision_at_5
2369
+ value: 80
2370
+ - type: recall_at_1
2371
+ value: 0.22599999999999998
2372
+ - type: recall_at_10
2373
+ value: 1.942
2374
+ - type: recall_at_100
2375
+ value: 12.342
2376
+ - type: recall_at_1000
2377
+ value: 41.42
2378
+ - type: recall_at_3
2379
+ value: 0.637
2380
+ - type: recall_at_5
2381
+ value: 1.034
2382
+ - task:
2383
+ type: Retrieval
2384
+ dataset:
2385
+ type: webis-touche2020
2386
+ name: MTEB Touche2020
2387
+ config: default
2388
+ split: test
2389
+ revision: None
2390
+ metrics:
2391
+ - type: map_at_1
2392
+ value: 3.567
2393
+ - type: map_at_10
2394
+ value: 13.116
2395
+ - type: map_at_100
2396
+ value: 19.39
2397
+ - type: map_at_1000
2398
+ value: 20.988
2399
+ - type: map_at_3
2400
+ value: 7.109
2401
+ - type: map_at_5
2402
+ value: 9.950000000000001
2403
+ - type: mrr_at_1
2404
+ value: 42.857
2405
+ - type: mrr_at_10
2406
+ value: 57.404999999999994
2407
+ - type: mrr_at_100
2408
+ value: 58.021
2409
+ - type: mrr_at_1000
2410
+ value: 58.021
2411
+ - type: mrr_at_3
2412
+ value: 54.762
2413
+ - type: mrr_at_5
2414
+ value: 56.19
2415
+ - type: ndcg_at_1
2416
+ value: 38.775999999999996
2417
+ - type: ndcg_at_10
2418
+ value: 30.359
2419
+ - type: ndcg_at_100
2420
+ value: 41.284
2421
+ - type: ndcg_at_1000
2422
+ value: 52.30200000000001
2423
+ - type: ndcg_at_3
2424
+ value: 36.744
2425
+ - type: ndcg_at_5
2426
+ value: 34.326
2427
+ - type: precision_at_1
2428
+ value: 42.857
2429
+ - type: precision_at_10
2430
+ value: 26.122
2431
+ - type: precision_at_100
2432
+ value: 8.082
2433
+ - type: precision_at_1000
2434
+ value: 1.559
2435
+ - type: precision_at_3
2436
+ value: 40.136
2437
+ - type: precision_at_5
2438
+ value: 35.510000000000005
2439
+ - type: recall_at_1
2440
+ value: 3.567
2441
+ - type: recall_at_10
2442
+ value: 19.045
2443
+ - type: recall_at_100
2444
+ value: 49.979
2445
+ - type: recall_at_1000
2446
+ value: 84.206
2447
+ - type: recall_at_3
2448
+ value: 8.52
2449
+ - type: recall_at_5
2450
+ value: 13.103000000000002
2451
+ - task:
2452
+ type: Classification
2453
+ dataset:
2454
+ type: mteb/toxic_conversations_50k
2455
+ name: MTEB ToxicConversationsClassification
2456
+ config: default
2457
+ split: test
2458
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2459
+ metrics:
2460
+ - type: accuracy
2461
+ value: 68.8394
2462
+ - type: ap
2463
+ value: 13.454399712443099
2464
+ - type: f1
2465
+ value: 53.04963076364322
2466
+ - task:
2467
+ type: Classification
2468
+ dataset:
2469
+ type: mteb/tweet_sentiment_extraction
2470
+ name: MTEB TweetSentimentExtractionClassification
2471
+ config: default
2472
+ split: test
2473
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2474
+ metrics:
2475
+ - type: accuracy
2476
+ value: 60.546123372948514
2477
+ - type: f1
2478
+ value: 60.86952793277713
2479
+ - task:
2480
+ type: Clustering
2481
+ dataset:
2482
+ type: mteb/twentynewsgroups-clustering
2483
+ name: MTEB TwentyNewsgroupsClustering
2484
+ config: default
2485
+ split: test
2486
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2487
+ metrics:
2488
+ - type: v_measure
2489
+ value: 49.10042955060234
2490
+ - task:
2491
+ type: PairClassification
2492
+ dataset:
2493
+ type: mteb/twittersemeval2015-pairclassification
2494
+ name: MTEB TwitterSemEval2015
2495
+ config: default
2496
+ split: test
2497
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2498
+ metrics:
2499
+ - type: cos_sim_accuracy
2500
+ value: 85.03308100375514
2501
+ - type: cos_sim_ap
2502
+ value: 71.08284605869684
2503
+ - type: cos_sim_f1
2504
+ value: 65.42539436255494
2505
+ - type: cos_sim_precision
2506
+ value: 64.14807302231237
2507
+ - type: cos_sim_recall
2508
+ value: 66.75461741424802
2509
+ - type: dot_accuracy
2510
+ value: 84.68736961316088
2511
+ - type: dot_ap
2512
+ value: 69.20524036530992
2513
+ - type: dot_f1
2514
+ value: 63.54893953365829
2515
+ - type: dot_precision
2516
+ value: 63.45698500394633
2517
+ - type: dot_recall
2518
+ value: 63.641160949868066
2519
+ - type: euclidean_accuracy
2520
+ value: 85.07480479227513
2521
+ - type: euclidean_ap
2522
+ value: 71.14592761009864
2523
+ - type: euclidean_f1
2524
+ value: 65.43814432989691
2525
+ - type: euclidean_precision
2526
+ value: 63.95465994962216
2527
+ - type: euclidean_recall
2528
+ value: 66.99208443271768
2529
+ - type: manhattan_accuracy
2530
+ value: 85.06288370984085
2531
+ - type: manhattan_ap
2532
+ value: 71.07289742593868
2533
+ - type: manhattan_f1
2534
+ value: 65.37585421412301
2535
+ - type: manhattan_precision
2536
+ value: 62.816147859922175
2537
+ - type: manhattan_recall
2538
+ value: 68.15303430079156
2539
+ - type: max_accuracy
2540
+ value: 85.07480479227513
2541
+ - type: max_ap
2542
+ value: 71.14592761009864
2543
+ - type: max_f1
2544
+ value: 65.43814432989691
2545
+ - task:
2546
+ type: PairClassification
2547
+ dataset:
2548
+ type: mteb/twitterurlcorpus-pairclassification
2549
+ name: MTEB TwitterURLCorpus
2550
+ config: default
2551
+ split: test
2552
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2553
+ metrics:
2554
+ - type: cos_sim_accuracy
2555
+ value: 87.79058485659952
2556
+ - type: cos_sim_ap
2557
+ value: 83.7183187008759
2558
+ - type: cos_sim_f1
2559
+ value: 75.86921142180798
2560
+ - type: cos_sim_precision
2561
+ value: 73.00683371298405
2562
+ - type: cos_sim_recall
2563
+ value: 78.96519864490298
2564
+ - type: dot_accuracy
2565
+ value: 87.0085768618776
2566
+ - type: dot_ap
2567
+ value: 81.87467488474279
2568
+ - type: dot_f1
2569
+ value: 74.04188363990559
2570
+ - type: dot_precision
2571
+ value: 72.10507114191901
2572
+ - type: dot_recall
2573
+ value: 76.08561749307053
2574
+ - type: euclidean_accuracy
2575
+ value: 87.8332751193387
2576
+ - type: euclidean_ap
2577
+ value: 83.83585648120315
2578
+ - type: euclidean_f1
2579
+ value: 76.02582177042369
2580
+ - type: euclidean_precision
2581
+ value: 73.36388371759989
2582
+ - type: euclidean_recall
2583
+ value: 78.88820449645827
2584
+ - type: manhattan_accuracy
2585
+ value: 87.87208444910156
2586
+ - type: manhattan_ap
2587
+ value: 83.8101950642973
2588
+ - type: manhattan_f1
2589
+ value: 75.90454195535027
2590
+ - type: manhattan_precision
2591
+ value: 72.44419564761039
2592
+ - type: manhattan_recall
2593
+ value: 79.71204188481676
2594
+ - type: max_accuracy
2595
+ value: 87.87208444910156
2596
+ - type: max_ap
2597
+ value: 83.83585648120315
2598
+ - type: max_f1
2599
+ value: 76.02582177042369
2600
+ license: mit
2601
+ language:
2602
+ - en
2603
+ pipeline_tag: sentence-similarity
2604
+ ---
2605
+
2606
+
2607
+ <h1 align="center">FlagEmbedding</h1>
2608
+
2609
+
2610
+ <h4 align="center">
2611
+ <p>
2612
+ <a href="#model-list">Model List</a> |
2613
+ <a href="#usage">Usage</a> |
2614
+ <a href="#evaluation">Evaluation</a> |
2615
+ <a href="#train">Train</a> |
2616
+ <a href="#contact">Contact</a> |
2617
+ <a href="#license">License</a>
2618
+ </p>
2619
+ </h4>
2620
+
2621
+ For more details, please refer to our GitHub repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding).
2622
+
2623
+ [English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
2624
+
2625
+ FlagEmbedding can map any text to a low-dimensional dense vector which can be used for tasks like retrieval, classification, clustering, or semantic search.
2626
+ It can also be used in vector databases for LLMs.
2627
+
2628
+ ************* 🌟**Updates**🌟 *************
2629
+ - 08/09/2023: BGE models are integrated into **Langchain**; you can use them like [**this**](#using-langchain). The C-MTEB **leaderboard** is [available](https://huggingface.co/spaces/mteb/leaderboard).
2630
+ - 08/05/2023: Released base-scale and small-scale models, with the **best performance among models of the same size 🤗**
2631
+ - 08/02/2023: Released `bge-large-*` (short for BAAI General Embedding) models, which **rank 1st on the MTEB and C-MTEB benchmarks!**
2632
+ - 08/01/2023: We released the [Chinese Massive Text Embedding Benchmark](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB) (**C-MTEB**), consisting of 31 test datasets.
2633
+
2634
+
2635
+ ## Model List
2636
+
2637
+ `bge` is short for `BAAI general embedding`.
2638
+
2639
+ | Model | Language | Description | query instruction for retrieval\* |
2640
+ |:-------------------------------|:--------:| :--------:| :--------:|
2641
+ | [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en) | English | rank **1st** in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard | `Represent this sentence for searching relevant passages: ` |
2642
+ | [BAAI/bge-base-en](https://huggingface.co/BAAI/bge-base-en) | English | rank **2nd** in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard | `Represent this sentence for searching relevant passages: ` |
2643
+ | [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en) | English | a small-scale model but with competitive performance | `Represent this sentence for searching relevant passages: ` |
2644
+ | [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | Chinese | rank **1st** in [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB) benchmark | `为这个句子生成表示以用于检索相关文章:` |
2645
+ | [BAAI/bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct) | Chinese | trained without instruction; ranks **2nd** in [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB) benchmark | |
2646
+ | [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh) | Chinese | a base-scale model with ability similar to `bge-large-zh` | `为这个句子生成表示以用于检索相关文章:` |
2647
+ | [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh) | Chinese | a small-scale model but with competitive performance | `为这个句子生成表示以用于检索相关文章:` |
2648
+
2649
+ \*: If you need to retrieve **long** relevant passages for a **short** query (the s2p retrieval task), add the instruction to the query; in all other cases, no instruction is needed and you can use the original query directly. In all cases, **no instruction** needs to be added to passages.
2650
+
2651
+ ## Usage
2652
+
2653
+ Here are some examples of using `bge` models with
2654
+ [FlagEmbedding](#using-flagembedding), [Sentence-Transformers](#using-sentence-transformers), [Langchain](#using-langchain), or [Huggingface Transformers](#using-huggingface-transformers).
2655
+
2656
+ #### Using FlagEmbedding
2657
+ ```
2658
+ pip install -U FlagEmbedding
2659
+ ```
2660
+ If that doesn't work for you, see [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md) for other ways to install FlagEmbedding.
2661
+
2662
+ ```python
2663
+ from FlagEmbedding import FlagModel
2664
+ sentences = ["样例数据-1", "样例数据-2"]
2665
+ model = FlagModel('BAAI/bge-large-zh', query_instruction_for_retrieval="为这个句子生成表示以用于检索相关文章:")
2666
+ embeddings_1 = model.encode(sentences)
2667
+ embeddings_2 = model.encode(sentences)
2668
+ similarity = embeddings_1 @ embeddings_2.T
2669
+ print(similarity)
2670
+
2671
+ # for the s2p (short query to long passage) retrieval task, use encode_queries(), which automatically adds the instruction to each query
2672
+ # the corpus in a retrieval task can still use encode() or encode_corpus(), since passages don't need the instruction
2673
+ queries = ['query_1', 'query_2']
2674
+ passages = ["样例文档-1", "样例文档-2"]
2675
+ q_embeddings = model.encode_queries(queries)
2676
+ p_embeddings = model.encode(passages)
2677
+ scores = q_embeddings @ p_embeddings.T
2678
+ ```
2679
+ For the value of the `query_instruction_for_retrieval` argument, see [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list).
2680
+
2681
+ FlagModel uses all available GPUs when encoding; set `os.environ["CUDA_VISIBLE_DEVICES"]` to choose which GPUs it may use.
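
As a minimal sketch of the GPU selection mentioned above: the helper name `select_gpus` is hypothetical, not part of FlagEmbedding. The variable generally has to be set before CUDA is initialized, so it is safest to call this before importing `torch` or `FlagEmbedding`.

```python
import os


def select_gpus(gpu_ids):
    """Restrict CUDA to the given device indices (hypothetical helper).

    Only sets the environment variable; it does not import or touch CUDA.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Expose only GPUs 0 and 1 to libraries imported afterwards.
visible = select_gpus([0, 1])
```

After this call, libraries loaded later in the same process should see only the listed devices.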
2682
+
2683
+
2684
+ #### Using Sentence-Transformers
2685
+
2686
+ Using this model is also easy when you have [sentence-transformers](https://www.SBERT.net) installed:
2687
+
2688
+ ```
2689
+ pip install -U sentence-transformers
2690
+ ```
2691
+ ```python
2692
+ from sentence_transformers import SentenceTransformer
2693
+ sentences = ["样例数据-1", "样例数据-2"]
2694
+ model = SentenceTransformer('BAAI/bge-large-zh')
2695
+ embeddings_1 = model.encode(sentences, normalize_embeddings=True)
2696
+ embeddings_2 = model.encode(sentences, normalize_embeddings=True)
2697
+ similarity = embeddings_1 @ embeddings_2.T
2698
+ print(similarity)
2699
+ ```
2700
+ For the s2p (short query to long passage) retrieval task,
2701
+ each short query should start with an instruction (for the instructions, see [Model List](https://github.com/FlagOpen/FlagEmbedding/tree/master#model-list)).
2702
+ The instruction is not needed for passages.
2703
+ ```python
2704
+ from sentence_transformers import SentenceTransformer
2705
+ queries = ['query_1', 'query_2']
2706
+ passages = ["样例文档-1", "样例文档-2"]
2707
+ instruction = "为这个句子生成表示以用于检索相关文章:"
2708
+
2709
+ model = SentenceTransformer('BAAI/bge-large-zh')
2710
+ q_embeddings = model.encode([instruction+q for q in queries], normalize_embeddings=True)
2711
+ p_embeddings = model.encode(passages, normalize_embeddings=True)
2712
+ scores = q_embeddings @ p_embeddings.T
2713
+ ```
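
The examples above score pairs with `q_embeddings @ p_embeddings.T` on normalized embeddings. The following sketch, using toy 3-dimensional vectors and standard-library Python only, illustrates why that works: for L2-normalized vectors the dot product equals the cosine similarity, so sorting by it ranks passages by cosine. The function names and vectors are illustrative assumptions, not model output.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_passages(q_embedding, p_embeddings):
    """Return passage indices sorted by descending dot-product score."""
    scores = [dot(q_embedding, p) for p in p_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings.
query = l2_normalize([1.0, 0.0, 1.0])
passages = [l2_normalize(v) for v in ([0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0])]
order = rank_passages(query, passages)
```

Here the second passage points in the same direction as the query, so it ranks first regardless of its original length.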
2714
+
2715
+ #### Using Langchain
2716
+
2717
+ You can use `bge` in langchain like this:
2718
+ ```python
2719
+ from langchain.embeddings import HuggingFaceBgeEmbeddings
2720
+ model_name = "BAAI/bge-small-en"
2721
+ model_kwargs = {'device': 'cuda'}
2722
+ encode_kwargs = {'normalize_embeddings': True} # set True to compute cosine similarity
2723
+ model_norm = HuggingFaceBgeEmbeddings(
2724
+     model_name=model_name,
2725
+     model_kwargs=model_kwargs,
2726
+     encode_kwargs=encode_kwargs
2727
+ )
2728
+ ```
2729
+
2730
+
2731
+ #### Using HuggingFace Transformers
2732
+
2733
+ With the transformers package, you can use the model like this: first, pass your input through the transformer model, then select the last hidden state of the first token (i.e., [CLS]) as the sentence embedding.
2734
+
2735
+ ```python
2736
+ from transformers import AutoTokenizer, AutoModel
2737
+ import torch
2738
+ # Sentences we want sentence embeddings for
2739
+ sentences = ["样例数据-1", "样例数据-2"]
2740
+
2741
+ # Load model from HuggingFace Hub
2742
+ tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-large-zh')
2743
+ model = AutoModel.from_pretrained('BAAI/bge-large-zh')
2744
+
2745
+ # Tokenize sentences
2746
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
2747
+ # for the s2p (short query to long passage) retrieval task, add an instruction to each query (do not add an instruction to passages)
2748
+ # encoded_input = tokenizer([instruction + q for q in queries], padding=True, truncation=True, return_tensors='pt')
2749
+
2750
+ # Compute token embeddings
2751
+ with torch.no_grad():
2752
+     model_output = model(**encoded_input)
2753
+ # Perform pooling. In this case, cls pooling.
2754
+ sentence_embeddings = model_output[0][:, 0]
2755
+ # normalize embeddings
2756
+ sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
2757
+ print("Sentence embeddings:", sentence_embeddings)
2758
+ ```
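
To make the pooling step concrete, here is a small sketch of what `model_output[0][:, 0]` selects, written with nested Python lists instead of tensors. The toy values and the helper name `cls_pool` are assumptions for illustration; the layout is `[batch][token][hidden]`.

```python
def cls_pool(last_hidden_state):
    """Take the hidden state of the first token ([CLS]) from each sequence."""
    return [sequence[0] for sequence in last_hidden_state]


# Toy "last hidden state": batch of 2 sequences, 3 tokens each, hidden size 2.
hidden = [
    [[3.0, 4.0], [9.0, 9.0], [9.0, 9.0]],
    [[0.0, 2.0], [7.0, 7.0], [7.0, 7.0]],
]
sentence_vectors = cls_pool(hidden)
```

Only the first token's vector survives per sequence; the normalization shown in the original example would then be applied to these vectors.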
2759
+
2760
+
2761
+ ## Evaluation
2762
+ `baai-general-embedding` models achieve **state-of-the-art performance on both the MTEB and C-MTEB leaderboards!**
2763
+ For more details and evaluation tools, see our [scripts](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md).
2764
+
2765
+ - **MTEB**:
2766
+
2767
+ | Model Name | Dimension | Sequence Length | Average (56) | Retrieval (15) |Clustering (11) | Pair Classification (3) | Reranking (4) | STS (10) | Summarization (1) | Classification (12) |
2768
+ |:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
2769
+ | [**bge-large-en**](https://huggingface.co/BAAI/bge-large-en) | 1024 | 512 | **63.98** | **53.9** | **46.98** | 85.8 | **59.48** | 81.56 | 32.06 | **76.21** |
2770
+ | [**bge-base-en**](https://huggingface.co/BAAI/bge-base-en) | 768 | 512 | 63.36 | 53.0 | 46.32 | 85.86 | 58.7 | 81.84 | 29.27 | 75.27 |
2771
+ | [gte-large](https://huggingface.co/thenlper/gte-large) | 1024 | 512 | 63.13 | 52.22 | 46.84 | 85.00 | 59.13 | 83.35 | 31.66 | 73.33 |
2772
+ | [gte-base](https://huggingface.co/thenlper/gte-base) | 768 | 512 | 62.39 | 51.14 | 46.2 | 84.57 | 58.61 | 82.3 | 31.17 | 73.01 |
2773
+ | [e5-large-v2](https://huggingface.co/intfloat/e5-large-v2) | 1024| 512 | 62.25 | 50.56 | 44.49 | 86.03 | 56.61 | 82.05 | 30.19 | 75.24 |
2774
+ | [**bge-small-en**](https://huggingface.co/BAAI/bge-small-en) | 384 | 512 | 62.11 | 51.82 | 44.31 | 83.78 | 57.97 | 80.72 | 30.53 | 74.37 |
2775
+ | [instructor-xl](https://huggingface.co/hkunlp/instructor-xl) | 768 | 512 | 61.79 | 49.26 | 44.74 | 86.62 | 57.29 | 83.06 | 32.32 | 61.79 |
2776
+ | [e5-base-v2](https://huggingface.co/intfloat/e5-base-v2) | 768 | 512 | 61.5 | 50.29 | 43.80 | 85.73 | 55.91 | 81.05 | 30.28 | 73.84 |
2777
+ | [gte-small](https://huggingface.co/thenlper/gte-small) | 384 | 512 | 61.36 | 49.46 | 44.89 | 83.54 | 57.7 | 82.07 | 30.42 | 72.31 |
2778
+ | [text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings) | 1536 | 8192 | 60.99 | 49.25 | 45.9 | 84.89 | 56.32 | 80.97 | 30.8 | 70.93 |
2779
+ | [e5-small-v2](https://huggingface.co/intfloat/e5-small-v2) | 384 | 512 | 59.93 | 49.04 | 39.92 | 84.67 | 54.32 | 80.39 | 31.16 | 72.94 |
2780
+ | [sentence-t5-xxl](https://huggingface.co/sentence-transformers/sentence-t5-xxl) | 768 | 512 | 59.51 | 42.24 | 43.72 | 85.06 | 56.42 | 82.63 | 30.08 | 73.42 |
2781
+ | [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) | 768 | 514 | 57.78 | 43.81 | 43.69 | 83.04 | 59.36 | 80.28 | 27.49 | 65.07 |
2782
+ | [sgpt-bloom-7b1-msmarco](https://huggingface.co/bigscience/sgpt-bloom-7b1-msmarco) | 4096 | 2048 | 57.59 | 48.22 | 38.93 | 81.9 | 55.65 | 77.74 | 33.6 | 66.19 |
2783
+ | [all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2) | 384 | 512 | 56.53 | 42.69 | 41.81 | 82.41 | 58.44 | 79.8 | 27.9 | 63.21 |
2784
+ | [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) | 384 | 512 | 56.26 | 41.95 | 42.35 | 82.37 | 58.04 | 78.9 | 30.81 | 63.05 |
2785
+ | [contriever-base-msmarco](https://huggingface.co/nthakur/contriever-base-msmarco) | 768 | 512 | 56.00 | 41.88 | 41.1 | 82.54 | 53.14 | 76.51 | 30.36 | 66.68 |
2786
+ | [sentence-t5-base](https://huggingface.co/sentence-transformers/sentence-t5-base) | 768 | 512 | 55.27 | 33.63 | 40.21 | 85.18 | 53.09 | 81.14 | 31.39 | 69.81 |
2787
+
2788
+
2789
+
2790
+ - **C-MTEB**:
2791
+ We created C-MTEB, a benchmark for Chinese text embedding that consists of 31 datasets across 6 tasks.
2792
+ Please refer to [C_MTEB](https://github.com/FlagOpen/FlagEmbedding/blob/master/C_MTEB/README.md) for a detailed introduction.
2793
+
2794
+ | Model | Embedding dimension | Avg | Retrieval | STS | PairClassification | Classification | Reranking | Clustering |
2795
+ |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
2796
+ | [**bge-large-zh**](https://huggingface.co/BAAI/bge-large-zh) | 1024 | **64.20** | **71.53** | **53.23** | **78.94** | 72.26 | **65.11** | 48.39 |
2797
+ | [**bge-large-zh-noinstruct**](https://huggingface.co/BAAI/bge-large-zh-noinstruct) | 1024 | 63.53 | 70.55 | 50.98 | 76.77 | **72.49** | 64.91 | **50.01** |
2798
+ | [**BAAI/bge-base-zh**](https://huggingface.co/BAAI/bge-base-zh) | 768 | 62.96 | 69.53 | 52.05 | 77.5 | 70.98 | 64.91 | 47.63 |
2799
+ | [**BAAI/bge-small-zh**](https://huggingface.co/BAAI/bge-small-zh) | 512 | 58.27 | 63.07 | 46.87 | 70.35 | 67.78 | 61.48 | 45.09 |
2800
+ | [m3e-base](https://huggingface.co/moka-ai/m3e-base) | 768 | 57.10 |56.91 | 48.15 | 63.99 | 70.28 | 59.34 | 47.68 |
2801
+ | [m3e-large](https://huggingface.co/moka-ai/m3e-large) | 1024 | 57.05 |54.75 | 48.64 | 64.3 | 71.22 | 59.66 | 48.88 |
2802
+ | [text-embedding-ada-002(OpenAI)](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings) | 1536 | 53.02 | 52.0 | 40.61 | 69.56 | 67.38 | 54.28 | 45.68 |
2803
+ | [luotuo](https://huggingface.co/silk-road/luotuo-bert-medium) | 1024 | 49.37 | 44.4 | 39.41 | 66.62 | 65.29 | 49.25 | 44.39 |
2804
+ | [text2vec](https://huggingface.co/shibing624/text2vec-base-chinese) | 768 | 47.63 | 38.79 | 41.71 | 67.41 | 65.18 | 49.45 | 37.66 |
2805
+ | [text2vec-large](https://huggingface.co/GanymedeNil/text2vec-large-chinese) | 1024 | 47.36 | 41.94 | 41.98 | 70.86 | 63.42 | 49.16 | 30.02 |
2806
+
2807
+
2808
+
2809
+ ## Train
2810
+ This section describes how we trained the general embedding models.
2811
+ The training scripts are in [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md),
2812
+ and we provide some examples for [pre-training](https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/pretrain/README.md) and [fine-tuning](https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/finetune/README.md).
2813
+
2814
+
2815
+ **1. RetroMAE Pre-train**
2816
+ We pre-trained the model following the [RetroMAE](https://github.com/staoxiao/RetroMAE) method,
2817
+ which shows promising improvement on retrieval tasks ([paper](https://aclanthology.org/2022.emnlp-main.35.pdf)).
2818
+ The pre-training was conducted on 24 A100 (40G) GPUs with a batch size of 720.
2819
+ In RetroMAE, the mask ratios of the encoder and decoder are 0.3 and 0.5, respectively.
2820
+ We used the AdamW optimizer with a learning rate of 2e-5.
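
The two mask ratios can be pictured with a simplified sketch of random token masking. This is not the RetroMAE implementation (its decoder is more involved); it only assumes that a fraction `mask_ratio` of positions is replaced with `[MASK]`, and the helper name and sample sentence are hypothetical.

```python
import random


def mask_tokens(tokens, mask_ratio, rng, mask_token="[MASK]"):
    """Replace round(len(tokens) * mask_ratio) randomly chosen tokens with mask_token."""
    n_mask = round(len(tokens) * mask_ratio)
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))
    return [mask_token if i in masked_positions else t for i, t in enumerate(tokens)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tokens = "the quick brown fox jumps over the lazy dog today".split()  # 10 tokens

encoder_input = mask_tokens(tokens, 0.3, rng)  # lighter masking for the encoder
decoder_input = mask_tokens(tokens, 0.5, rng)  # heavier masking for the decoder
```

With 10 tokens, the encoder view hides 3 positions and the decoder view hides 5, which is the asymmetry the ratios above describe.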
2821
+
2822
+ **Pre-training data**:
2823
+ - English:
2824
+ - [Pile](https://pile.eleuther.ai/)
2825
+ - [wikipedia](https://huggingface.co/datasets/wikipedia)
2826
+ - [msmarco](https://huggingface.co/datasets/Tevatron/msmarco-passage-corpus)
2827
+ - Chinese:
2828
+ - [wudao](https://github.com/BAAI-WuDao/Data)
2829
+
2830
+
2831
+ **2. Finetune**
2832
+ We fine-tune the model using a contrastive objective.
2833
+ The format of the input data is a triple `(query, positive, negative)`.
2834
+ Besides the negative in the triple, we also adopt an in-batch negatives strategy.
2835
+ We employ the cross-device negatives sharing method to share negatives among different GPUs,
2836
+ which can dramatically **increase the number of negatives**.
2837
+
2838
+ We trained our model on 48 A100(40G) GPUs with a large batch size of 32,768 (so there are **65,535** negatives for each query in a batch).
2839
+ We used the AdamW optimizer with a learning rate of 1e-5.
2840
+ The temperature for the contrastive loss is 0.01.
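
The figures above can be checked with a short sketch. It assumes one reading that is consistent with the stated numbers: each triple contributes one positive and one negative passage, and every passage in the batch except a query's own positive serves as a negative. The loss is a generic temperature-scaled cross-entropy (InfoNCE-style), not necessarily the exact training code; function names are hypothetical.

```python
import math


def negatives_per_query(batch_size):
    """Passages in the batch (2 per triple) minus the query's own positive."""
    return 2 * batch_size - 1


def contrastive_loss(scores, positive_index, temperature=0.01):
    """Cross-entropy over similarity scores divided by the temperature."""
    logits = [s / temperature for s in scores]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[positive_index]


n_neg = negatives_per_query(32768)

# Similarity of one query to [positive, negative, negative].
loss_good = contrastive_loss([0.9, 0.2, 0.1], positive_index=0)  # positive scored highest
loss_bad = contrastive_loss([0.2, 0.9, 0.1], positive_index=0)   # a negative scored highest
```

With a temperature of 0.01, a similarity gap of 0.7 becomes a logit gap of 70, so the loss is near zero when the positive wins and about 70 when a negative wins. This shows how strongly a low temperature sharpens the objective.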
2841
+
2842
+ In addition, during training we add the instruction to the query for the s2p (short query to long passage) retrieval task (nothing is added to passages).
2843
+ For English, the instruction is `Represent this sentence for searching relevant passages: `;
2844
+ for Chinese, the instruction is `为这个句子生成表示以用于检索相关文章:`.
2845
+ In evaluation, the instruction should be added to queries in retrieval tasks and not added for other tasks.
2846
+ Note that the instruction is not needed for passages.
2847
+
2848
+ The fine-tuning script is available in this repository: [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/README.md).
2849
+ You can easily finetune your model with it.
2850
+
2851
+ **Training data**:
2852
+
2853
+ - For English, we collected 230M text pairs from [wikipedia](https://huggingface.co/datasets/wikipedia), [cc-net](https://github.com/facebookresearch/cc_net), and other sources.
2854
+
2855
+ - For Chinese, we collected 120M text pairs from [wudao](https://github.com/BAAI-WuDao/Data), [simclue](https://github.com/CLUEbenchmark/SimCLUE), and other sources.
2856
+
2857
+ **The data collection is to be released in the future.**
2858
+
2859
+ We will continually update the embedding models and training code,
2860
+ hoping to promote the development of the embedding model community.
2861
+
2862
+
2863
+
2864
+ ## License
2865
+ FlagEmbedding is licensed under [MIT License](https://github.com/FlagOpen/FlagEmbedding/blob/master/LICENSE). The released models can be used for commercial purposes free of charge.
2866
+
2867
+
2868
+
config.json ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ {
2
+ "bos_token": "<s>",
3
+ "eos_token": "</s>",
4
+ "layer_norm_epsilon": 1e-12,
5
+ "unk_token": "[UNK]"
6
+ }
model.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f31c0401e28950b9a85f863a4963d82dce362008a7c3036d0d86c252c284723a
3
+ size 34433870
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ {
2
+ "cls_token": "[CLS]",
3
+ "mask_token": "[MASK]",
4
+ "pad_token": "[PAD]",
5
+ "sep_token": "[SEP]",
6
+ "unk_token": "[UNK]"
7
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "clean_up_tokenization_spaces": true,
3
+ "cls_token": "[CLS]",
4
+ "do_basic_tokenize": true,
5
+ "do_lower_case": true,
6
+ "mask_token": "[MASK]",
7
+ "model_max_length": 512,
8
+ "never_split": null,
9
+ "pad_token": "[PAD]",
10
+ "sep_token": "[SEP]",
11
+ "strip_accents": null,
12
+ "tokenize_chinese_chars": true,
13
+ "tokenizer_class": "BertTokenizer",
14
+ "unk_token": "[UNK]"
15
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff
 
vocabulary.json ADDED
The diff for this file is too large to render. See raw diff
 
vocabulary.txt ADDED
The diff for this file is too large to render. See raw diff