elsayovita committed
Commit 1152e94
Parent(s): 6f8c255

Add new SentenceTransformer model.
1_Pooling/config.json ADDED

```json
{
    "word_embedding_dimension": 384,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
```
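This pooling config enables mean-token pooling only: token embeddings are averaged over non-padding positions, weighted by the attention mask. A minimal NumPy sketch of what the configured Pooling module computes (an illustration, not the library's code):

```python
import numpy as np

def mean_pooling(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding positions.

    token_embeddings: (seq_len, dim); attention_mask: (seq_len,) of 0/1.
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=0)
    count = np.clip(mask.sum(), 1e-9, None)  # guard against all-padding input
    return summed / count

# Toy 2-dimensional example (the real model uses dim 384)
tokens = np.array([[1.0, 3.0], [3.0, 5.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])  # third position is padding and is ignored
print(mean_pooling(tokens, mask))  # [2. 4.]
```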
README.md ADDED
---
base_model: TaylorAI/bge-micro-v2
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:11863
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: In the fiscal year 2022, the emissions were categorized into different
    scopes, with each scope representing a specific source of emissions
  sentences:
  - 'Question: What is NetLink proactive in identifying to be more efficient in? '
  - What standard is the Environment, Health, and Safety Management System (EHSMS)
    audited to by a third-party accredited certification body at the operational assets
    level of CLI?
  - What do the different scopes represent in terms of emissions in the fiscal year
    2022?
- source_sentence: NetLink is committed to protecting the security of all information
    and information systems, including both end-user data and corporate data. To this
    end, management ensures that the appropriate IT policies, personal data protection
    policy, risk mitigation strategies, cyber security programmes, systems, processes,
    and controls are in place to protect our IT systems and confidential data
  sentences:
  - '"What recognition did NetLink receive in FY22?"'
  - What measures does NetLink have in place to protect the security of all information
    and information systems, including end-user data and corporate data?
  - 'Question: What does Disclosure 102-10 discuss regarding the organization and
    its supply chain?'
- source_sentence: In the domain of economic performance, the focus is on the financial
    health and growth of the organization, ensuring sustainable profitability and
    value creation for stakeholders
  sentences:
  - What does NetLink prioritize by investing in its network to ensure reliability
    and quality of infrastructure?
  - What percentage of the total energy was accounted for by heat, steam, and chilled
    water in 2021 according to the given information?
  - What is the focus in the domain of economic performance, ensuring sustainable
    profitability and value creation for stakeholders?
- source_sentence: Disclosure 102-41 discusses collective bargaining agreements and
    is found on page 98
  sentences:
  - What topic is discussed in Disclosure 102-41 on page 98 of the document?
  - What was the number of cases in 2021, following a decrease from 42 cases in 2020?
  - What type of data does GRI 101 provide in relation to connecting the nation?
- source_sentence: Employee health and well-being has never been more topical than
    it was in the past year. We understand that people around the world, including
    our employees, have been increasingly exposed to factors affecting their physical
    and mental wellbeing. We are committed to creating an environment that supports
    our employees and ensures they feel valued and have a sense of belonging. We utilised
  sentences:
  - What aspect of the standard covers the evaluation of the management approach?
  - 'Question: What is the company''s commitment towards its employees'' health and
    well-being based on the provided context information?'
  - What types of skills does NetLink focus on developing through their training and
    development opportunities for employees?
model-index:
- name: BGE micro v2 ESG
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 384
      type: dim_384
    metrics:
    - type: cosine_accuracy@1
      value: 0.7549523729242181
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8991823316193206
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9237123830397033
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9447020146674534
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7549523729242181
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2997274438731068
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1847424766079407
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09447020146674537
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.020970899247894956
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02497728698942558
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.025658677306658433
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.026241722629651493
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.18912117167223944
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8309359566693303
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.023120117824201005
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.7496417432352693
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8958105032453848
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9187389361881481
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9417516648402596
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7496417432352693
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2986035010817949
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1837477872376296
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09417516648402599
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.020823381756535267
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02488362509014959
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.025520526005226342
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.026159768467784998
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.188171652806899
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8261983036492017
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.022991454812532088
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.7355643597740875
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8874652280198938
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9105622523813538
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9341650509989041
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7355643597740875
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2958217426732979
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1821124504762708
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09341650509989044
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.02043234332705799
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.0246518118894415
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.025293395899482058
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02594902919441401
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.18580500893220617
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8144083444724101
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.022667974495178208
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.6972098120205682
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8493635673944196
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.8830818511337772
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.913175419371154
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6972098120205682
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.28312118913147316
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.17661637022675547
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09131754193711542
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.019366939222793565
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.023593432427622775
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.024530051420382712
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02536598387142095
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.1787893349481174
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7792686076088251
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.021712360244980362
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 32
      type: dim_32
    metrics:
    - type: cosine_accuracy@1
      value: 0.5974036921520695
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.7523392059344179
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.7970159318890668
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8448115990896063
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.5974036921520695
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.25077973531147263
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.15940318637781337
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08448115990896064
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.016594547004224157
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02089831127595606
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.022139331441362976
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.023466988863600182
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.15933281345013575
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6849689711507925
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.019142044257794796
      name: Cosine Map@100
---

# BGE micro v2 ESG

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [TaylorAI/bge-micro-v2](https://huggingface.co/TaylorAI/bge-micro-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [TaylorAI/bge-micro-v2](https://huggingface.co/TaylorAI/bge-micro-v2) <!-- at revision 3edf6d7de0faa426b09780416fe61009f26ae589 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference:
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("elsayovita/bge-micro-v2-esg-v2")
# Run inference
sentences = [
    'Employee health and well-being has never been more topical than it was in the past year. We understand that people around the world, including our employees, have been increasingly exposed to factors affecting their physical and mental wellbeing. We are committed to creating an environment that supports our employees and ensures they feel valued and have a sense of belonging. We utilised',
    "Question: What is the company's commitment towards its employees' health and well-being based on the provided context information?",
    'What types of skills does NetLink focus on developing through their training and development opportunities for employees?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
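Because the model was trained with MatryoshkaLoss at dimensions 384/256/128/64/32, embeddings can be truncated to a shorter prefix and re-normalized with modest quality loss (see the evaluation tables). A minimal NumPy sketch of that truncation step, assuming you already have an array of embeddings; recent sentence-transformers versions can also do this for you via the `truncate_dim` argument:

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize for cosine similarity."""
    truncated = embeddings[..., :dim]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Random vectors standing in for model output (shape: 3 sentences x 384 dims)
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 384))
small = truncate_embeddings(full, 64)
print(small.shape)  # (3, 64)
```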

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_384`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.755 |
| cosine_accuracy@3 | 0.8992 |
| cosine_accuracy@5 | 0.9237 |
| cosine_accuracy@10 | 0.9447 |
| cosine_precision@1 | 0.755 |
| cosine_precision@3 | 0.2997 |
| cosine_precision@5 | 0.1847 |
| cosine_precision@10 | 0.0945 |
| cosine_recall@1 | 0.021 |
| cosine_recall@3 | 0.025 |
| cosine_recall@5 | 0.0257 |
| cosine_recall@10 | 0.0262 |
| cosine_ndcg@10 | 0.1891 |
| cosine_mrr@10 | 0.8309 |
| **cosine_map@100** | **0.0231** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:----------|
| cosine_accuracy@1 | 0.7496 |
| cosine_accuracy@3 | 0.8958 |
| cosine_accuracy@5 | 0.9187 |
| cosine_accuracy@10 | 0.9418 |
| cosine_precision@1 | 0.7496 |
| cosine_precision@3 | 0.2986 |
| cosine_precision@5 | 0.1837 |
| cosine_precision@10 | 0.0942 |
| cosine_recall@1 | 0.0208 |
| cosine_recall@3 | 0.0249 |
| cosine_recall@5 | 0.0255 |
| cosine_recall@10 | 0.0262 |
| cosine_ndcg@10 | 0.1882 |
| cosine_mrr@10 | 0.8262 |
| **cosine_map@100** | **0.023** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7356 |
| cosine_accuracy@3 | 0.8875 |
| cosine_accuracy@5 | 0.9106 |
| cosine_accuracy@10 | 0.9342 |
| cosine_precision@1 | 0.7356 |
| cosine_precision@3 | 0.2958 |
| cosine_precision@5 | 0.1821 |
| cosine_precision@10 | 0.0934 |
| cosine_recall@1 | 0.0204 |
| cosine_recall@3 | 0.0247 |
| cosine_recall@5 | 0.0253 |
| cosine_recall@10 | 0.0259 |
| cosine_ndcg@10 | 0.1858 |
| cosine_mrr@10 | 0.8144 |
| **cosine_map@100** | **0.0227** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.6972 |
| cosine_accuracy@3 | 0.8494 |
| cosine_accuracy@5 | 0.8831 |
| cosine_accuracy@10 | 0.9132 |
| cosine_precision@1 | 0.6972 |
| cosine_precision@3 | 0.2831 |
| cosine_precision@5 | 0.1766 |
| cosine_precision@10 | 0.0913 |
| cosine_recall@1 | 0.0194 |
| cosine_recall@3 | 0.0236 |
| cosine_recall@5 | 0.0245 |
| cosine_recall@10 | 0.0254 |
| cosine_ndcg@10 | 0.1788 |
| cosine_mrr@10 | 0.7793 |
| **cosine_map@100** | **0.0217** |

#### Information Retrieval
* Dataset: `dim_32`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.5974 |
| cosine_accuracy@3 | 0.7523 |
| cosine_accuracy@5 | 0.797 |
| cosine_accuracy@10 | 0.8448 |
| cosine_precision@1 | 0.5974 |
| cosine_precision@3 | 0.2508 |
| cosine_precision@5 | 0.1594 |
| cosine_precision@10 | 0.0845 |
| cosine_recall@1 | 0.0166 |
| cosine_recall@3 | 0.0209 |
| cosine_recall@5 | 0.0221 |
| cosine_recall@10 | 0.0235 |
| cosine_ndcg@10 | 0.1593 |
| cosine_mrr@10 | 0.685 |
| **cosine_map@100** | **0.0191** |

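For reference, the accuracy@k and MRR@10 figures in the tables above follow the usual information-retrieval definitions. A small, self-contained sketch of the two metrics, using hypothetical rankings rather than this model's output:

```python
def accuracy_at_k(ranked_results, relevant, k):
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(any(doc in relevant[q] for doc in docs[:k])
               for q, docs in ranked_results.items())
    return hits / len(ranked_results)

def mrr_at_k(ranked_results, relevant, k=10):
    """Mean reciprocal rank of the first relevant document within the top k."""
    total = 0.0
    for q, docs in ranked_results.items():
        for rank, doc in enumerate(docs[:k], start=1):
            if doc in relevant[q]:
                total += 1.0 / rank
                break
    return total / len(ranked_results)

# Hypothetical rankings for two queries; q2 has no relevant hit at all
ranked = {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4", "d6"]}
relevant = {"q1": {"d1"}, "q2": {"d9"}}
print(accuracy_at_k(ranked, relevant, 1))  # 0.0
print(mrr_at_k(ranked, relevant))          # 0.25 (q1's first hit is at rank 2)
```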
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 11,863 training samples
* Columns: <code>context</code> and <code>question</code>
* Approximate statistics based on the first 1000 samples:
  | | context | question |
  |:--------|:--------|:---------|
  | type | string | string |
  | details | <ul><li>min: 13 tokens</li><li>mean: 40.74 tokens</li><li>max: 277 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 24.4 tokens</li><li>max: 62 tokens</li></ul> |
* Samples:
  | context | question |
  |:--------|:---------|
  | <code>The engagement with key stakeholders involves various topics and methods throughout the year</code> | <code>Question: What does the engagement with key stakeholders involve throughout the year?</code> |
  | <code>For unitholders and analysts, the focus is on business and operations, the release of financial results, and the overall performance and announcements</code> | <code>Question: What is the focus for unitholders and analysts in terms of business and operations, financial results, performance, and announcements?</code> |
  | <code>These are communicated through press releases and other required disclosures via SGXNet and NetLink's website</code> | <code>What platform is used to communicate press releases and required disclosures for NetLink?</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          384,
          256,
          128,
          64,
          32
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```

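MatryoshkaLoss applies the wrapped inner loss at each configured truncation dimension and combines the results with the given weights; with all weights equal to 1, as here, the total is a plain sum over the five dimensions. A schematic of that weighting (an illustration, not the library's implementation):

```python
def combined_matryoshka_loss(per_dim_losses, weights):
    # One inner-loss value per truncation dim (384, 256, 128, 64, 32), weighted sum
    return sum(w * l for w, l in zip(weights, per_dim_losses))

# Hypothetical per-dimension loss values; smaller dims typically lose more
print(combined_matryoshka_loss([0.5, 0.6, 0.7, 0.9, 1.2], [1, 1, 1, 1, 1]))  # 3.9
```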
### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: False
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

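Note that with gradient accumulation, the batch size seen by each optimizer step is larger than the per-device batch size:

```python
per_device_train_batch_size = 32
gradient_accumulation_steps = 16

# Each optimizer step accumulates gradients over 16 batches of 32 samples
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 512
```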
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: False
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch | Step | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_32_cosine_map@100 | dim_384_cosine_map@100 | dim_64_cosine_map@100 |
|:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|:---------------------:|
| 0.4313 | 10 | 5.2501 | - | - | - | - | - |
| 0.8625 | 20 | 3.4967 | - | - | - | - | - |
| 1.0350 | 24 | - | 0.0221 | 0.0224 | 0.0185 | 0.0226 | 0.0210 |
| 1.2264 | 30 | 3.1196 | - | - | - | - | - |
| 1.6577 | 40 | 2.4428 | - | - | - | - | - |
| 2.0458 | 49 | - | 0.0226 | 0.0229 | 0.0189 | 0.0230 | 0.0215 |
| 2.0216 | 50 | 2.2222 | - | - | - | - | - |
| 2.4528 | 60 | 2.3441 | - | - | - | - | - |
| 2.8841 | 70 | 2.0096 | - | - | - | - | - |
| 3.0566 | 74 | - | 0.0227 | 0.0230 | 0.0191 | 0.0231 | 0.0217 |
| 3.2480 | 80 | 2.3019 | - | - | - | - | - |
| 3.6792 | 90 | 1.9538 | - | - | - | - | - |
| **3.7655** | **92** | **-** | **0.0227** | **0.023** | **0.0191** | **0.0231** | **0.0217** |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.4.0+cu121
- Accelerate: 0.32.1
- Datasets: 2.21.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "TaylorAI/bge-micro-v2",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 3,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.42.4",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.42.4",
+     "pytorch": "2.4.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:702b99b04e570422feb4a5884e8062bd9c9e51b450ab5f62bdc3bed3751c4557
+ size 69565312
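The checkpoint size is consistent with the architecture in config.json. A back-of-the-envelope estimate (an illustrative sketch; it assumes the standard BERT parameter layout including the `BertModel` pooler, and ignores the small safetensors JSON header):

```python
# Rough parameter count for config.json above (BERT-style, float32 weights),
# checked against the ~69.6 MB model.safetensors size.
hidden, layers, inter = 384, 3, 1536
vocab, positions, types = 30522, 512, 2

embeddings = (vocab + positions + types) * hidden + 2 * hidden  # + embedding LayerNorm
attention = 4 * (hidden * hidden + hidden)                      # Q, K, V, output proj
ffn = hidden * inter + inter + inter * hidden + hidden          # two dense layers
layer = attention + ffn + 2 * 2 * hidden                        # + two LayerNorms
pooler = hidden * hidden + hidden                               # BertModel pooler head

params = embeddings + layers * layer + pooler
print(params)       # ~17.4M parameters
print(params * 4)   # float32 bytes, within ~6 KB of the 69,565,312-byte file
```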
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
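modules.json wires a Transformer module into a Pooling module, and 1_Pooling/config.json enables `pooling_mode_mean_tokens`. A toy pure-Python illustration of that pooling step (the real pipeline uses `sentence_transformers.models.Pooling` on PyTorch tensors; the example vectors are made up):

```python
# Mean pooling as configured in 1_Pooling/config.json: average the token
# embeddings, skipping padding positions via the attention mask.

def mean_pool(token_embeddings, attention_mask):
    """token_embeddings: list of per-token vectors; attention_mask: 1 = real token."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]

# Two real tokens plus one padding token: the padded vector is ignored.
print(mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0]))  # [2.0, 3.0]
```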
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
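The `max_seq_length` of 512 caps what the encoder sees: longer inputs are truncated before embedding. A sketch of how a BERT-style cap works (toy whitespace tokens stand in for WordPiece; `truncate_for_bert` is a hypothetical helper, not part of this repository):

```python
# BERT-style encoders wrap inputs in [CLS] ... [SEP], so at most
# max_seq_length - 2 content tokens survive truncation.

def truncate_for_bert(tokens, max_seq_length=512):
    """Keep at most max_seq_length - 2 tokens, then add the [CLS]/[SEP] markers."""
    kept = tokens[: max_seq_length - 2]
    return ["[CLS]"] + kept + ["[SEP]"]

long_input = ["tok"] * 600
print(len(truncate_for_bert(long_input)))  # 512
```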
special_tokens_map.json ADDED
@@ -0,0 +1,44 @@
+ {
+   "additional_special_tokens": [
+     "[PAD]",
+     "[UNK]",
+     "[CLS]",
+     "[SEP]",
+     "[MASK]"
+   ],
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,71 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [
+     "[PAD]",
+     "[UNK]",
+     "[CLS]",
+     "[SEP]",
+     "[MASK]"
+   ],
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff