elsayovita committed
Commit: fc8db67
Parent: 5d2ad82

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
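This pooling configuration enables mean pooling only: token embeddings from the 384-dimensional encoder are averaged using the attention mask, so padding tokens do not contribute to the sentence embedding. A minimal sketch of the operation, using the base checkpoint through plain Transformers for illustration (not the exact internal code path):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TaylorAI/bge-micro-v2")
encoder = AutoModel.from_pretrained("TaylorAI/bge-micro-v2")

batch = tokenizer(["ESG disclosure example"], padding=True, return_tensors="pt")
token_embeddings = encoder(**batch).last_hidden_state        # (batch, seq_len, 384)

# Mean pooling: average the token vectors, ignoring padded positions.
mask = batch["attention_mask"].unsqueeze(-1).float()         # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(sentence_embeddings.shape)                              # torch.Size([1, 384])
```
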
README.md ADDED
@@ -0,0 +1,811 @@
---
base_model: TaylorAI/bge-micro-v2
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:11863
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: In the fiscal year 2022, the emissions were categorized into different
    scopes, with each scope representing a specific source of emissions
  sentences:
  - 'Question: What is NetLink proactive in identifying to be more efficient in? '
  - What standard is the Environment, Health, and Safety Management System (EHSMS)
    audited to by a third-party accredited certification body at the operational assets
    level of CLI?
  - What do the different scopes represent in terms of emissions in the fiscal year
    2022?
- source_sentence: NetLink is committed to protecting the security of all information
    and information systems, including both end-user data and corporate data. To this
    end, management ensures that the appropriate IT policies, personal data protection
    policy, risk mitigation strategies, cyber security programmes, systems, processes,
    and controls are in place to protect our IT systems and confidential data
  sentences:
  - '"What recognition did NetLink receive in FY22?"'
  - What measures does NetLink have in place to protect the security of all information
    and information systems, including end-user data and corporate data?
  - 'Question: What does Disclosure 102-10 discuss regarding the organization and
    its supply chain?'
- source_sentence: In the domain of economic performance, the focus is on the financial
    health and growth of the organization, ensuring sustainable profitability and
    value creation for stakeholders
  sentences:
  - What does NetLink prioritize by investing in its network to ensure reliability
    and quality of infrastructure?
  - What percentage of the total energy was accounted for by heat, steam, and chilled
    water in 2021 according to the given information?
  - What is the focus in the domain of economic performance, ensuring sustainable
    profitability and value creation for stakeholders?
- source_sentence: Disclosure 102-41 discusses collective bargaining agreements and
    is found on page 98
  sentences:
  - What topic is discussed in Disclosure 102-41 on page 98 of the document?
  - What was the number of cases in 2021, following a decrease from 42 cases in 2020?
  - What type of data does GRI 101 provide in relation to connecting the nation?
- source_sentence: Employee health and well-being has never been more topical than
    it was in the past year. We understand that people around the world, including
    our employees, have been increasingly exposed to factors affecting their physical
    and mental wellbeing. We are committed to creating an environment that supports
    our employees and ensures they feel valued and have a sense of belonging. We utilised
  sentences:
  - What aspect of the standard covers the evaluation of the management approach?
  - 'Question: What is the company''s commitment towards its employees'' health and
    well-being based on the provided context information?'
  - What types of skills does NetLink focus on developing through their training and
    development opportunities for employees?
model-index:
- name: BGE micro v2 ESG
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 384
      type: dim_384
    metrics:
    - type: cosine_accuracy@1
      value: 0.7393576666947652
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8871280451825002
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9143555593020315
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9382955407569755
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7393576666947652
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2957093483941667
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1828711118604063
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09382955407569755
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.020537712963743484
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.024642445699513908
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.02539876553616755
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.026063765021027103
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.18655528566337626
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8176322873975245
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.022756262897092067
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.731602461434713
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8831661468431257
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9111523223467926
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9355137823484785
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.731602461434713
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2943887156143752
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.18223046446935853
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09355137823484787
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.020322290595408698
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.024532392967864608
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.02530978673185536
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02598649395412441
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.1854736961250685
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8120234114607371
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.022602117473168613
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.7171035994267891
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8735564359774087
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9012897243530305
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.927927168507123
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7171035994267891
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2911854786591362
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1802579448706061
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09279271685071232
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.019919544428521924
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02426545655492803
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.025035825676473073
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.025775754680753424
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.18301753980732727
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7997301868287288
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.022264162086570314
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.6758829975554245
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8359605496080249
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.8713647475343504
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9060945797858889
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6758829975554245
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2786535165360083
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1742729495068701
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0906094579785889
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.018774527709872903
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.0232211263780007
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.024204576320398637
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.025169293882941365
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.17554680827328792
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7621402212294056
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.02123787521914149
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 32
      type: dim_32
    metrics:
    - type: cosine_accuracy@1
      value: 0.575908286268229
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.7347214026806036
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.780156790019388
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8298069628255922
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.575908286268229
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.24490713422686783
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1560313580038776
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08298069628255922
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.015997452396339696
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.020408927852238995
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.021671021944983007
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02305019341182201
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.1551668722356578
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6648409286443452
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.01858718928494409
      name: Cosine Map@100
---

# BGE micro v2 ESG

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [TaylorAI/bge-micro-v2](https://huggingface.co/TaylorAI/bge-micro-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [TaylorAI/bge-micro-v2](https://huggingface.co/TaylorAI/bge-micro-v2) <!-- at revision 3edf6d7de0faa426b09780416fe61009f26ae589 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("elsayovita/bge-micro-v2-esg")
# Run inference
sentences = [
    'Employee health and well-being has never been more topical than it was in the past year. We understand that people around the world, including our employees, have been increasingly exposed to factors affecting their physical and mental wellbeing. We are committed to creating an environment that supports our employees and ensures they feel valued and have a sense of belonging. We utilised',
    "Question: What is the company's commitment towards its employees' health and well-being based on the provided context information?",
    'What types of skills does NetLink focus on developing through their training and development opportunities for employees?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

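Because the model was trained with a Matryoshka objective over the dimensions 384, 256, 128, 64 and 32, embeddings can also be truncated to a smaller size at load time. A minimal sketch (256 is only an example; see the evaluation tables below for how quality changes at lower dimensions):

```python
from sentence_transformers import SentenceTransformer

# Keep only the first 256 embedding dimensions; any of the trained
# Matryoshka dimensions (384, 256, 128, 64, 32) should work here.
model_256 = SentenceTransformer("elsayovita/bge-micro-v2-esg", truncate_dim=256)

embeddings = model_256.encode(["NetLink invests in network reliability."])
print(embeddings.shape)
# (1, 256)
```
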
<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_384`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7394 |
| cosine_accuracy@3 | 0.8871 |
| cosine_accuracy@5 | 0.9144 |
| cosine_accuracy@10 | 0.9383 |
| cosine_precision@1 | 0.7394 |
| cosine_precision@3 | 0.2957 |
| cosine_precision@5 | 0.1829 |
| cosine_precision@10 | 0.0938 |
| cosine_recall@1 | 0.0205 |
| cosine_recall@3 | 0.0246 |
| cosine_recall@5 | 0.0254 |
| cosine_recall@10 | 0.0261 |
| cosine_ndcg@10 | 0.1866 |
| cosine_mrr@10 | 0.8176 |
| **cosine_map@100** | **0.0228** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7316 |
| cosine_accuracy@3 | 0.8832 |
| cosine_accuracy@5 | 0.9112 |
| cosine_accuracy@10 | 0.9355 |
| cosine_precision@1 | 0.7316 |
| cosine_precision@3 | 0.2944 |
| cosine_precision@5 | 0.1822 |
| cosine_precision@10 | 0.0936 |
| cosine_recall@1 | 0.0203 |
| cosine_recall@3 | 0.0245 |
| cosine_recall@5 | 0.0253 |
| cosine_recall@10 | 0.026 |
| cosine_ndcg@10 | 0.1855 |
| cosine_mrr@10 | 0.812 |
| **cosine_map@100** | **0.0226** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7171 |
| cosine_accuracy@3 | 0.8736 |
| cosine_accuracy@5 | 0.9013 |
| cosine_accuracy@10 | 0.9279 |
| cosine_precision@1 | 0.7171 |
| cosine_precision@3 | 0.2912 |
| cosine_precision@5 | 0.1803 |
| cosine_precision@10 | 0.0928 |
| cosine_recall@1 | 0.0199 |
| cosine_recall@3 | 0.0243 |
| cosine_recall@5 | 0.025 |
| cosine_recall@10 | 0.0258 |
| cosine_ndcg@10 | 0.183 |
| cosine_mrr@10 | 0.7997 |
| **cosine_map@100** | **0.0223** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.6759 |
| cosine_accuracy@3 | 0.836 |
| cosine_accuracy@5 | 0.8714 |
| cosine_accuracy@10 | 0.9061 |
| cosine_precision@1 | 0.6759 |
| cosine_precision@3 | 0.2787 |
| cosine_precision@5 | 0.1743 |
| cosine_precision@10 | 0.0906 |
| cosine_recall@1 | 0.0188 |
| cosine_recall@3 | 0.0232 |
| cosine_recall@5 | 0.0242 |
| cosine_recall@10 | 0.0252 |
| cosine_ndcg@10 | 0.1755 |
| cosine_mrr@10 | 0.7621 |
| **cosine_map@100** | **0.0212** |

#### Information Retrieval
* Dataset: `dim_32`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.5759 |
| cosine_accuracy@3 | 0.7347 |
| cosine_accuracy@5 | 0.7802 |
| cosine_accuracy@10 | 0.8298 |
| cosine_precision@1 | 0.5759 |
| cosine_precision@3 | 0.2449 |
| cosine_precision@5 | 0.156 |
| cosine_precision@10 | 0.083 |
| cosine_recall@1 | 0.016 |
| cosine_recall@3 | 0.0204 |
| cosine_recall@5 | 0.0217 |
| cosine_recall@10 | 0.0231 |
| cosine_ndcg@10 | 0.1552 |
| cosine_mrr@10 | 0.6648 |
| **cosine_map@100** | **0.0186** |

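The tables above come from the `InformationRetrievalEvaluator`. A rough sketch of how such an evaluation could be reproduced (the query, corpus and relevance dictionaries below are placeholders, not the actual evaluation split behind these numbers):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("elsayovita/bge-micro-v2-esg")

# Placeholder data: query id -> question, doc id -> context,
# query id -> set of relevant doc ids.
queries = {"q1": "What platform is used to communicate press releases for NetLink?"}
corpus = {"d1": "These are communicated through press releases and other required "
                "disclosures via SGXNet and NetLink's website"}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_384",
)
print(evaluator(model))  # accuracy@k, precision@k, recall@k, NDCG@10, MRR@10, MAP@100
```
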
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 11,863 training samples
* Columns: <code>context</code> and <code>question</code>
* Approximate statistics based on the first 1000 samples:
  |         | context | question |
  |:--------|:--------|:---------|
  | type    | string  | string   |
  | details | <ul><li>min: 13 tokens</li><li>mean: 40.74 tokens</li><li>max: 277 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 24.4 tokens</li><li>max: 62 tokens</li></ul> |
* Samples:
  | context | question |
  |:--------|:---------|
  | <code>The engagement with key stakeholders involves various topics and methods throughout the year</code> | <code>Question: What does the engagement with key stakeholders involve throughout the year?</code> |
  | <code>For unitholders and analysts, the focus is on business and operations, the release of financial results, and the overall performance and announcements</code> | <code>Question: What is the focus for unitholders and analysts in terms of business and operations, financial results, performance, and announcements?</code> |
  | <code>These are communicated through press releases and other required disclosures via SGXNet and NetLink's website</code> | <code>What platform is used to communicate press releases and required disclosures for NetLink?</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          384,
          256,
          128,
          64,
          32
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```

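A minimal sketch of how a loss with these parameters can be constructed in sentence-transformers (the column names `context` and `question` follow the dataset description above; the weights default to one per dimension, as in the JSON):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("TaylorAI/bge-micro-v2")

# In-batch negatives over (context, question) pairs ...
base_loss = MultipleNegativesRankingLoss(model)

# ... applied at every truncated dimensionality, each weighted equally.
loss = MatryoshkaLoss(
    model,
    base_loss,
    matryoshka_dims=[384, 256, 128, 64, 32],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```
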
### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 2
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: False
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

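For reference, the non-default values above map onto `SentenceTransformerTrainingArguments` roughly as follows (a sketch; `output_dir` and `save_strategy` are assumptions, not values reported in this card):

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-micro-v2-esg",  # assumed output path
    num_train_epochs=2,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=False,
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed; must match eval_strategy when loading the best model at the end
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```
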
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: False
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch | Step | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_32_cosine_map@100 | dim_384_cosine_map@100 | dim_64_cosine_map@100 |
|:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|:---------------------:|
| 0.4313 | 10 | 5.0772 | - | - | - | - | - |
| 0.8625 | 20 | 3.2666 | - | - | - | - | - |
| 1.0350 | 24 | - | 0.0221 | 0.0224 | 0.0185 | 0.0226 | 0.0211 |
| 1.2264 | 30 | 3.1157 | - | - | - | - | - |
| 1.6577 | 40 | 2.585 | - | - | - | - | - |
| **1.9164** | **46** | **-** | **0.0223** | **0.0226** | **0.0186** | **0.0228** | **0.0212** |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.4.0+cu121
- Accelerate: 0.32.1
- Datasets: 2.21.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,31 @@
{
  "_name_or_path": "TaylorAI/bge-micro-v2",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 3,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.42.4",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.42.4",
    "pytorch": "2.4.0+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fbee868052a84747aebc36f015fd21e77732ed7c44e7975e34910e2afde9b514
size 69565312
modules.json ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,44 @@
{
  "additional_special_tokens": [
    "[PAD]",
    "[UNK]",
    "[CLS]",
    "[SEP]",
    "[MASK]"
  ],
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,71 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [
    "[PAD]",
    "[UNK]",
    "[CLS]",
    "[SEP]",
    "[MASK]"
  ],
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "max_length": 512,
  "model_max_length": 512,
  "never_split": null,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff