IlhamEbdesk committed on
Commit ada6ac6
1 Parent(s): 4091a83

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": true,
    "pooling_mode_mean_tokens": false,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,790 @@
---
base_model: BAAI/bge-base-en-v1.5
datasets: []
language:
- my
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:389
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: Tukang kayu adalah individu yang bekerja dengan kayu untuk membina
    atau membaiki struktur dan perabot.
  sentences:
  - Apakah itu pakar latihan?
  - Apakah itu tukang kayu?
  - Apakah itu pakar mikrobiologi?
- source_sentence: Pakar pemakanan adalah profesional yang memberi nasihat mengenai
    pemakanan dan diet untuk meningkatkan kesihatan.
  sentences:
  - Apakah itu penulis kreatif?
  - Apakah itu ahli geologi marin?
  - Apakah itu pakar pemakanan?
- source_sentence: Dokter adalah profesional medis yang mendiagnosis dan merawat penyakit
    serta cedera pasien.
  sentences:
  - Apa itu dokter?
  - Apakah itu pengurus kargo?
  - Apakah itu pakar teknologi nano?
- source_sentence: Juruteknik pembinaan kapal adalah individu yang terlibat dalam
    proses pembinaan dan pembaikan kapal, memastikan struktur dan sistem kapal dibina
    mengikut spesifikasi.
  sentences:
  - Apakah itu juruteknik pembinaan kapal?
  - Apakah itu pengurus projek IT?
  - Apakah itu pakar perkapalan?
- source_sentence: Penyelaras kempen iklan adalah individu yang menyelaraskan semua
    aspek kempen iklan, termasuk jadual, pelaksanaan, dan laporan prestasi.
  sentences:
  - Apakah itu jurutera sistem propulsi?
  - Apakah itu pembuat roti?
  - Apakah itu penyelaras kempen iklan?
model-index:
- name: BGE base Financial Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.8226221079691517
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9768637532133676
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.987146529562982
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9974293059125964
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8226221079691517
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.32562125107112255
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1974293059125964
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09974293059125963
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8226221079691517
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9768637532133676
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.987146529562982
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9974293059125964
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9255252859780915
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9009670706328802
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9011023703216912
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.8046272493573264
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.974293059125964
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.987146529562982
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9922879177377892
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8046272493573264
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.324764353041988
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1974293059125964
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0992287917737789
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8046272493573264
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.974293059125964
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.987146529562982
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9922879177377892
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9158947182791948
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8895519647447668
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8900397092700132
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.7892030848329049
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9665809768637532
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.974293059125964
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.987146529562982
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7892030848329049
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3221936589545844
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19485861182519276
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0987146529562982
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.7892030848329049
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9665809768637532
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.974293059125964
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.987146529562982
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9046037741833534
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8764455053658137
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8770676096874822
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.7480719794344473
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9408740359897172
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9537275064267352
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9691516709511568
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7480719794344473
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.31362467866323906
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.190745501285347
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09691516709511568
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.7480719794344473
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9408740359897172
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9537275064267352
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9691516709511568
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8765083941585068
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8449820459460564
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8461326502118156
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.7223650385604113
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.897172236503856
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9254498714652957
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9434447300771208
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7223650385604113
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29905741216795206
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.18508997429305912
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09434447300771207
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.7223650385604113
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.897172236503856
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9254498714652957
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9434447300771208
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8455216956566762
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8126851511812953
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8145628077638951
      name: Cosine Map@100
---

# BGE base Financial Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** my
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
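The two post-transformer modules are simple to reproduce by hand. The following is an illustrative NumPy sketch (not the library's actual implementation) of what CLS pooling plus `Normalize()` do to the transformer's token embeddings, using random data in place of real BERT output:

```python
import numpy as np

# Toy stand-in for the transformer output: batch of 2 sequences,
# 4 tokens each, hidden size 8 (the real model uses 768).
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 8))

# (1) Pooling with pooling_mode_cls_token=True keeps only token 0,
# the [CLS] token, as the sentence embedding.
sentence_embeddings = token_embeddings[:, 0, :]

# (2) Normalize() rescales each vector to unit L2 norm, so that a plain
# dot product between two embeddings equals their cosine similarity.
sentence_embeddings = sentence_embeddings / np.linalg.norm(
    sentence_embeddings, axis=1, keepdims=True
)

print(sentence_embeddings.shape)  # (2, 8)
```

Because the output vectors are unit-normalized, cosine similarity and dot-product similarity coincide for this model.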

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IlhamEbdesk/bge-base-financial-matryoshka_test_my")
# Run inference
sentences = [
    'Penyelaras kempen iklan adalah individu yang menyelaraskan semua aspek kempen iklan, termasuk jadual, pelaksanaan, dan laporan prestasi.',
    'Apakah itu penyelaras kempen iklan?',
    'Apakah itu pembuat roti?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
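Because the model was trained with MatryoshkaLoss, the leading dimensions of each embedding are meaningful on their own: you can keep only the first 512/256/128/64 values and renormalize, trading a little accuracy (see the evaluation tables below each dimension) for smaller vectors. A minimal sketch with dummy arrays standing in for `model.encode(...)` output (recent sentence-transformers versions can also do this for you via the `truncate_dim` argument, which this sketch assumes you are not using):

```python
import numpy as np

def truncate_and_renormalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` Matryoshka dimensions and rescale to unit norm."""
    truncated = embeddings[:, :dim]
    return truncated / np.linalg.norm(truncated, axis=1, keepdims=True)

# Dummy unit-norm "embeddings" standing in for model.encode(...) output.
rng = np.random.default_rng(42)
full = rng.normal(size=(3, 768))
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = truncate_and_renormalize(full, 256)
print(small.shape)         # (3, 256)

# On unit-norm vectors, cosine similarity is a plain dot product.
similarities = small @ small.T
print(similarities.shape)  # (3, 3)
```

The helper name `truncate_and_renormalize` is hypothetical; the renormalization step matters because the truncated prefix of a unit vector is no longer unit-length.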

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8226     |
| cosine_accuracy@3   | 0.9769     |
| cosine_accuracy@5   | 0.9871     |
| cosine_accuracy@10  | 0.9974     |
| cosine_precision@1  | 0.8226     |
| cosine_precision@3  | 0.3256     |
| cosine_precision@5  | 0.1974     |
| cosine_precision@10 | 0.0997     |
| cosine_recall@1     | 0.8226     |
| cosine_recall@3     | 0.9769     |
| cosine_recall@5     | 0.9871     |
| cosine_recall@10    | 0.9974     |
| cosine_ndcg@10      | 0.9255     |
| cosine_mrr@10       | 0.901      |
| **cosine_map@100**  | **0.9011** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value    |
|:--------------------|:---------|
| cosine_accuracy@1   | 0.8046   |
| cosine_accuracy@3   | 0.9743   |
| cosine_accuracy@5   | 0.9871   |
| cosine_accuracy@10  | 0.9923   |
| cosine_precision@1  | 0.8046   |
| cosine_precision@3  | 0.3248   |
| cosine_precision@5  | 0.1974   |
| cosine_precision@10 | 0.0992   |
| cosine_recall@1     | 0.8046   |
| cosine_recall@3     | 0.9743   |
| cosine_recall@5     | 0.9871   |
| cosine_recall@10    | 0.9923   |
| cosine_ndcg@10      | 0.9159   |
| cosine_mrr@10       | 0.8896   |
| **cosine_map@100**  | **0.89** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7892     |
| cosine_accuracy@3   | 0.9666     |
| cosine_accuracy@5   | 0.9743     |
| cosine_accuracy@10  | 0.9871     |
| cosine_precision@1  | 0.7892     |
| cosine_precision@3  | 0.3222     |
| cosine_precision@5  | 0.1949     |
| cosine_precision@10 | 0.0987     |
| cosine_recall@1     | 0.7892     |
| cosine_recall@3     | 0.9666     |
| cosine_recall@5     | 0.9743     |
| cosine_recall@10    | 0.9871     |
| cosine_ndcg@10      | 0.9046     |
| cosine_mrr@10       | 0.8764     |
| **cosine_map@100**  | **0.8771** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7481     |
| cosine_accuracy@3   | 0.9409     |
| cosine_accuracy@5   | 0.9537     |
| cosine_accuracy@10  | 0.9692     |
| cosine_precision@1  | 0.7481     |
| cosine_precision@3  | 0.3136     |
| cosine_precision@5  | 0.1907     |
| cosine_precision@10 | 0.0969     |
| cosine_recall@1     | 0.7481     |
| cosine_recall@3     | 0.9409     |
| cosine_recall@5     | 0.9537     |
| cosine_recall@10    | 0.9692     |
| cosine_ndcg@10      | 0.8765     |
| cosine_mrr@10       | 0.845      |
| **cosine_map@100**  | **0.8461** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7224     |
| cosine_accuracy@3   | 0.8972     |
| cosine_accuracy@5   | 0.9254     |
| cosine_accuracy@10  | 0.9434     |
| cosine_precision@1  | 0.7224     |
| cosine_precision@3  | 0.2991     |
| cosine_precision@5  | 0.1851     |
| cosine_precision@10 | 0.0943     |
| cosine_recall@1     | 0.7224     |
| cosine_recall@3     | 0.8972     |
| cosine_recall@5     | 0.9254     |
| cosine_recall@10    | 0.9434     |
| cosine_ndcg@10      | 0.8455     |
| cosine_mrr@10       | 0.8127     |
| **cosine_map@100**  | **0.8146** |
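For intuition about the @k metrics in these tables, here is a small self-contained sketch (toy data, hypothetical helper names) of how accuracy@k and MRR@k are computed from ranked 0/1 relevance judgments. Note that when each query has exactly one relevant document, as in this evaluation, accuracy@k and recall@k coincide, which is why the table rows match:

```python
import numpy as np

def mrr_at_k(ranked_relevance, k=10):
    """Mean Reciprocal Rank: ranked_relevance[i] lists 0/1 relevance flags
    for query i's results, best-ranked first; score is 1/rank of first hit."""
    scores = []
    for flags in ranked_relevance:
        rank = next((i + 1 for i, r in enumerate(flags[:k]) if r), None)
        scores.append(1.0 / rank if rank else 0.0)
    return float(np.mean(scores))

def accuracy_at_k(ranked_relevance, k):
    """Fraction of queries with at least one relevant hit in the top k."""
    return float(np.mean([any(flags[:k]) for flags in ranked_relevance]))

# Three toy queries: relevant doc at rank 1, rank 3, and rank 2.
toy = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(accuracy_at_k(toy, 1))  # 1/3 — only the first query hits at rank 1
print(accuracy_at_k(toy, 3))  # 1.0 — every query hits within the top 3
print(mrr_at_k(toy))          # (1 + 1/3 + 1/2) / 3
```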

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 389 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 1000 samples:
  |         | positive                                                                            | anchor                                                                            |
  |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                               | string                                                                             |
  | details | <ul><li>min: 27 tokens</li><li>mean: 61.59 tokens</li><li>max: 139 tokens</li></ul>  | <ul><li>min: 8 tokens</li><li>mean: 15.26 tokens</li><li>max: 24 tokens</li></ul>  |
* Samples:
  | positive                                                                                                                                                                                                                                                  | anchor                                            |
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------|
  | <code>Dokter adalah profesional medis yang mendiagnosis dan merawat penyakit serta cedera pasien.</code>                                                                                                                                                   | <code>Apa itu dokter?</code>                       |
  | <code>Pereka sistem akuakultur adalah individu yang merancang dan membangunkan sistem untuk membiakkan ikan secara berkesan, termasuk reka bentuk kolam, sistem aliran air, dan pemantauan kualiti air.</code>                                              | <code>Apakah itu pereka sistem akuakultur?</code>  |
  | <code>Ahli sejarah seni adalah individu yang mengkaji perkembangan seni sepanjang sejarah dan konteks sosial, politik, dan budaya yang mempengaruhi penciptaannya. Mereka bekerja di muzium, galeri, dan institusi akademik, menganalisis karya seni</code> | <code>Apakah itu ahli sejarah seni?</code>         |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
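Conceptually, MatryoshkaLoss applies the inner loss (here MultipleNegativesRankingLoss, an in-batch softmax cross-entropy over scaled cosine scores) to each truncated prefix of the embeddings and sums the weighted results. The sketch below is a toy NumPy rendition of that idea under those assumptions, not the library's implementation; function names are illustrative and the data is random:

```python
import numpy as np

def mnrl(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Multiple-negatives ranking loss: softmax cross-entropy where each
    anchor's own positive is the target and the other in-batch positives
    serve as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                   # (batch, batch) cosine scores
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))   # targets sit on the diagonal

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64), weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the inner loss over truncated embedding prefixes."""
    return sum(w * mnrl(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 768))
positives = rng.normal(size=(4, 768))
loss = matryoshka_loss(anchors, positives)
print(loss)  # a single scalar combining all five dimensions
```

With `n_dims_per_step: -1` every configured dimension contributes on every step, which is what the plain sum above models.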

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `tf32`: False
- `load_best_model_at_end`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: False
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch      | Step  | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:----------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 1.0        | 1     | 0.6375                 | 0.7065                 | 0.7339                 | 0.5984                | 0.7483                 |
| 2.0        | 3     | 0.8282                 | 0.8712                 | 0.8821                 | 0.7994                | 0.8929                 |
| **2.4615** | **4** | **0.8461**             | **0.8771**             | **0.89**               | **0.8146**            | **0.9011**             |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.32.1
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
{
  "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.41.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.41.2",
    "pytorch": "2.1.2+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6ac79a6d4082c28428fc9051f629ab0b6a9d4cf3c851d0e1e16340a9a781ce4f
size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff