tomaarsen (HF staff) committed on
Commit
6356b5e
1 Parent(s): f32d520

Update README.md

Files changed (1)
  1. README.md +980 -978
README.md CHANGED
@@ -1,979 +1,981 @@
1
- ---
2
- datasets:
3
- - sentence-transformers/gooaq
4
- language:
5
- - en
6
- library_name: sentence-transformers
7
- license: apache-2.0
8
- metrics:
9
- - cosine_accuracy@1
10
- - cosine_accuracy@3
11
- - cosine_accuracy@5
12
- - cosine_accuracy@10
13
- - cosine_precision@1
14
- - cosine_precision@3
15
- - cosine_precision@5
16
- - cosine_precision@10
17
- - cosine_recall@1
18
- - cosine_recall@3
19
- - cosine_recall@5
20
- - cosine_recall@10
21
- - cosine_ndcg@10
22
- - cosine_mrr@10
23
- - cosine_map@100
24
- pipeline_tag: sentence-similarity
25
- tags:
26
- - sentence-transformers
27
- - sentence-similarity
28
- - feature-extraction
29
- - generated_from_trainer
30
- - dataset_size:3012496
31
- - loss:MatryoshkaLoss
32
- - loss:MultipleNegativesRankingLoss
33
- widget:
34
- - source_sentence: how to sign legal documents as power of attorney?
35
- sentences:
36
- - 'After the principal''s name, write “by” and then sign your own name. Under or
37
- after the signature line, indicate your status as POA by including any of the
38
- following identifiers: as POA, as Agent, as Attorney in Fact or as Power of Attorney.'
39
- - '[''From the Home screen, swipe left to Apps.'', ''Tap Transfer my Data.'', ''Tap
40
- Menu (...).'', ''Tap Export to SD card.'']'
41
- - Ginger Dank Nugs (Grape) - 350mg. Feast your eyes on these unique and striking
42
- gourmet chocolates; Coco Nugs created by Ginger Dank. Crafted to resemble perfect
43
- nugs of cannabis, each of the 10 buds contains 35mg of THC. ... This is a perfect
44
- product for both cannabis and chocolate lovers, who appreciate a little twist.
45
- - source_sentence: how to delete vdom in fortigate?
46
- sentences:
47
- - Go to System -> VDOM -> VDOM2 and select 'Delete'. This VDOM is now successfully
48
- removed from the configuration.
49
- - 'Both combination birth control pills and progestin-only pills may cause headaches
50
- as a side effect. Additional side effects of birth control pills may include:
51
- breast tenderness. nausea.'
52
- - White cheese tends to show imperfections more readily and as consumers got more
53
- used to yellow-orange cheese, it became an expected option. Today, many cheddars
54
- are yellow. While most cheesemakers use annatto, some use an artificial coloring
55
- agent instead, according to Sachs.
56
- - source_sentence: where are earthquakes most likely to occur on earth?
57
- sentences:
58
- - Zelle in the Bank of the America app is a fast, safe, and easy way to send and
59
- receive money with family and friends who have a bank account in the U.S., all
60
- with no fees. Money moves in minutes directly between accounts that are already
61
- enrolled with Zelle.
62
- - It takes about 3 days for a spacecraft to reach the Moon. During that time a spacecraft
63
- travels at least 240,000 miles (386,400 kilometers) which is the distance between
64
- Earth and the Moon.
65
- - Most earthquakes occur along the edge of the oceanic and continental plates. The
66
- earth's crust (the outer layer of the planet) is made up of several pieces, called
67
- plates. The plates under the oceans are called oceanic plates and the rest are
68
- continental plates.
69
- - source_sentence: fix iphone is disabled connect to itunes without itunes?
70
- sentences:
71
- - To fix a disabled iPhone or iPad without iTunes, you have to erase your device.
72
- Click on the "Erase iPhone" option and confirm your selection. Wait for a while
73
- as the "Find My iPhone" feature will remotely erase your iOS device. Needless
74
- to say, it will also disable its lock.
75
- - How Māui brought fire to the world. One evening, after eating a hearty meal, Māui
76
- lay beside his fire staring into the flames. ... In the middle of the night, while
77
- everyone was sleeping, Māui went from village to village and extinguished all
78
- the fires until not a single fire burned in the world.
79
- - Angry Orchard makes a variety of year-round craft cider styles, including Angry
80
- Orchard Crisp Apple, a fruit-forward hard cider that balances the sweetness of
81
- culinary apples with dryness and bright acidity of bittersweet apples for a complex,
82
- refreshing taste.
83
- - source_sentence: how to reverse a video on tiktok that's not yours?
84
- sentences:
85
- - '[''Tap "Effects" at the bottom of your screen — it\''s an icon that looks like
86
- a clock. Open the Effects menu. ... '', ''At the end of the new list that appears,
87
- tap "Time." Select "Time" at the end. ... '', ''Select "Reverse" — you\''ll then
88
- see a preview of your new, reversed video appear on the screen.'']'
89
- - Franchise Facts Poke Bar has a franchise fee of up to $30,000, with a total initial
90
- investment range of $157,800 to $438,000. The initial cost of a franchise includes
91
- several fees -- Unlock this franchise to better understand the costs such as training
92
- and territory fees.
93
- - Relative age is the age of a rock layer (or the fossils it contains) compared
94
- to other layers. It can be determined by looking at the position of rock layers.
95
- Absolute age is the numeric age of a layer of rocks or fossils. Absolute age can
96
- be determined by using radiometric dating.
97
- co2_eq_emissions:
98
- emissions: 6.448001991119035
99
- energy_consumed: 0.0165885485310573
100
- source: codecarbon
101
- training_type: fine-tuning
102
- on_cloud: false
103
- cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
104
- ram_total_size: 31.777088165283203
105
- hours_used: 0.109
106
- hardware_used: 1 x NVIDIA GeForce RTX 3090
107
- model-index:
108
- - name: Static Embeddings with BERT uncased tokenizer finetuned on GooAQ pairs
109
- results:
110
- - task:
111
- type: information-retrieval
112
- name: Information Retrieval
113
- dataset:
114
- name: gooaq 1024 dev
115
- type: gooaq-1024-dev
116
- metrics:
117
- - type: cosine_accuracy@1
118
- value: 0.6309
119
- name: Cosine Accuracy@1
120
- - type: cosine_accuracy@3
121
- value: 0.8409
122
- name: Cosine Accuracy@3
123
- - type: cosine_accuracy@5
124
- value: 0.8986
125
- name: Cosine Accuracy@5
126
- - type: cosine_accuracy@10
127
- value: 0.9444
128
- name: Cosine Accuracy@10
129
- - type: cosine_precision@1
130
- value: 0.6309
131
- name: Cosine Precision@1
132
- - type: cosine_precision@3
133
- value: 0.28029999999999994
134
- name: Cosine Precision@3
135
- - type: cosine_precision@5
136
- value: 0.17972000000000002
137
- name: Cosine Precision@5
138
- - type: cosine_precision@10
139
- value: 0.09444000000000002
140
- name: Cosine Precision@10
141
- - type: cosine_recall@1
142
- value: 0.6309
143
- name: Cosine Recall@1
144
- - type: cosine_recall@3
145
- value: 0.8409
146
- name: Cosine Recall@3
147
- - type: cosine_recall@5
148
- value: 0.8986
149
- name: Cosine Recall@5
150
- - type: cosine_recall@10
151
- value: 0.9444
152
- name: Cosine Recall@10
153
- - type: cosine_ndcg@10
154
- value: 0.7932643237589305
155
- name: Cosine Ndcg@10
156
- - type: cosine_mrr@10
157
- value: 0.7440336111111036
158
- name: Cosine Mrr@10
159
- - type: cosine_map@100
160
- value: 0.7465739001132767
161
- name: Cosine Map@100
162
- - task:
163
- type: information-retrieval
164
- name: Information Retrieval
165
- dataset:
166
- name: gooaq 512 dev
167
- type: gooaq-512-dev
168
- metrics:
169
- - type: cosine_accuracy@1
170
- value: 0.6271
171
- name: Cosine Accuracy@1
172
- - type: cosine_accuracy@3
173
- value: 0.8366
174
- name: Cosine Accuracy@3
175
- - type: cosine_accuracy@5
176
- value: 0.8946
177
- name: Cosine Accuracy@5
178
- - type: cosine_accuracy@10
179
- value: 0.9431
180
- name: Cosine Accuracy@10
181
- - type: cosine_precision@1
182
- value: 0.6271
183
- name: Cosine Precision@1
184
- - type: cosine_precision@3
185
- value: 0.27886666666666665
186
- name: Cosine Precision@3
187
- - type: cosine_precision@5
188
- value: 0.17892000000000002
189
- name: Cosine Precision@5
190
- - type: cosine_precision@10
191
- value: 0.09431000000000002
192
- name: Cosine Precision@10
193
- - type: cosine_recall@1
194
- value: 0.6271
195
- name: Cosine Recall@1
196
- - type: cosine_recall@3
197
- value: 0.8366
198
- name: Cosine Recall@3
199
- - type: cosine_recall@5
200
- value: 0.8946
201
- name: Cosine Recall@5
202
- - type: cosine_recall@10
203
- value: 0.9431
204
- name: Cosine Recall@10
205
- - type: cosine_ndcg@10
206
- value: 0.7904860196985286
207
- name: Cosine Ndcg@10
208
- - type: cosine_mrr@10
209
- value: 0.7408453174603101
210
- name: Cosine Mrr@10
211
- - type: cosine_map@100
212
- value: 0.7434337897783787
213
- name: Cosine Map@100
214
- - task:
215
- type: information-retrieval
216
- name: Information Retrieval
217
- dataset:
218
- name: gooaq 256 dev
219
- type: gooaq-256-dev
220
- metrics:
221
- - type: cosine_accuracy@1
222
- value: 0.6192
223
- name: Cosine Accuracy@1
224
- - type: cosine_accuracy@3
225
- value: 0.8235
226
- name: Cosine Accuracy@3
227
- - type: cosine_accuracy@5
228
- value: 0.8866
229
- name: Cosine Accuracy@5
230
- - type: cosine_accuracy@10
231
- value: 0.9364
232
- name: Cosine Accuracy@10
233
- - type: cosine_precision@1
234
- value: 0.6192
235
- name: Cosine Precision@1
236
- - type: cosine_precision@3
237
- value: 0.27449999999999997
238
- name: Cosine Precision@3
239
- - type: cosine_precision@5
240
- value: 0.17732000000000003
241
- name: Cosine Precision@5
242
- - type: cosine_precision@10
243
- value: 0.09364000000000001
244
- name: Cosine Precision@10
245
- - type: cosine_recall@1
246
- value: 0.6192
247
- name: Cosine Recall@1
248
- - type: cosine_recall@3
249
- value: 0.8235
250
- name: Cosine Recall@3
251
- - type: cosine_recall@5
252
- value: 0.8866
253
- name: Cosine Recall@5
254
- - type: cosine_recall@10
255
- value: 0.9364
256
- name: Cosine Recall@10
257
- - type: cosine_ndcg@10
258
- value: 0.7821476540310974
259
- name: Cosine Ndcg@10
260
- - type: cosine_mrr@10
261
- value: 0.7321259126984055
262
- name: Cosine Mrr@10
263
- - type: cosine_map@100
264
- value: 0.7348893313013708
265
- name: Cosine Map@100
266
- - task:
267
- type: information-retrieval
268
- name: Information Retrieval
269
- dataset:
270
- name: gooaq 128 dev
271
- type: gooaq-128-dev
272
- metrics:
273
- - type: cosine_accuracy@1
274
- value: 0.5942
275
- name: Cosine Accuracy@1
276
- - type: cosine_accuracy@3
277
- value: 0.804
278
- name: Cosine Accuracy@3
279
- - type: cosine_accuracy@5
280
- value: 0.8721
281
- name: Cosine Accuracy@5
282
- - type: cosine_accuracy@10
283
- value: 0.9249
284
- name: Cosine Accuracy@10
285
- - type: cosine_precision@1
286
- value: 0.5942
287
- name: Cosine Precision@1
288
- - type: cosine_precision@3
289
- value: 0.268
290
- name: Cosine Precision@3
291
- - type: cosine_precision@5
292
- value: 0.17442000000000002
293
- name: Cosine Precision@5
294
- - type: cosine_precision@10
295
- value: 0.09249
296
- name: Cosine Precision@10
297
- - type: cosine_recall@1
298
- value: 0.5942
299
- name: Cosine Recall@1
300
- - type: cosine_recall@3
301
- value: 0.804
302
- name: Cosine Recall@3
303
- - type: cosine_recall@5
304
- value: 0.8721
305
- name: Cosine Recall@5
306
- - type: cosine_recall@10
307
- value: 0.9249
308
- name: Cosine Recall@10
309
- - type: cosine_ndcg@10
310
- value: 0.7627845665665897
311
- name: Cosine Ndcg@10
312
- - type: cosine_mrr@10
313
- value: 0.7103426587301529
314
- name: Cosine Mrr@10
315
- - type: cosine_map@100
316
- value: 0.7133975871277517
317
- name: Cosine Map@100
318
- - task:
319
- type: information-retrieval
320
- name: Information Retrieval
321
- dataset:
322
- name: gooaq 64 dev
323
- type: gooaq-64-dev
324
- metrics:
325
- - type: cosine_accuracy@1
326
- value: 0.556
327
- name: Cosine Accuracy@1
328
- - type: cosine_accuracy@3
329
- value: 0.7553
330
- name: Cosine Accuracy@3
331
- - type: cosine_accuracy@5
332
- value: 0.8267
333
- name: Cosine Accuracy@5
334
- - type: cosine_accuracy@10
335
- value: 0.8945
336
- name: Cosine Accuracy@10
337
- - type: cosine_precision@1
338
- value: 0.556
339
- name: Cosine Precision@1
340
- - type: cosine_precision@3
341
- value: 0.25176666666666664
342
- name: Cosine Precision@3
343
- - type: cosine_precision@5
344
- value: 0.16534000000000001
345
- name: Cosine Precision@5
346
- - type: cosine_precision@10
347
- value: 0.08945
348
- name: Cosine Precision@10
349
- - type: cosine_recall@1
350
- value: 0.556
351
- name: Cosine Recall@1
352
- - type: cosine_recall@3
353
- value: 0.7553
354
- name: Cosine Recall@3
355
- - type: cosine_recall@5
356
- value: 0.8267
357
- name: Cosine Recall@5
358
- - type: cosine_recall@10
359
- value: 0.8945
360
- name: Cosine Recall@10
361
- - type: cosine_ndcg@10
362
- value: 0.7246435400765202
363
- name: Cosine Ndcg@10
364
- - type: cosine_mrr@10
365
- value: 0.6701957142857087
366
- name: Cosine Mrr@10
367
- - type: cosine_map@100
368
- value: 0.6743443703166442
369
- name: Cosine Map@100
370
- - task:
371
- type: information-retrieval
372
- name: Information Retrieval
373
- dataset:
374
- name: gooaq 32 dev
375
- type: gooaq-32-dev
376
- metrics:
377
- - type: cosine_accuracy@1
378
- value: 0.4628
379
- name: Cosine Accuracy@1
380
- - type: cosine_accuracy@3
381
- value: 0.6619
382
- name: Cosine Accuracy@3
383
- - type: cosine_accuracy@5
384
- value: 0.7415
385
- name: Cosine Accuracy@5
386
- - type: cosine_accuracy@10
387
- value: 0.8241
388
- name: Cosine Accuracy@10
389
- - type: cosine_precision@1
390
- value: 0.4628
391
- name: Cosine Precision@1
392
- - type: cosine_precision@3
393
- value: 0.2206333333333333
394
- name: Cosine Precision@3
395
- - type: cosine_precision@5
396
- value: 0.1483
397
- name: Cosine Precision@5
398
- - type: cosine_precision@10
399
- value: 0.08241
400
- name: Cosine Precision@10
401
- - type: cosine_recall@1
402
- value: 0.4628
403
- name: Cosine Recall@1
404
- - type: cosine_recall@3
405
- value: 0.6619
406
- name: Cosine Recall@3
407
- - type: cosine_recall@5
408
- value: 0.7415
409
- name: Cosine Recall@5
410
- - type: cosine_recall@10
411
- value: 0.8241
412
- name: Cosine Recall@10
413
- - type: cosine_ndcg@10
414
- value: 0.6387155548290799
415
- name: Cosine Ndcg@10
416
- - type: cosine_mrr@10
417
- value: 0.5797731349206319
418
- name: Cosine Mrr@10
419
- - type: cosine_map@100
420
- value: 0.5857231820662888
421
- name: Cosine Map@100
422
- ---
423
-
424
- # Static Embeddings with BERT uncased tokenizer finetuned on GooAQ pairs
425
-
426
- This is a [sentence-transformers](https://www.SBERT.net) model trained on the [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
427
-
428
- ## Model Details
429
-
430
- ### Model Description
431
- - **Model Type:** Sentence Transformer
432
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
433
- - **Maximum Sequence Length:** inf tokens
434
- - **Output Dimensionality:** 1024 tokens
435
- - **Similarity Function:** Cosine Similarity
436
- - **Training Dataset:**
437
- - [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
438
- - **Language:** en
439
- - **License:** apache-2.0
440
-
441
- ### Model Sources
442
-
443
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
444
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
445
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
446
-
447
- ### Full Model Architecture
448
-
449
- ```
450
- SentenceTransformer(
451
- (0): StaticEmbedding(
452
- (embedding): EmbeddingBag(30522, 1024, mode='mean')
453
- )
454
- )
455
- ```
456
-
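The architecture above is a single `EmbeddingBag` in `mean` mode: a sentence embedding is the average of its token embeddings, with no attention or positional information. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The tiny table size, the random values, and the token ids are all hypothetical stand-ins for the real 30522 x 1024 matrix and the BERT uncased tokenizer.

```python
import random


def make_embedding_table(vocab_size, dim, seed=0):
    """Toy stand-in for the learned (vocab_size x dim) embedding matrix."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(token_ids, table):
    """Average the rows selected by token_ids, as mode='mean' does per bag."""
    if not token_ids:
        raise ValueError("need at least one token id")
    dim = len(table[0])
    total = [0.0] * dim
    for tid in token_ids:
        row = table[tid]
        for j in range(dim):
            total[j] += row[j]
    return [x / len(token_ids) for x in total]


# Hypothetical sizes; the model card reports 30522 tokens x 1024 dimensions.
table = make_embedding_table(vocab_size=10, dim=4)

sentence_a = [3, 7, 7, 1]  # hypothetical token ids
sentence_b = [7, 1, 3, 7]  # same tokens in a different order

vec_a = embed(sentence_a, table)
vec_b = embed(sentence_b, table)
print(vec_a)
print(vec_b)
```

Because the pooling is a plain mean, the two sentences above map to the same vector up to floating-point rounding: word order does not affect the result.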
457
- ## Usage
458
-
459
- ### Direct Usage (Sentence Transformers)
460
-
461
- First install the Sentence Transformers library:
462
-
463
- ```bash
464
- pip install -U sentence-transformers
465
- ```
466
-
467
- Then you can load this model and run inference.
468
- ```python
469
- from sentence_transformers import SentenceTransformer
470
-
471
- # Download from the 🤗 Hub
472
- model = SentenceTransformer("tomaarsen/static-bert-uncased-gooaq")
473
- # Run inference
474
- sentences = [
475
- "how to reverse a video on tiktok that's not yours?",
476
- '[\'Tap "Effects" at the bottom of your screen — it\\\'s an icon that looks like a clock. Open the Effects menu. ... \', \'At the end of the new list that appears, tap "Time." Select "Time" at the end. ... \', \'Select "Reverse" — you\\\'ll then see a preview of your new, reversed video appear on the screen.\']',
477
- 'Relative age is the age of a rock layer (or the fossils it contains) compared to other layers. It can be determined by looking at the position of rock layers. Absolute age is the numeric age of a layer of rocks or fossils. Absolute age can be determined by using radiometric dating.',
478
- ]
479
- embeddings = model.encode(sentences)
480
- print(embeddings.shape)
481
- # [3, 1024]
482
-
483
- # Get the similarity scores for the embeddings
484
- similarities = model.similarity(embeddings, embeddings)
485
- print(similarities.shape)
486
- # [3, 3]
487
- ```
488
-
489
- <!--
490
- ### Direct Usage (Transformers)
491
-
492
- <details><summary>Click to see the direct usage in Transformers</summary>
493
-
494
- </details>
495
- -->
496
-
497
- <!--
498
- ### Downstream Usage (Sentence Transformers)
499
-
500
- You can finetune this model on your own dataset.
501
-
502
- <details><summary>Click to expand</summary>
503
-
504
- </details>
505
- -->
506
-
507
- <!--
508
- ### Out-of-Scope Use
509
-
510
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
511
- -->
512
-
513
- ## Evaluation
514
-
515
- ### Metrics
516
-
517
- #### Information Retrieval
518
- * Dataset: `gooaq-1024-dev`
519
- * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
520
-
521
- | Metric | Value |
522
- |:--------------------|:-----------|
523
- | cosine_accuracy@1 | 0.6309 |
524
- | cosine_accuracy@3 | 0.8409 |
525
- | cosine_accuracy@5 | 0.8986 |
526
- | cosine_accuracy@10 | 0.9444 |
527
- | cosine_precision@1 | 0.6309 |
528
- | cosine_precision@3 | 0.2803 |
529
- | cosine_precision@5 | 0.1797 |
530
- | cosine_precision@10 | 0.0944 |
531
- | cosine_recall@1 | 0.6309 |
532
- | cosine_recall@3 | 0.8409 |
533
- | cosine_recall@5 | 0.8986 |
534
- | cosine_recall@10 | 0.9444 |
535
- | cosine_ndcg@10 | 0.7933 |
536
- | cosine_mrr@10 | 0.744 |
537
- | **cosine_map@100** | **0.7466** |
538
-
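In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example, 0.8409 / 3 ≈ 0.2803). That pattern is what one would expect if every query has exactly one relevant answer. The sketch below computes the metrics under that assumption. The `ranks` list is hypothetical, and this is not the `InformationRetrievalEvaluator` code.

```python
def retrieval_metrics(ranks, k):
    """Compute top-k metrics when each query has exactly one relevant answer.

    ranks[i] is the 1-based rank of the relevant answer for query i,
    or None if it was not retrieved at all.
    """
    n = len(ranks)
    hit_ranks = [r for r in ranks if r is not None and r <= k]
    hits = len(hit_ranks)
    return {
        "accuracy": hits / n,         # queries with the answer in the top k
        "precision": hits / (n * k),  # one relevant item among k retrieved slots
        "recall": hits / n,           # single relevant item, so same as accuracy
        "mrr": sum(1.0 / r for r in hit_ranks) / n,
    }


# Hypothetical ranks of the correct answer for eight queries.
ranks = [1, 2, 1, 4, None, 1, 3, 11]
m3 = retrieval_metrics(ranks, k=3)
print(m3)
```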
539
- #### Information Retrieval
540
- * Dataset: `gooaq-512-dev`
541
- * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
542
-
543
- | Metric | Value |
544
- |:--------------------|:-----------|
545
- | cosine_accuracy@1 | 0.6271 |
546
- | cosine_accuracy@3 | 0.8366 |
547
- | cosine_accuracy@5 | 0.8946 |
548
- | cosine_accuracy@10 | 0.9431 |
549
- | cosine_precision@1 | 0.6271 |
550
- | cosine_precision@3 | 0.2789 |
551
- | cosine_precision@5 | 0.1789 |
552
- | cosine_precision@10 | 0.0943 |
553
- | cosine_recall@1 | 0.6271 |
554
- | cosine_recall@3 | 0.8366 |
555
- | cosine_recall@5 | 0.8946 |
556
- | cosine_recall@10 | 0.9431 |
557
- | cosine_ndcg@10 | 0.7905 |
558
- | cosine_mrr@10 | 0.7408 |
559
- | **cosine_map@100** | **0.7434** |
560
-
561
- #### Information Retrieval
562
- * Dataset: `gooaq-256-dev`
563
- * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
564
-
565
- | Metric | Value |
566
- |:--------------------|:-----------|
567
- | cosine_accuracy@1 | 0.6192 |
568
- | cosine_accuracy@3 | 0.8235 |
569
- | cosine_accuracy@5 | 0.8866 |
570
- | cosine_accuracy@10 | 0.9364 |
571
- | cosine_precision@1 | 0.6192 |
572
- | cosine_precision@3 | 0.2745 |
573
- | cosine_precision@5 | 0.1773 |
574
- | cosine_precision@10 | 0.0936 |
575
- | cosine_recall@1 | 0.6192 |
576
- | cosine_recall@3 | 0.8235 |
577
- | cosine_recall@5 | 0.8866 |
578
- | cosine_recall@10 | 0.9364 |
579
- | cosine_ndcg@10 | 0.7821 |
580
- | cosine_mrr@10 | 0.7321 |
581
- | **cosine_map@100** | **0.7349** |
582
-
583
- #### Information Retrieval
584
- * Dataset: `gooaq-128-dev`
585
- * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
586
-
587
- | Metric | Value |
588
- |:--------------------|:-----------|
589
- | cosine_accuracy@1 | 0.5942 |
590
- | cosine_accuracy@3 | 0.804 |
591
- | cosine_accuracy@5 | 0.8721 |
592
- | cosine_accuracy@10 | 0.9249 |
593
- | cosine_precision@1 | 0.5942 |
594
- | cosine_precision@3 | 0.268 |
595
- | cosine_precision@5 | 0.1744 |
596
- | cosine_precision@10 | 0.0925 |
597
- | cosine_recall@1 | 0.5942 |
598
- | cosine_recall@3 | 0.804 |
599
- | cosine_recall@5 | 0.8721 |
600
- | cosine_recall@10 | 0.9249 |
601
- | cosine_ndcg@10 | 0.7628 |
602
- | cosine_mrr@10 | 0.7103 |
603
- | **cosine_map@100** | **0.7134** |
604
-
605
- #### Information Retrieval
606
- * Dataset: `gooaq-64-dev`
607
- * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
608
-
609
- | Metric | Value |
610
- |:--------------------|:-----------|
611
- | cosine_accuracy@1 | 0.556 |
612
- | cosine_accuracy@3 | 0.7553 |
613
- | cosine_accuracy@5 | 0.8267 |
614
- | cosine_accuracy@10 | 0.8945 |
615
- | cosine_precision@1 | 0.556 |
616
- | cosine_precision@3 | 0.2518 |
617
- | cosine_precision@5 | 0.1653 |
618
- | cosine_precision@10 | 0.0895 |
619
- | cosine_recall@1 | 0.556 |
620
- | cosine_recall@3 | 0.7553 |
621
- | cosine_recall@5 | 0.8267 |
622
- | cosine_recall@10 | 0.8945 |
623
- | cosine_ndcg@10 | 0.7246 |
624
- | cosine_mrr@10 | 0.6702 |
625
- | **cosine_map@100** | **0.6743** |
626
-
627
- #### Information Retrieval
628
- * Dataset: `gooaq-32-dev`
629
- * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
630
-
631
- | Metric | Value |
632
- |:--------------------|:-----------|
633
- | cosine_accuracy@1 | 0.4628 |
634
- | cosine_accuracy@3 | 0.6619 |
635
- | cosine_accuracy@5 | 0.7415 |
636
- | cosine_accuracy@10 | 0.8241 |
637
- | cosine_precision@1 | 0.4628 |
638
- | cosine_precision@3 | 0.2206 |
639
- | cosine_precision@5 | 0.1483 |
640
- | cosine_precision@10 | 0.0824 |
641
- | cosine_recall@1 | 0.4628 |
642
- | cosine_recall@3 | 0.6619 |
643
- | cosine_recall@5 | 0.7415 |
644
- | cosine_recall@10 | 0.8241 |
645
- | cosine_ndcg@10 | 0.6387 |
646
- | cosine_mrr@10 | 0.5798 |
647
- | **cosine_map@100** | **0.5857** |
648
-
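The six tables above report the same model evaluated at 1024, 512, 256, 128, 64 and 32 dimensions. With Matryoshka-style training, a smaller embedding is obtained by keeping the leading components of the full vector. A minimal sketch of that truncation, assuming re-normalisation before cosine similarity, looks like this. The 8-dimensional vectors are made up for illustration.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 8-dimensional "full" embeddings for a question and an answer.
full_question = [0.9, 0.1, -0.3, 0.2, 0.05, -0.4, 0.3, 0.1]
full_answer = [0.8, 0.2, -0.2, 0.1, -0.3, 0.5, -0.1, 0.2]

truncated = {}
for dim in (8, 4, 2):
    q = truncate_and_normalize(full_question, dim)
    a = truncate_and_normalize(full_answer, dim)
    truncated[dim] = (q, a)
    print(dim, cosine(q, a))
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`; check the documentation of the installed version before relying on it.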
649
- <!--
650
- ## Bias, Risks and Limitations
651
-
652
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
653
- -->
654
-
655
- <!--
656
- ### Recommendations
657
-
658
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
659
- -->
660
-
661
- ## Training Details
662
-
663
- ### Training Dataset
664
-
665
- #### gooaq
666
-
667
- * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
668
- * Size: 3,012,496 training samples
669
- * Columns: <code>question</code> and <code>answer</code>
670
- * Approximate statistics based on the first 1000 samples:
671
- | | question | answer |
672
- |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
673
- | type | string | string |
674
- | details | <ul><li>min: 18 characters</li><li>mean: 43.23 characters</li><li>max: 96 characters</li></ul> | <ul><li>min: 55 characters</li><li>mean: 253.36 characters</li><li>max: 371 characters</li></ul> |
675
- * Samples:
676
- | question | answer |
677
- |:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
678
- | <code>what is the difference between broilers and layers?</code> | <code>An egg laying poultry is called egger or layer whereas broilers are reared for obtaining meat. So a layer should be able to produce more number of large sized eggs, without growing too much. On the other hand, a broiler should yield more meat and hence should be able to grow well.</code> |
679
- | <code>what is the difference between chronological order and spatial order?</code> | <code>As a writer, you should always remember that unlike chronological order and the other organizational methods for data, spatial order does not take into account the time. Spatial order is primarily focused on the location. All it does is take into account the location of objects and not the time.</code> |
680
- | <code>is kamagra same as viagra?</code> | <code>Kamagra is thought to contain the same active ingredient as Viagra, sildenafil citrate. In theory, it should work in much the same way as Viagra, taking about 45 minutes to take effect, and lasting for around 4-6 hours. However, this will vary from person to person.</code> |
681
- * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
682
- ```json
683
- {
684
- "loss": "MultipleNegativesRankingLoss",
685
- "matryoshka_dims": [
686
- 1024,
687
- 512,
688
- 256,
689
- 128,
690
- 64,
691
- 32
692
- ],
693
- "matryoshka_weights": [
694
- 1,
695
- 1,
696
- 1,
697
- 1,
698
- 1,
699
- 1
700
- ],
701
- "n_dims_per_step": -1
702
- }
703
- ```
704
-
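The JSON above configures a `MatryoshkaLoss` that wraps `MultipleNegativesRankingLoss` and weights all six dimensions equally. As a rough pure-Python sketch of what that combination computes (not the library code): for each dimension, truncate the question and answer embeddings, score every question against every answer in the batch, and apply cross-entropy with the matching answer as the target. The `scale=20.0` factor and the use of cosine similarity are assumptions here, and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def in_batch_negatives_loss(questions, answers, scale=20.0):
    """Answer i is the positive for question i; other answers act as negatives."""
    total = 0.0
    for i, q in enumerate(questions):
        logits = [scale * cosine(q, a) for a in answers]
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with target index i
    return total / len(questions)


def matryoshka_loss(questions, answers, dims, weights):
    """Weighted sum of the base loss over truncated embeddings."""
    return sum(
        w * in_batch_negatives_loss([q[:d] for q in questions], [a[:d] for a in answers])
        for d, w in zip(dims, weights)
    )


# Hypothetical batch of three 8-dimensional embeddings.
questions = [
    [1.0, 0.0, 0.2, 0.0, 0.1, 0.0, 0.0, 0.3],
    [0.0, 1.0, 0.0, 0.2, 0.0, 0.1, 0.3, 0.0],
    [-1.0, -1.0, 0.1, 0.1, 0.2, 0.2, 0.0, 0.0],
]
answers = [list(q) for q in questions]        # perfectly aligned pairs
shuffled = answers[1:] + answers[:1]          # every pair mismatched

aligned_loss = matryoshka_loss(questions, answers, dims=(8, 4, 2), weights=(1, 1, 1))
mismatched_loss = matryoshka_loss(questions, shuffled, dims=(8, 4, 2), weights=(1, 1, 1))
print(aligned_loss, mismatched_loss)
```

The aligned batch gives the lower loss, which is the behaviour the training objective rewards at every truncation size.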
705
- ### Evaluation Dataset
706
-
707
- #### gooaq
708
-
709
- * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
710
- * Size: 3,012,496 evaluation samples
711
- * Columns: <code>question</code> and <code>answer</code>
712
- * Approximate statistics based on the first 1000 samples:
713
- | | question | answer |
714
- |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
715
- | type | string | string |
716
- | details | <ul><li>min: 18 characters</li><li>mean: 43.17 characters</li><li>max: 98 characters</li></ul> | <ul><li>min: 51 characters</li><li>mean: 254.12 characters</li><li>max: 360 characters</li></ul> |
717
- * Samples:
718
- | question | answer |
719
- |:-----------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
720
- | <code>how do i program my directv remote with my tv?</code> | <code>['Press MENU on your remote.', 'Select Settings & Help > Settings > Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete programming.']</code> |
721
- | <code>are rodrigues fruit bats nocturnal?</code> | <code>Before its numbers were threatened by habitat destruction, storms, and hunting, some of those groups could number 500 or more members. Sunrise, sunset. Rodrigues fruit bats are most active at dawn, at dusk, and at night.</code> |
722
- | <code>why does your heart rate increase during exercise bbc bitesize?</code> | <code>During exercise there is an increase in physical activity and muscle cells respire more than they do when the body is at rest. The heart rate increases during exercise. The rate and depth of breathing increases - this makes sure that more oxygen is absorbed into the blood, and more carbon dioxide is removed from it.</code> |
723
- * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
724
- ```json
725
- {
726
- "loss": "MultipleNegativesRankingLoss",
727
- "matryoshka_dims": [
728
- 1024,
729
- 512,
730
- 256,
731
- 128,
732
- 64,
733
- 32
734
- ],
735
- "matryoshka_weights": [
736
- 1,
737
- 1,
738
- 1,
739
- 1,
740
- 1,
741
- 1
742
- ],
743
- "n_dims_per_step": -1
744
- }
745
- ```
746
-
747
- ### Training Hyperparameters
748
- #### Non-Default Hyperparameters
749
-
750
- - `eval_strategy`: steps
751
- - `per_device_train_batch_size`: 2048
752
- - `per_device_eval_batch_size`: 2048
753
- - `learning_rate`: 0.2
754
- - `num_train_epochs`: 1
755
- - `warmup_ratio`: 0.1
756
- - `bf16`: True
757
- - `batch_sampler`: no_duplicates
758
-
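To make the non-default settings above concrete, the sketch below estimates the step count and the learning-rate curve they imply. It takes the reported 3,012,496 training samples at face value, and assumes a single device, one epoch, and the default linear schedule. The `no_duplicates` batch sampler may change the real number of steps, so the figures are approximate.

```python
import math


def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


train_samples = 3_012_496  # size stated in the model card
batch_size = 2048          # per_device_train_batch_size, one device assumed
peak_lr = 0.2              # learning_rate
warmup_ratio = 0.1         # warmup_ratio

total_steps = math.ceil(train_samples / batch_size)
warmup_steps = math.ceil(warmup_ratio * total_steps)

print("total steps:", total_steps)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, peak_lr))
```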
759
- #### All Hyperparameters
760
- <details><summary>Click to expand</summary>
761
-
762
- - `overwrite_output_dir`: False
763
- - `do_predict`: False
764
- - `eval_strategy`: steps
765
- - `prediction_loss_only`: True
766
- - `per_device_train_batch_size`: 2048
767
- - `per_device_eval_batch_size`: 2048
768
- - `per_gpu_train_batch_size`: None
769
- - `per_gpu_eval_batch_size`: None
770
- - `gradient_accumulation_steps`: 1
771
- - `eval_accumulation_steps`: None
772
- - `torch_empty_cache_steps`: None
773
- - `learning_rate`: 0.2
774
- - `weight_decay`: 0.0
775
- - `adam_beta1`: 0.9
776
- - `adam_beta2`: 0.999
777
- - `adam_epsilon`: 1e-08
778
- - `max_grad_norm`: 1.0
779
- - `num_train_epochs`: 1
780
- - `max_steps`: -1
781
- - `lr_scheduler_type`: linear
782
- - `lr_scheduler_kwargs`: {}
783
- - `warmup_ratio`: 0.1
784
- - `warmup_steps`: 0
785
- - `log_level`: passive
786
- - `log_level_replica`: warning
787
- - `log_on_each_node`: True
788
- - `logging_nan_inf_filter`: True
789
- - `save_safetensors`: True
790
- - `save_on_each_node`: False
791
- - `save_only_model`: False
792
- - `restore_callback_states_from_checkpoint`: False
793
- - `no_cuda`: False
794
- - `use_cpu`: False
795
- - `use_mps_device`: False
796
- - `seed`: 42
797
- - `data_seed`: None
798
- - `jit_mode_eval`: False
799
- - `use_ipex`: False
800
- - `bf16`: True
801
- - `fp16`: False
802
- - `fp16_opt_level`: O1
803
- - `half_precision_backend`: auto
804
- - `bf16_full_eval`: False
805
- - `fp16_full_eval`: False
806
- - `tf32`: None
807
- - `local_rank`: 0
808
- - `ddp_backend`: None
809
- - `tpu_num_cores`: None
810
- - `tpu_metrics_debug`: False
811
- - `debug`: []
812
- - `dataloader_drop_last`: False
813
- - `dataloader_num_workers`: 0
814
- - `dataloader_prefetch_factor`: None
815
- - `past_index`: -1
816
- - `disable_tqdm`: False
817
- - `remove_unused_columns`: True
818
- - `label_names`: None
819
- - `load_best_model_at_end`: False
820
- - `ignore_data_skip`: False
821
- - `fsdp`: []
822
- - `fsdp_min_num_params`: 0
823
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
824
- - `fsdp_transformer_layer_cls_to_wrap`: None
825
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
826
- - `deepspeed`: None
827
- - `label_smoothing_factor`: 0.0
828
- - `optim`: adamw_torch
829
- - `optim_args`: None
830
- - `adafactor`: False
831
- - `group_by_length`: False
832
- - `length_column_name`: length
833
- - `ddp_find_unused_parameters`: None
834
- - `ddp_bucket_cap_mb`: None
835
- - `ddp_broadcast_buffers`: False
836
- - `dataloader_pin_memory`: True
837
- - `dataloader_persistent_workers`: False
838
- - `skip_memory_metrics`: True
839
- - `use_legacy_prediction_loop`: False
840
- - `push_to_hub`: False
841
- - `resume_from_checkpoint`: None
842
- - `hub_model_id`: None
843
- - `hub_strategy`: every_save
844
- - `hub_private_repo`: False
845
- - `hub_always_push`: False
846
- - `gradient_checkpointing`: False
847
- - `gradient_checkpointing_kwargs`: None
848
- - `include_inputs_for_metrics`: False
849
- - `eval_do_concat_batches`: True
850
- - `fp16_backend`: auto
851
- - `push_to_hub_model_id`: None
852
- - `push_to_hub_organization`: None
853
- - `mp_parameters`:
854
- - `auto_find_batch_size`: False
855
- - `full_determinism`: False
856
- - `torchdynamo`: None
857
- - `ray_scope`: last
858
- - `ddp_timeout`: 1800
859
- - `torch_compile`: False
860
- - `torch_compile_backend`: None
861
- - `torch_compile_mode`: None
862
- - `dispatch_batches`: None
863
- - `split_batches`: None
864
- - `include_tokens_per_second`: False
865
- - `include_num_input_tokens_seen`: False
866
- - `neftune_noise_alpha`: None
867
- - `optim_target_modules`: None
868
- - `batch_eval_metrics`: False
869
- - `eval_on_start`: False
870
- - `eval_use_gather_object`: False
871
- - `batch_sampler`: no_duplicates
872
- - `multi_dataset_batch_sampler`: proportional
873
-
874
- </details>
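
The listed settings (`lr_scheduler_type`: linear, `warmup_ratio`: 0.1, `learning_rate`: 0.2) together with the 1467 steps shown in the training logs suggest a learning rate that ramps up linearly and then decays linearly to zero. The following is a minimal sketch of that shape, not the trainer's own code; the helper name `linear_schedule_lr` is hypothetical and the rounding of the warmup length is an assumption.

```python
import math


def linear_schedule_lr(step, total_steps=1467, peak_lr=0.2, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    # Assumption: warmup length is the ratio times the total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points of the run.
for step in (0, 100, 147, 800, 1467):
    print(step, round(linear_schedule_lr(step), 4))
```

Under these assumptions the peak of 0.2 is reached after 147 steps, roughly the first tenth of the epoch.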
875
-
876
- ### Training Logs
877
- | Epoch | Step | Training Loss | Validation Loss | gooaq-1024-dev_cosine_map@100 | gooaq-512-dev_cosine_map@100 | gooaq-256-dev_cosine_map@100 | gooaq-128-dev_cosine_map@100 | gooaq-64-dev_cosine_map@100 | gooaq-32-dev_cosine_map@100 |
878
- |:------:|:----:|:-------------:|:---------------:|:-----------------------------:|:----------------------------:|:----------------------------:|:----------------------------:|:---------------------------:|:---------------------------:|
879
- | 0 | 0 | - | - | 0.2095 | 0.2010 | 0.1735 | 0.1381 | 0.0750 | 0.0331 |
880
- | 0.0007 | 1 | 34.953 | - | - | - | - | - | - | - |
881
- | 0.0682 | 100 | 16.2504 | - | - | - | - | - | - | - |
882
- | 0.1363 | 200 | 5.9502 | - | - | - | - | - | - | - |
883
- | 0.1704 | 250 | - | 1.6781 | 0.6791 | 0.6729 | 0.6619 | 0.6409 | 0.5904 | 0.4934 |
884
- | 0.2045 | 300 | 4.8411 | - | - | - | - | - | - | - |
885
- | 0.2727 | 400 | 4.336 | - | - | - | - | - | - | - |
886
- | 0.3408 | 500 | 4.0484 | 1.3935 | 0.7104 | 0.7055 | 0.6968 | 0.6756 | 0.6322 | 0.5358 |
887
- | 0.4090 | 600 | 3.8378 | - | - | - | - | - | - | - |
888
- | 0.4772 | 700 | 3.6765 | - | - | - | - | - | - | - |
889
- | 0.5112 | 750 | - | 1.2549 | 0.7246 | 0.7216 | 0.7133 | 0.6943 | 0.6482 | 0.5582 |
890
- | 0.5453 | 800 | 3.5439 | - | - | - | - | - | - | - |
891
- | 0.6135 | 900 | 3.4284 | - | - | - | - | - | - | - |
892
- | 0.6817 | 1000 | 3.3576 | 1.1656 | 0.7359 | 0.7338 | 0.7252 | 0.7040 | 0.6604 | 0.5715 |
893
- | 0.7498 | 1100 | 3.2456 | - | - | - | - | - | - | - |
894
- | 0.8180 | 1200 | 3.2014 | - | - | - | - | - | - | - |
895
- | 0.8521 | 1250 | - | 1.1133 | 0.7438 | 0.7398 | 0.7310 | 0.7099 | 0.6704 | 0.5796 |
896
- | 0.8862 | 1300 | 3.1536 | - | - | - | - | - | - | - |
897
- | 0.9543 | 1400 | 3.0696 | - | - | - | - | - | - | - |
898
- | 1.0 | 1467 | - | - | 0.7466 | 0.7434 | 0.7349 | 0.7134 | 0.6743 | 0.5857 |
899
-
900
-
901
- ### Environmental Impact
902
- Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
903
- - **Energy Consumed**: 0.017 kWh
904
- - **Carbon Emitted**: 0.006 kg of CO2
905
- - **Hours Used**: 0.109 hours
906
-
907
- ### Training Hardware
908
- - **On Cloud**: No
909
- - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
910
- - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
911
- - **RAM Size**: 31.78 GB
912
-
913
- ### Framework Versions
914
- - Python: 3.11.6
915
- - Sentence Transformers: 3.2.0.dev0
916
- - Transformers: 4.43.4
917
- - PyTorch: 2.5.0.dev20240807+cu121
918
- - Accelerate: 0.31.0
919
- - Datasets: 2.20.0
920
- - Tokenizers: 0.19.1
921
-
922
- ## Citation
923
-
924
- ### BibTeX
925
-
926
- #### Sentence Transformers
927
- ```bibtex
928
- @inproceedings{reimers-2019-sentence-bert,
929
- title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
930
- author = "Reimers, Nils and Gurevych, Iryna",
931
- booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
932
- month = "11",
933
- year = "2019",
934
- publisher = "Association for Computational Linguistics",
935
- url = "https://arxiv.org/abs/1908.10084",
936
- }
937
- ```
938
-
939
- #### MatryoshkaLoss
940
- ```bibtex
941
- @misc{kusupati2024matryoshka,
942
- title={Matryoshka Representation Learning},
943
- author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
944
- year={2024},
945
- eprint={2205.13147},
946
- archivePrefix={arXiv},
947
- primaryClass={cs.LG}
948
- }
949
- ```
950
-
951
- #### MultipleNegativesRankingLoss
952
- ```bibtex
953
- @misc{henderson2017efficient,
954
- title={Efficient Natural Language Response Suggestion for Smart Reply},
955
- author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
956
- year={2017},
957
- eprint={1705.00652},
958
- archivePrefix={arXiv},
959
- primaryClass={cs.CL}
960
- }
961
- ```
962
-
963
- <!--
964
- ## Glossary
965
-
966
- *Clearly define terms in order to be accessible across audiences.*
967
- -->
968
-
969
- <!--
970
- ## Model Card Authors
971
-
972
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
973
- -->
974
-
975
- <!--
976
- ## Model Card Contact
977
-
978
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
 
 
979
- -->
 
1
+ ---
2
+ datasets:
3
+ - sentence-transformers/gooaq
4
+ language:
5
+ - en
6
+ library_name: sentence-transformers
7
+ license: apache-2.0
8
+ metrics:
9
+ - cosine_accuracy@1
10
+ - cosine_accuracy@3
11
+ - cosine_accuracy@5
12
+ - cosine_accuracy@10
13
+ - cosine_precision@1
14
+ - cosine_precision@3
15
+ - cosine_precision@5
16
+ - cosine_precision@10
17
+ - cosine_recall@1
18
+ - cosine_recall@3
19
+ - cosine_recall@5
20
+ - cosine_recall@10
21
+ - cosine_ndcg@10
22
+ - cosine_mrr@10
23
+ - cosine_map@100
24
+ pipeline_tag: sentence-similarity
25
+ tags:
26
+ - sentence-transformers
27
+ - sentence-similarity
28
+ - feature-extraction
29
+ - generated_from_trainer
30
+ - dataset_size:3012496
31
+ - loss:MatryoshkaLoss
32
+ - loss:MultipleNegativesRankingLoss
33
+ widget:
34
+ - source_sentence: how to sign legal documents as power of attorney?
35
+ sentences:
36
+ - 'After the principal''s name, write “by” and then sign your own name. Under or
37
+ after the signature line, indicate your status as POA by including any of the
38
+ following identifiers: as POA, as Agent, as Attorney in Fact or as Power of Attorney.'
39
+ - '[''From the Home screen, swipe left to Apps.'', ''Tap Transfer my Data.'', ''Tap
40
+ Menu (...).'', ''Tap Export to SD card.'']'
41
+ - Ginger Dank Nugs (Grape) - 350mg. Feast your eyes on these unique and striking
42
+ gourmet chocolates; Coco Nugs created by Ginger Dank. Crafted to resemble perfect
43
+ nugs of cannabis, each of the 10 buds contains 35mg of THC. ... This is a perfect
44
+ product for both cannabis and chocolate lovers, who appreciate a little twist.
45
+ - source_sentence: how to delete vdom in fortigate?
46
+ sentences:
47
+ - Go to System -> VDOM -> VDOM2 and select 'Delete'. This VDOM is now successfully
48
+ removed from the configuration.
49
+ - 'Both combination birth control pills and progestin-only pills may cause headaches
50
+ as a side effect. Additional side effects of birth control pills may include:
51
+ breast tenderness. nausea.'
52
+ - White cheese tends to show imperfections more readily and as consumers got more
53
+ used to yellow-orange cheese, it became an expected option. Today, many cheddars
54
+ are yellow. While most cheesemakers use annatto, some use an artificial coloring
55
+ agent instead, according to Sachs.
56
+ - source_sentence: where are earthquakes most likely to occur on earth?
57
+ sentences:
58
+ - Zelle in the Bank of the America app is a fast, safe, and easy way to send and
59
+ receive money with family and friends who have a bank account in the U.S., all
60
+ with no fees. Money moves in minutes directly between accounts that are already
61
+ enrolled with Zelle.
62
+ - It takes about 3 days for a spacecraft to reach the Moon. During that time a spacecraft
63
+ travels at least 240,000 miles (386,400 kilometers) which is the distance between
64
+ Earth and the Moon.
65
+ - Most earthquakes occur along the edge of the oceanic and continental plates. The
66
+ earth's crust (the outer layer of the planet) is made up of several pieces, called
67
+ plates. The plates under the oceans are called oceanic plates and the rest are
68
+ continental plates.
69
+ - source_sentence: fix iphone is disabled connect to itunes without itunes?
70
+ sentences:
71
+ - To fix a disabled iPhone or iPad without iTunes, you have to erase your device.
72
+ Click on the "Erase iPhone" option and confirm your selection. Wait for a while
73
+ as the "Find My iPhone" feature will remotely erase your iOS device. Needless
74
+ to say, it will also disable its lock.
75
+ - How Māui brought fire to the world. One evening, after eating a hearty meal, Māui
76
+ lay beside his fire staring into the flames. ... In the middle of the night, while
77
+ everyone was sleeping, Māui went from village to village and extinguished all
78
+ the fires until not a single fire burned in the world.
79
+ - Angry Orchard makes a variety of year-round craft cider styles, including Angry
80
+ Orchard Crisp Apple, a fruit-forward hard cider that balances the sweetness of
81
+ culinary apples with dryness and bright acidity of bittersweet apples for a complex,
82
+ refreshing taste.
83
+ - source_sentence: how to reverse a video on tiktok that's not yours?
84
+ sentences:
85
+ - '[''Tap "Effects" at the bottom of your screen — it\''s an icon that looks like
86
+ a clock. Open the Effects menu. ... '', ''At the end of the new list that appears,
87
+ tap "Time." Select "Time" at the end. ... '', ''Select "Reverse" — you\''ll then
88
+ see a preview of your new, reversed video appear on the screen.'']'
89
+ - Franchise Facts Poke Bar has a franchise fee of up to $30,000, with a total initial
90
+ investment range of $157,800 to $438,000. The initial cost of a franchise includes
91
+ several fees -- Unlock this franchise to better understand the costs such as training
92
+ and territory fees.
93
+ - Relative age is the age of a rock layer (or the fossils it contains) compared
94
+ to other layers. It can be determined by looking at the position of rock layers.
95
+ Absolute age is the numeric age of a layer of rocks or fossils. Absolute age can
96
+ be determined by using radiometric dating.
97
+ co2_eq_emissions:
98
+ emissions: 6.448001991119035
99
+ energy_consumed: 0.0165885485310573
100
+ source: codecarbon
101
+ training_type: fine-tuning
102
+ on_cloud: false
103
+ cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
104
+ ram_total_size: 31.777088165283203
105
+ hours_used: 0.109
106
+ hardware_used: 1 x NVIDIA GeForce RTX 3090
107
+ model-index:
108
+ - name: Static Embeddings with BERT uncased tokenizer finetuned on GooAQ pairs
109
+ results:
110
+ - task:
111
+ type: information-retrieval
112
+ name: Information Retrieval
113
+ dataset:
114
+ name: gooaq 1024 dev
115
+ type: gooaq-1024-dev
116
+ metrics:
117
+ - type: cosine_accuracy@1
118
+ value: 0.6309
119
+ name: Cosine Accuracy@1
120
+ - type: cosine_accuracy@3
121
+ value: 0.8409
122
+ name: Cosine Accuracy@3
123
+ - type: cosine_accuracy@5
124
+ value: 0.8986
125
+ name: Cosine Accuracy@5
126
+ - type: cosine_accuracy@10
127
+ value: 0.9444
128
+ name: Cosine Accuracy@10
129
+ - type: cosine_precision@1
130
+ value: 0.6309
131
+ name: Cosine Precision@1
132
+ - type: cosine_precision@3
133
+ value: 0.28029999999999994
134
+ name: Cosine Precision@3
135
+ - type: cosine_precision@5
136
+ value: 0.17972000000000002
137
+ name: Cosine Precision@5
138
+ - type: cosine_precision@10
139
+ value: 0.09444000000000002
140
+ name: Cosine Precision@10
141
+ - type: cosine_recall@1
142
+ value: 0.6309
143
+ name: Cosine Recall@1
144
+ - type: cosine_recall@3
145
+ value: 0.8409
146
+ name: Cosine Recall@3
147
+ - type: cosine_recall@5
148
+ value: 0.8986
149
+ name: Cosine Recall@5
150
+ - type: cosine_recall@10
151
+ value: 0.9444
152
+ name: Cosine Recall@10
153
+ - type: cosine_ndcg@10
154
+ value: 0.7932643237589305
155
+ name: Cosine Ndcg@10
156
+ - type: cosine_mrr@10
157
+ value: 0.7440336111111036
158
+ name: Cosine Mrr@10
159
+ - type: cosine_map@100
160
+ value: 0.7465739001132767
161
+ name: Cosine Map@100
162
+ - task:
163
+ type: information-retrieval
164
+ name: Information Retrieval
165
+ dataset:
166
+ name: gooaq 512 dev
167
+ type: gooaq-512-dev
168
+ metrics:
169
+ - type: cosine_accuracy@1
170
+ value: 0.6271
171
+ name: Cosine Accuracy@1
172
+ - type: cosine_accuracy@3
173
+ value: 0.8366
174
+ name: Cosine Accuracy@3
175
+ - type: cosine_accuracy@5
176
+ value: 0.8946
177
+ name: Cosine Accuracy@5
178
+ - type: cosine_accuracy@10
179
+ value: 0.9431
180
+ name: Cosine Accuracy@10
181
+ - type: cosine_precision@1
182
+ value: 0.6271
183
+ name: Cosine Precision@1
184
+ - type: cosine_precision@3
185
+ value: 0.27886666666666665
186
+ name: Cosine Precision@3
187
+ - type: cosine_precision@5
188
+ value: 0.17892000000000002
189
+ name: Cosine Precision@5
190
+ - type: cosine_precision@10
191
+ value: 0.09431000000000002
192
+ name: Cosine Precision@10
193
+ - type: cosine_recall@1
194
+ value: 0.6271
195
+ name: Cosine Recall@1
196
+ - type: cosine_recall@3
197
+ value: 0.8366
198
+ name: Cosine Recall@3
199
+ - type: cosine_recall@5
200
+ value: 0.8946
201
+ name: Cosine Recall@5
202
+ - type: cosine_recall@10
203
+ value: 0.9431
204
+ name: Cosine Recall@10
205
+ - type: cosine_ndcg@10
206
+ value: 0.7904860196985286
207
+ name: Cosine Ndcg@10
208
+ - type: cosine_mrr@10
209
+ value: 0.7408453174603101
210
+ name: Cosine Mrr@10
211
+ - type: cosine_map@100
212
+ value: 0.7434337897783787
213
+ name: Cosine Map@100
214
+ - task:
215
+ type: information-retrieval
216
+ name: Information Retrieval
217
+ dataset:
218
+ name: gooaq 256 dev
219
+ type: gooaq-256-dev
220
+ metrics:
221
+ - type: cosine_accuracy@1
222
+ value: 0.6192
223
+ name: Cosine Accuracy@1
224
+ - type: cosine_accuracy@3
225
+ value: 0.8235
226
+ name: Cosine Accuracy@3
227
+ - type: cosine_accuracy@5
228
+ value: 0.8866
229
+ name: Cosine Accuracy@5
230
+ - type: cosine_accuracy@10
231
+ value: 0.9364
232
+ name: Cosine Accuracy@10
233
+ - type: cosine_precision@1
234
+ value: 0.6192
235
+ name: Cosine Precision@1
236
+ - type: cosine_precision@3
237
+ value: 0.27449999999999997
238
+ name: Cosine Precision@3
239
+ - type: cosine_precision@5
240
+ value: 0.17732000000000003
241
+ name: Cosine Precision@5
242
+ - type: cosine_precision@10
243
+ value: 0.09364000000000001
244
+ name: Cosine Precision@10
245
+ - type: cosine_recall@1
246
+ value: 0.6192
247
+ name: Cosine Recall@1
248
+ - type: cosine_recall@3
249
+ value: 0.8235
250
+ name: Cosine Recall@3
251
+ - type: cosine_recall@5
252
+ value: 0.8866
253
+ name: Cosine Recall@5
254
+ - type: cosine_recall@10
255
+ value: 0.9364
256
+ name: Cosine Recall@10
257
+ - type: cosine_ndcg@10
258
+ value: 0.7821476540310974
259
+ name: Cosine Ndcg@10
260
+ - type: cosine_mrr@10
261
+ value: 0.7321259126984055
262
+ name: Cosine Mrr@10
263
+ - type: cosine_map@100
264
+ value: 0.7348893313013708
265
+ name: Cosine Map@100
266
+ - task:
267
+ type: information-retrieval
268
+ name: Information Retrieval
269
+ dataset:
270
+ name: gooaq 128 dev
271
+ type: gooaq-128-dev
272
+ metrics:
273
+ - type: cosine_accuracy@1
274
+ value: 0.5942
275
+ name: Cosine Accuracy@1
276
+ - type: cosine_accuracy@3
277
+ value: 0.804
278
+ name: Cosine Accuracy@3
279
+ - type: cosine_accuracy@5
280
+ value: 0.8721
281
+ name: Cosine Accuracy@5
282
+ - type: cosine_accuracy@10
283
+ value: 0.9249
284
+ name: Cosine Accuracy@10
285
+ - type: cosine_precision@1
286
+ value: 0.5942
287
+ name: Cosine Precision@1
288
+ - type: cosine_precision@3
289
+ value: 0.268
290
+ name: Cosine Precision@3
291
+ - type: cosine_precision@5
292
+ value: 0.17442000000000002
293
+ name: Cosine Precision@5
294
+ - type: cosine_precision@10
295
+ value: 0.09249
296
+ name: Cosine Precision@10
297
+ - type: cosine_recall@1
298
+ value: 0.5942
299
+ name: Cosine Recall@1
300
+ - type: cosine_recall@3
301
+ value: 0.804
302
+ name: Cosine Recall@3
303
+ - type: cosine_recall@5
304
+ value: 0.8721
305
+ name: Cosine Recall@5
306
+ - type: cosine_recall@10
307
+ value: 0.9249
308
+ name: Cosine Recall@10
309
+ - type: cosine_ndcg@10
310
+ value: 0.7627845665665897
311
+ name: Cosine Ndcg@10
312
+ - type: cosine_mrr@10
313
+ value: 0.7103426587301529
314
+ name: Cosine Mrr@10
315
+ - type: cosine_map@100
316
+ value: 0.7133975871277517
317
+ name: Cosine Map@100
318
+ - task:
319
+ type: information-retrieval
320
+ name: Information Retrieval
321
+ dataset:
322
+ name: gooaq 64 dev
323
+ type: gooaq-64-dev
324
+ metrics:
325
+ - type: cosine_accuracy@1
326
+ value: 0.556
327
+ name: Cosine Accuracy@1
328
+ - type: cosine_accuracy@3
329
+ value: 0.7553
330
+ name: Cosine Accuracy@3
331
+ - type: cosine_accuracy@5
332
+ value: 0.8267
333
+ name: Cosine Accuracy@5
334
+ - type: cosine_accuracy@10
335
+ value: 0.8945
336
+ name: Cosine Accuracy@10
337
+ - type: cosine_precision@1
338
+ value: 0.556
339
+ name: Cosine Precision@1
340
+ - type: cosine_precision@3
341
+ value: 0.25176666666666664
342
+ name: Cosine Precision@3
343
+ - type: cosine_precision@5
344
+ value: 0.16534000000000001
345
+ name: Cosine Precision@5
346
+ - type: cosine_precision@10
347
+ value: 0.08945
348
+ name: Cosine Precision@10
349
+ - type: cosine_recall@1
350
+ value: 0.556
351
+ name: Cosine Recall@1
352
+ - type: cosine_recall@3
353
+ value: 0.7553
354
+ name: Cosine Recall@3
355
+ - type: cosine_recall@5
356
+ value: 0.8267
357
+ name: Cosine Recall@5
358
+ - type: cosine_recall@10
359
+ value: 0.8945
360
+ name: Cosine Recall@10
361
+ - type: cosine_ndcg@10
362
+ value: 0.7246435400765202
363
+ name: Cosine Ndcg@10
364
+ - type: cosine_mrr@10
365
+ value: 0.6701957142857087
366
+ name: Cosine Mrr@10
367
+ - type: cosine_map@100
368
+ value: 0.6743443703166442
369
+ name: Cosine Map@100
370
+ - task:
371
+ type: information-retrieval
372
+ name: Information Retrieval
373
+ dataset:
374
+ name: gooaq 32 dev
375
+ type: gooaq-32-dev
376
+ metrics:
377
+ - type: cosine_accuracy@1
378
+ value: 0.4628
379
+ name: Cosine Accuracy@1
380
+ - type: cosine_accuracy@3
381
+ value: 0.6619
382
+ name: Cosine Accuracy@3
383
+ - type: cosine_accuracy@5
384
+ value: 0.7415
385
+ name: Cosine Accuracy@5
386
+ - type: cosine_accuracy@10
387
+ value: 0.8241
388
+ name: Cosine Accuracy@10
389
+ - type: cosine_precision@1
390
+ value: 0.4628
391
+ name: Cosine Precision@1
392
+ - type: cosine_precision@3
393
+ value: 0.2206333333333333
394
+ name: Cosine Precision@3
395
+ - type: cosine_precision@5
396
+ value: 0.1483
397
+ name: Cosine Precision@5
398
+ - type: cosine_precision@10
399
+ value: 0.08241
400
+ name: Cosine Precision@10
401
+ - type: cosine_recall@1
402
+ value: 0.4628
403
+ name: Cosine Recall@1
404
+ - type: cosine_recall@3
405
+ value: 0.6619
406
+ name: Cosine Recall@3
407
+ - type: cosine_recall@5
408
+ value: 0.7415
409
+ name: Cosine Recall@5
410
+ - type: cosine_recall@10
411
+ value: 0.8241
412
+ name: Cosine Recall@10
413
+ - type: cosine_ndcg@10
414
+ value: 0.6387155548290799
415
+ name: Cosine Ndcg@10
416
+ - type: cosine_mrr@10
417
+ value: 0.5797731349206319
418
+ name: Cosine Mrr@10
419
+ - type: cosine_map@100
420
+ value: 0.5857231820662888
421
+ name: Cosine Map@100
422
+ ---
423
+
424
+ # Static Embeddings with BERT uncased tokenizer finetuned on GooAQ pairs
425
+
426
+ This is a [sentence-transformers](https://www.SBERT.net) model trained on the [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
427
+
428
+ This model was trained using the [train_script.py](train_script.py) code.
429
+
430
+ ## Model Details
431
+
432
+ ### Model Description
433
+ - **Model Type:** Sentence Transformer
434
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
435
+ - **Maximum Sequence Length:** inf tokens
436
+ - **Output Dimensionality:** 1024 tokens
437
+ - **Similarity Function:** Cosine Similarity
438
+ - **Training Dataset:**
439
+ - [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
440
+ - **Language:** en
441
+ - **License:** apache-2.0
442
+
443
+ ### Model Sources
444
+
445
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
446
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
447
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
448
+
449
+ ### Full Model Architecture
450
+
451
+ ```
452
+ SentenceTransformer(
453
+ (0): StaticEmbedding(
454
+ (embedding): EmbeddingBag(30522, 1024, mode='mean')
455
+ )
456
+ )
457
+ ```
458
+
459
+ ## Usage
460
+
461
+ ### Direct Usage (Sentence Transformers)
462
+
463
+ First install the Sentence Transformers library:
464
+
465
+ ```bash
466
+ pip install -U sentence-transformers
467
+ ```
468
+
469
+ Then you can load this model and run inference.
470
+ ```python
471
+ from sentence_transformers import SentenceTransformer
472
+
473
+ # Download from the 🤗 Hub
474
+ model = SentenceTransformer("tomaarsen/static-bert-uncased-gooaq")
475
+ # Run inference
476
+ sentences = [
477
+ "how to reverse a video on tiktok that's not yours?",
478
+ '[\'Tap "Effects" at the bottom of your screen — it\\\'s an icon that looks like a clock. Open the Effects menu. ... \', \'At the end of the new list that appears, tap "Time." Select "Time" at the end. ... \', \'Select "Reverse" — you\\\'ll then see a preview of your new, reversed video appear on the screen.\']',
479
+ 'Relative age is the age of a rock layer (or the fossils it contains) compared to other layers. It can be determined by looking at the position of rock layers. Absolute age is the numeric age of a layer of rocks or fossils. Absolute age can be determined by using radiometric dating.',
480
+ ]
481
+ embeddings = model.encode(sentences)
482
+ print(embeddings.shape)
483
+ # [3, 1024]
484
+
485
+ # Get the similarity scores for the embeddings
486
+ similarities = model.similarity(embeddings, embeddings)
487
+ print(similarities.shape)
488
+ # [3, 3]
489
+ ```
490
+
491
+ <!--
492
+ ### Direct Usage (Transformers)
493
+
494
+ <details><summary>Click to see the direct usage in Transformers</summary>
495
+
496
+ </details>
497
+ -->
498
+
499
+ <!--
500
+ ### Downstream Usage (Sentence Transformers)
501
+
502
+ You can finetune this model on your own dataset.
503
+
504
+ <details><summary>Click to expand</summary>
505
+
506
+ </details>
507
+ -->
508
+
509
+ <!--
510
+ ### Out-of-Scope Use
511
+
512
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
513
+ -->
514
+
515
+ ## Evaluation
516
+
517
+ ### Metrics
518
+
519
+ #### Information Retrieval
520
+ * Dataset: `gooaq-1024-dev`
521
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
522
+
523
+ | Metric | Value |
524
+ |:--------------------|:-----------|
525
+ | cosine_accuracy@1 | 0.6309 |
526
+ | cosine_accuracy@3 | 0.8409 |
527
+ | cosine_accuracy@5 | 0.8986 |
528
+ | cosine_accuracy@10 | 0.9444 |
529
+ | cosine_precision@1 | 0.6309 |
530
+ | cosine_precision@3 | 0.2803 |
531
+ | cosine_precision@5 | 0.1797 |
532
+ | cosine_precision@10 | 0.0944 |
533
+ | cosine_recall@1 | 0.6309 |
534
+ | cosine_recall@3 | 0.8409 |
535
+ | cosine_recall@5 | 0.8986 |
536
+ | cosine_recall@10 | 0.9444 |
537
+ | cosine_ndcg@10 | 0.7933 |
538
+ | cosine_mrr@10 | 0.744 |
539
+ | **cosine_map@100** | **0.7466** |
540
+
541
+ #### Information Retrieval
542
+ * Dataset: `gooaq-512-dev`
543
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
544
+
545
+ | Metric | Value |
546
+ |:--------------------|:-----------|
547
+ | cosine_accuracy@1 | 0.6271 |
548
+ | cosine_accuracy@3 | 0.8366 |
549
+ | cosine_accuracy@5 | 0.8946 |
550
+ | cosine_accuracy@10 | 0.9431 |
551
+ | cosine_precision@1 | 0.6271 |
552
+ | cosine_precision@3 | 0.2789 |
553
+ | cosine_precision@5 | 0.1789 |
554
+ | cosine_precision@10 | 0.0943 |
555
+ | cosine_recall@1 | 0.6271 |
556
+ | cosine_recall@3 | 0.8366 |
557
+ | cosine_recall@5 | 0.8946 |
558
+ | cosine_recall@10 | 0.9431 |
559
+ | cosine_ndcg@10 | 0.7905 |
560
+ | cosine_mrr@10 | 0.7408 |
561
+ | **cosine_map@100** | **0.7434** |
562
+
563
+ #### Information Retrieval
564
+ * Dataset: `gooaq-256-dev`
565
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
566
+
567
+ | Metric | Value |
568
+ |:--------------------|:-----------|
569
+ | cosine_accuracy@1 | 0.6192 |
570
+ | cosine_accuracy@3 | 0.8235 |
571
+ | cosine_accuracy@5 | 0.8866 |
572
+ | cosine_accuracy@10 | 0.9364 |
573
+ | cosine_precision@1 | 0.6192 |
574
+ | cosine_precision@3 | 0.2745 |
575
+ | cosine_precision@5 | 0.1773 |
576
+ | cosine_precision@10 | 0.0936 |
577
+ | cosine_recall@1 | 0.6192 |
578
+ | cosine_recall@3 | 0.8235 |
579
+ | cosine_recall@5 | 0.8866 |
580
+ | cosine_recall@10 | 0.9364 |
581
+ | cosine_ndcg@10 | 0.7821 |
582
+ | cosine_mrr@10 | 0.7321 |
583
+ | **cosine_map@100** | **0.7349** |
584
+
585
+ #### Information Retrieval
586
+ * Dataset: `gooaq-128-dev`
587
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
588
+
589
+ | Metric | Value |
590
+ |:--------------------|:-----------|
591
+ | cosine_accuracy@1 | 0.5942 |
592
+ | cosine_accuracy@3 | 0.804 |
593
+ | cosine_accuracy@5 | 0.8721 |
594
+ | cosine_accuracy@10 | 0.9249 |
595
+ | cosine_precision@1 | 0.5942 |
596
+ | cosine_precision@3 | 0.268 |
597
+ | cosine_precision@5 | 0.1744 |
598
+ | cosine_precision@10 | 0.0925 |
599
+ | cosine_recall@1 | 0.5942 |
600
+ | cosine_recall@3 | 0.804 |
601
+ | cosine_recall@5 | 0.8721 |
602
+ | cosine_recall@10 | 0.9249 |
603
+ | cosine_ndcg@10 | 0.7628 |
604
+ | cosine_mrr@10 | 0.7103 |
605
+ | **cosine_map@100** | **0.7134** |
606
+
607
+ #### Information Retrieval
608
+ * Dataset: `gooaq-64-dev`
609
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
610
+
611
+ | Metric | Value |
612
+ |:--------------------|:-----------|
613
+ | cosine_accuracy@1 | 0.556 |
614
+ | cosine_accuracy@3 | 0.7553 |
615
+ | cosine_accuracy@5 | 0.8267 |
616
+ | cosine_accuracy@10 | 0.8945 |
617
+ | cosine_precision@1 | 0.556 |
618
+ | cosine_precision@3 | 0.2518 |
619
+ | cosine_precision@5 | 0.1653 |
620
+ | cosine_precision@10 | 0.0895 |
621
+ | cosine_recall@1 | 0.556 |
622
+ | cosine_recall@3 | 0.7553 |
623
+ | cosine_recall@5 | 0.8267 |
624
+ | cosine_recall@10 | 0.8945 |
625
+ | cosine_ndcg@10 | 0.7246 |
626
+ | cosine_mrr@10 | 0.6702 |
627
+ | **cosine_map@100** | **0.6743** |
628
+
629
+ #### Information Retrieval
630
+ * Dataset: `gooaq-32-dev`
631
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
632
+
633
+ | Metric | Value |
634
+ |:--------------------|:-----------|
635
+ | cosine_accuracy@1 | 0.4628 |
636
+ | cosine_accuracy@3 | 0.6619 |
637
+ | cosine_accuracy@5 | 0.7415 |
638
+ | cosine_accuracy@10 | 0.8241 |
639
+ | cosine_precision@1 | 0.4628 |
640
+ | cosine_precision@3 | 0.2206 |
641
+ | cosine_precision@5 | 0.1483 |
642
+ | cosine_precision@10 | 0.0824 |
643
+ | cosine_recall@1 | 0.4628 |
644
+ | cosine_recall@3 | 0.6619 |
645
+ | cosine_recall@5 | 0.7415 |
646
+ | cosine_recall@10 | 0.8241 |
647
+ | cosine_ndcg@10 | 0.6387 |
648
+ | cosine_mrr@10 | 0.5798 |
649
+ | **cosine_map@100** | **0.5857** |
650
+
651
+ <!--
652
+ ## Bias, Risks and Limitations
653
+
654
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
655
+ -->
656
+
657
+ <!--
658
+ ### Recommendations
659
+
660
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
661
+ -->
662
+
663
+ ## Training Details
664
+
665
+ ### Training Dataset
666
+
667
+ #### gooaq
668
+
669
+ * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
670
+ * Size: 3,012,496 training samples
671
+ * Columns: <code>question</code> and <code>answer</code>
672
+ * Approximate statistics based on the first 1000 samples:
673
+ | | question | answer |
674
+ |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
675
+ | type | string | string |
676
+ | details | <ul><li>min: 18 characters</li><li>mean: 43.23 characters</li><li>max: 96 characters</li></ul> | <ul><li>min: 55 characters</li><li>mean: 253.36 characters</li><li>max: 371 characters</li></ul> |
677
+ * Samples:
678
+ | question | answer |
679
+ |:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
680
+ | <code>what is the difference between broilers and layers?</code> | <code>An egg laying poultry is called egger or layer whereas broilers are reared for obtaining meat. So a layer should be able to produce more number of large sized eggs, without growing too much. On the other hand, a broiler should yield more meat and hence should be able to grow well.</code> |
681
+ | <code>what is the difference between chronological order and spatial order?</code> | <code>As a writer, you should always remember that unlike chronological order and the other organizational methods for data, spatial order does not take into account the time. Spatial order is primarily focused on the location. All it does is take into account the location of objects and not the time.</code> |
682
+ | <code>is kamagra same as viagra?</code> | <code>Kamagra is thought to contain the same active ingredient as Viagra, sildenafil citrate. In theory, it should work in much the same way as Viagra, taking about 45 minutes to take effect, and lasting for around 4-6 hours. However, this will vary from person to person.</code> |
683
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
684
+ ```json
+ {
+     "loss": "MultipleNegativesRankingLoss",
+     "matryoshka_dims": [
+         1024,
+         512,
+         256,
+         128,
+         64,
+         32
+     ],
+     "matryoshka_weights": [
+         1,
+         1,
+         1,
+         1,
+         1,
+         1
+     ],
+     "n_dims_per_step": -1
+ }
+ ```
+
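As a rough illustration of what this configuration means, the sketch below computes an in-batch-negatives ranking loss on embeddings truncated to each Matryoshka dimension and sums the results using the listed weights. It is a simplified pure-Python approximation, not the Sentence Transformers implementation: the helper names, the toy 4-dimensional embeddings, and the similarity scale of 20 are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_ranking_loss(questions, answers, scale=20.0):
    """Mean cross-entropy where answers[i] is the positive for questions[i]
    and every other answer in the batch acts as a negative.
    The scale of 20.0 is an assumed value for this sketch."""
    total = 0.0
    for i, q in enumerate(questions):
        logits = [scale * cosine(q, a) for a in answers]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(questions)


def matryoshka_loss(questions, answers, dims, weights):
    """Weighted sum of the ranking loss over embeddings truncated to each dim."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_q = [q[:dim] for q in questions]
        truncated_a = [a[:dim] for a in answers]
        loss += weight * in_batch_ranking_loss(truncated_q, truncated_a)
    return loss


# Toy 4-dimensional embeddings (hypothetical values, two question-answer pairs).
toy_questions = [[0.9, 0.1, 0.3, 0.2], [0.1, 0.8, 0.2, 0.4]]
toy_answers = [[0.8, 0.2, 0.1, 0.3], [0.2, 0.9, 0.3, 0.1]]

toy_loss = matryoshka_loss(toy_questions, toy_answers, dims=[4, 2], weights=[1, 1])
print(f"toy Matryoshka loss: {toy_loss:.4f}")
```

Because every weight is 1, each truncation level contributes equally to the total, so the leading dimensions are pushed to work well on their own as well as in the full 1024-dimensional embedding.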
+ ### Evaluation Dataset
708
+
709
+ #### gooaq
710
+
711
+ * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
712
+ * Size: 3,012,496 evaluation samples
713
+ * Columns: <code>question</code> and <code>answer</code>
714
+ * Approximate statistics based on the first 1000 samples:
715
+ | | question | answer |
716
+ |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
717
+ | type | string | string |
718
+ | details | <ul><li>min: 18 characters</li><li>mean: 43.17 characters</li><li>max: 98 characters</li></ul> | <ul><li>min: 51 characters</li><li>mean: 254.12 characters</li><li>max: 360 characters</li></ul> |
719
+ * Samples:
720
+ | question | answer |
721
+ |:-----------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
722
+ | <code>how do i program my directv remote with my tv?</code> | <code>['Press MENU on your remote.', 'Select Settings & Help > Settings > Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete programming.']</code> |
723
+ | <code>are rodrigues fruit bats nocturnal?</code> | <code>Before its numbers were threatened by habitat destruction, storms, and hunting, some of those groups could number 500 or more members. Sunrise, sunset. Rodrigues fruit bats are most active at dawn, at dusk, and at night.</code> |
724
+ | <code>why does your heart rate increase during exercise bbc bitesize?</code> | <code>During exercise there is an increase in physical activity and muscle cells respire more than they do when the body is at rest. The heart rate increases during exercise. The rate and depth of breathing increases - this makes sure that more oxygen is absorbed into the blood, and more carbon dioxide is removed from it.</code> |
725
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
726
+ ```json
+ {
+     "loss": "MultipleNegativesRankingLoss",
+     "matryoshka_dims": [
+         1024,
+         512,
+         256,
+         128,
+         64,
+         32
+     ],
+     "matryoshka_weights": [
+         1,
+         1,
+         1,
+         1,
+         1,
+         1
+     ],
+     "n_dims_per_step": -1
+ }
+ ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 2048
+ - `per_device_eval_batch_size`: 2048
+ - `learning_rate`: 0.2
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `batch_sampler`: no_duplicates
+
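To make the interplay of `learning_rate`, `warmup_ratio`, and the linear scheduler (`lr_scheduler_type`: linear in the full list) concrete, here is a small sketch of the resulting learning-rate curve. It assumes the 1,467 optimizer steps reported in the training logs, assumes that the warmup length is the ratio times the step count rounded up, and uses a hypothetical helper name. Treat it as an approximation of the schedule, not the trainer's own code.

```python
import math

BASE_LR = 0.2        # `learning_rate` from the list above
WARMUP_RATIO = 0.1   # `warmup_ratio` from the list above
TOTAL_STEPS = 1467   # assumed: final step reported in the training logs

# Assumption: warmup length is the ratio times the step count, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the rate at a few points: start, mid-warmup, peak, mid-decay, end.
for step in (0, 100, WARMUP_STEPS, 807, TOTAL_STEPS):
    print(step, round(lr_at(step), 4))
```

Under these assumptions the rate climbs from zero to its peak of 0.2 over the first 147 steps and then falls linearly back to zero by the last step.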
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 2048
+ - `per_device_eval_batch_size`: 2048
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 0.2
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `eval_use_gather_object`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
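The `batch_sampler`: no_duplicates setting matters for in-batch-negatives training, where a repeated question or answer inside one batch would be treated as a negative for its own match. The sketch below shows the idea with a simplified greedy sampler. The function name and toy data are hypothetical, and the sampler in Sentence Transformers may differ in details such as shuffling and handling of the last batch.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily group (question, answer) pairs into batches so that no text
    appears twice within a batch. Pairs that would collide with the current
    batch, or that do not fit, are deferred to a later batch. A trailing
    incomplete batch is kept in this sketch."""
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for question, answer in remaining:
            collides = question in seen or answer in seen
            if collides or len(batch) == batch_size:
                deferred.append((question, answer))
                continue
            batch.append((question, answer))
            seen.update((question, answer))
        batches.append(batch)
        remaining = deferred
    return batches


# Toy data (hypothetical): "q1" and "a2" each occur in two different pairs.
toy_pairs = [("q1", "a1"), ("q2", "a2"), ("q1", "a3"), ("q3", "a2"), ("q4", "a4")]
toy_batches = no_duplicate_batches(toy_pairs, batch_size=3)
for batch in toy_batches:
    print(batch)
```

In the toy run the two colliding pairs are pushed into a second batch, so neither batch contains the same question or answer twice.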
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | gooaq-1024-dev_cosine_map@100 | gooaq-512-dev_cosine_map@100 | gooaq-256-dev_cosine_map@100 | gooaq-128-dev_cosine_map@100 | gooaq-64-dev_cosine_map@100 | gooaq-32-dev_cosine_map@100 |
+ |:------:|:----:|:-------------:|:---------------:|:-----------------------------:|:----------------------------:|:----------------------------:|:----------------------------:|:---------------------------:|:---------------------------:|
+ | 0 | 0 | - | - | 0.2095 | 0.2010 | 0.1735 | 0.1381 | 0.0750 | 0.0331 |
+ | 0.0007 | 1 | 34.953 | - | - | - | - | - | - | - |
+ | 0.0682 | 100 | 16.2504 | - | - | - | - | - | - | - |
+ | 0.1363 | 200 | 5.9502 | - | - | - | - | - | - | - |
+ | 0.1704 | 250 | - | 1.6781 | 0.6791 | 0.6729 | 0.6619 | 0.6409 | 0.5904 | 0.4934 |
+ | 0.2045 | 300 | 4.8411 | - | - | - | - | - | - | - |
+ | 0.2727 | 400 | 4.336 | - | - | - | - | - | - | - |
+ | 0.3408 | 500 | 4.0484 | 1.3935 | 0.7104 | 0.7055 | 0.6968 | 0.6756 | 0.6322 | 0.5358 |
+ | 0.4090 | 600 | 3.8378 | - | - | - | - | - | - | - |
+ | 0.4772 | 700 | 3.6765 | - | - | - | - | - | - | - |
+ | 0.5112 | 750 | - | 1.2549 | 0.7246 | 0.7216 | 0.7133 | 0.6943 | 0.6482 | 0.5582 |
+ | 0.5453 | 800 | 3.5439 | - | - | - | - | - | - | - |
+ | 0.6135 | 900 | 3.4284 | - | - | - | - | - | - | - |
+ | 0.6817 | 1000 | 3.3576 | 1.1656 | 0.7359 | 0.7338 | 0.7252 | 0.7040 | 0.6604 | 0.5715 |
+ | 0.7498 | 1100 | 3.2456 | - | - | - | - | - | - | - |
+ | 0.8180 | 1200 | 3.2014 | - | - | - | - | - | - | - |
+ | 0.8521 | 1250 | - | 1.1133 | 0.7438 | 0.7398 | 0.7310 | 0.7099 | 0.6704 | 0.5796 |
+ | 0.8862 | 1300 | 3.1536 | - | - | - | - | - | - | - |
+ | 0.9543 | 1400 | 3.0696 | - | - | - | - | - | - | - |
+ | 1.0 | 1467 | - | - | 0.7466 | 0.7434 | 0.7349 | 0.7134 | 0.6743 | 0.5857 |
+
+
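The final row of the table makes the cost of truncation easy to quantify. As a small worked example, the snippet below divides each dimension's final `cosine_map@100` by the 1024-dimensional score. The variable names are illustrative, and the scores are copied from the last row of the training logs.

```python
# Final-step cosine_map@100 per embedding dimension, copied from the last table row.
final_map_at_100 = {
    1024: 0.7466,
    512: 0.7434,
    256: 0.7349,
    128: 0.7134,
    64: 0.6743,
    32: 0.5857,
}

full_score = final_map_at_100[1024]

# Fraction of the full-size score that each truncated size retains.
retained = {dim: score / full_score for dim, score in final_map_at_100.items()}

for dim, fraction in retained.items():
    print(f"{dim:>4} dims: {fraction:.1%} of the 1024-dim score")
```

On these numbers, 512 dimensions keep about 99.6% of the full score, while 32 dimensions keep about 78%.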
+ ### Environmental Impact
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
+ - **Energy Consumed**: 0.017 kWh
+ - **Carbon Emitted**: 0.006 kg of CO2
+ - **Hours Used**: 0.109 hours
+
+ ### Training Hardware
+ - **On Cloud**: No
+ - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
+ - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
+ - **RAM Size**: 31.78 GB
+
+ ### Framework Versions
+ - Python: 3.11.6
+ - Sentence Transformers: 3.2.0.dev0
+ - Transformers: 4.43.4
+ - PyTorch: 2.5.0.dev20240807+cu121
+ - Accelerate: 0.31.0
+ - Datasets: 2.20.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
  -->