All layers trained for every step.

AdaptiveLayerLoss(
    model=model,
    loss=train_loss,
    n_layers_per_step = -1,
    last_layer_weight = 1.5,
    prior_layers_weight = 0.1,
    kl_div_weight = 0.5,
    kl_temperature = 1,
)

num_epochs = 2
learning_rate = 2e-5
warmup_ratio = 0.25
weight_decay = 1e-6
schedule = "cosine_with_restarts"
num_cycles = 3
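For context, a minimal sketch of how the loss configuration above maps onto the Sentence Transformers API. The inner `train_loss` here is a hypothetical placeholder (the actual run wrapped several different losses, as the model card below documents), and the base checkpoint is taken from the card title:

```python
# Sketch only: wrap a placeholder inner loss in AdaptiveLayerLoss with the
# hyperparameters from this commit message. n_layers_per_step=-1 means every
# transformer layer contributes a loss term at every training step.
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import AdaptiveLayerLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("microsoft/deberta-v3-small")
train_loss = MultipleNegativesRankingLoss(model)  # placeholder inner loss

loss = AdaptiveLayerLoss(
    model=model,
    loss=train_loss,
    n_layers_per_step=-1,
    last_layer_weight=1.5,
    prior_layers_weight=0.1,
    kl_div_weight=0.5,
    kl_temperature=1,
)
```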
- README.md +450 -105
- pytorch_model.bin +2 -2
README.md
CHANGED
@@ -7,7 +7,7 @@ tags:
|
|
7 |
- sentence-similarity
|
8 |
- feature-extraction
|
9 |
- generated_from_trainer
|
10 |
-
- dataset_size:
|
11 |
- loss:AdaptiveLayerLoss
|
12 |
- loss:CoSENTLoss
|
13 |
- loss:GISTEmbedLoss
|
@@ -30,36 +30,182 @@ datasets:
|
|
30 |
- sentence-transformers/trivia-qa
|
31 |
- sentence-transformers/quora-duplicates
|
32 |
- sentence-transformers/gooaq
|
33 |
widget:
|
34 |
-
- source_sentence:
|
|
|
35 |
sentences:
|
36 |
-
-
|
37 |
-
-
|
38 |
-
-
|
39 |
-
|
40 |
-
|
41 |
sentences:
|
42 |
-
-
|
43 |
- A pair of people play video games together on a couch.
|
44 |
-
-
|
45 |
-
- source_sentence: A
|
|
|
46 |
sentences:
|
47 |
-
- A
|
48 |
-
-
|
49 |
-
- A
|
50 |
-
- source_sentence: A
|
51 |
-
|
52 |
sentences:
|
53 |
-
-
|
54 |
-
- A
|
55 |
-
-
|
56 |
-
- source_sentence:
|
57 |
sentences:
|
58 |
-
-
|
59 |
-
|
60 |
-
-
|
61 |
-
|
62 |
pipeline_tag: sentence-similarity
|
63 |
---
|
64 |
|
65 |
# SentenceTransformer based on microsoft/deberta-v3-small
|
@@ -127,9 +273,9 @@ from sentence_transformers import SentenceTransformer
|
|
127 |
model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")
|
128 |
# Run inference
|
129 |
sentences = [
|
130 |
-
'
|
131 |
-
"
|
132 |
-
'
|
133 |
]
|
134 |
embeddings = model.encode(sentences)
|
135 |
print(embeddings.shape)
|
@@ -165,6 +311,78 @@ You can finetune this model on your own dataset.
|
|
165 |
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
|
166 |
-->
|
167 |
|
|
168 |
<!--
|
169 |
## Bias, Risks and Limitations
|
170 |
|
@@ -184,7 +402,7 @@ You can finetune this model on your own dataset.
|
|
184 |
#### nli-pairs
|
185 |
|
186 |
* Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
|
187 |
-
* Size:
|
188 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
189 |
* Approximate statistics based on the first 1000 samples:
|
190 |
| | sentence1 | sentence2 |
|
@@ -236,19 +454,19 @@ You can finetune this model on your own dataset.
|
|
236 |
#### vitaminc-pairs
|
237 |
|
238 |
* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
|
239 |
-
* Size:
|
240 |
* Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
|
241 |
* Approximate statistics based on the first 1000 samples:
|
242 |
-
| | label | sentence1 | sentence2
|
243 |
-
|
244 |
-
| type | int | string | string
|
245 |
-
| details | <ul><li>1: 100.00%</li></ul> | <ul><li>min:
|
246 |
* Samples:
|
247 |
-
| label | sentence1
|
248 |
-
|
249 |
-
| <code>1</code> | <code>
|
250 |
-
| <code>1</code> | <code>
|
251 |
-
| <code>1</code> | <code>
|
252 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
253 |
```json
|
254 |
{
|
@@ -264,19 +482,19 @@ You can finetune this model on your own dataset.
|
|
264 |
#### qnli-contrastive
|
265 |
|
266 |
* Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
|
267 |
-
* Size:
|
268 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
|
269 |
* Approximate statistics based on the first 1000 samples:
|
270 |
-
| | sentence1 | sentence2
|
271 |
-
|
272 |
-
| type | string | string
|
273 |
-
| details | <ul><li>min: 6 tokens</li><li>mean: 13.
|
274 |
* Samples:
|
275 |
-
| sentence1
|
276 |
-
|
277 |
-
| <code>
|
278 |
-
| <code>
|
279 |
-
| <code>What
|
280 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
281 |
```json
|
282 |
{
|
@@ -292,19 +510,19 @@ You can finetune this model on your own dataset.
|
|
292 |
#### scitail-pairs-qa
|
293 |
|
294 |
* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
|
295 |
-
* Size:
|
296 |
* Columns: <code>sentence2</code> and <code>sentence1</code>
|
297 |
* Approximate statistics based on the first 1000 samples:
|
298 |
-
| | sentence2
|
299 |
-
|
300 |
-
| type | string
|
301 |
-
| details | <ul><li>min: 7 tokens</li><li>mean:
|
302 |
* Samples:
|
303 |
-
| sentence2
|
304 |
-
|
305 |
-
| <code>
|
306 |
-
| <code>
|
307 |
-
| <code>
|
308 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
309 |
```json
|
310 |
{
|
@@ -320,19 +538,19 @@ You can finetune this model on your own dataset.
|
|
320 |
#### scitail-pairs-pos
|
321 |
|
322 |
* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
|
323 |
-
* Size:
|
324 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
325 |
* Approximate statistics based on the first 1000 samples:
|
326 |
-
| | sentence1
|
327 |
-
|
328 |
-
| type | string
|
329 |
-
| details | <ul><li>min:
|
330 |
* Samples:
|
331 |
-
| sentence1
|
332 |
-
|
333 |
-
| <code>
|
334 |
-
| <code>
|
335 |
-
| <code>
|
336 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
337 |
```json
|
338 |
{
|
@@ -348,19 +566,19 @@ You can finetune this model on your own dataset.
|
|
348 |
#### xsum-pairs
|
349 |
|
350 |
* Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
|
351 |
-
* Size:
|
352 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
353 |
* Approximate statistics based on the first 1000 samples:
|
354 |
-
| | sentence1
|
355 |
-
|
356 |
-
| type | string
|
357 |
-
| details | <ul><li>min:
|
358 |
* Samples:
|
359 |
-
| sentence1
|
360 |
-
|
361 |
-
| <code>
|
362 |
-
| <code>
|
363 |
-
| <code>
|
364 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
365 |
```json
|
366 |
{
|
@@ -376,7 +594,7 @@ You can finetune this model on your own dataset.
|
|
376 |
#### compression-pairs
|
377 |
|
378 |
* Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
|
379 |
-
* Size:
|
380 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
381 |
* Approximate statistics based on the first 1000 samples:
|
382 |
| | sentence1 | sentence2 |
|
@@ -394,7 +612,7 @@ You can finetune this model on your own dataset.
|
|
394 |
{
|
395 |
"loss": "MultipleNegativesSymmetricRankingLoss",
|
396 |
"n_layers_per_step": -1,
|
397 |
-
"last_layer_weight":
|
398 |
"prior_layers_weight": 0.1,
|
399 |
"kl_div_weight": 0.5,
|
400 |
"kl_temperature": 1
|
@@ -404,7 +622,7 @@ You can finetune this model on your own dataset.
|
|
404 |
#### sciq_pairs
|
405 |
|
406 |
* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
|
407 |
-
* Size:
|
408 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
409 |
* Approximate statistics based on the first 1000 samples:
|
410 |
| | sentence1 | sentence2 |
|
@@ -432,7 +650,7 @@ You can finetune this model on your own dataset.
|
|
432 |
#### qasc_pairs
|
433 |
|
434 |
* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
|
435 |
-
* Size:
|
436 |
* Columns: <code>id</code>, <code>sentence1</code>, and <code>sentence2</code>
|
437 |
* Approximate statistics based on the first 1000 samples:
|
438 |
| | id | sentence1 | sentence2 |
|
@@ -488,7 +706,7 @@ You can finetune this model on your own dataset.
|
|
488 |
#### msmarco_pairs
|
489 |
|
490 |
* Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
|
491 |
-
* Size:
|
492 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
493 |
* Approximate statistics based on the first 1000 samples:
|
494 |
| | sentence1 | sentence2 |
|
@@ -516,7 +734,7 @@ You can finetune this model on your own dataset.
|
|
516 |
#### nq_pairs
|
517 |
|
518 |
* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
|
519 |
-
* Size:
|
520 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
521 |
* Approximate statistics based on the first 1000 samples:
|
522 |
| | sentence1 | sentence2 |
|
@@ -544,7 +762,7 @@ You can finetune this model on your own dataset.
|
|
544 |
#### trivia_pairs
|
545 |
|
546 |
* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
|
547 |
-
* Size:
|
548 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
549 |
* Approximate statistics based on the first 1000 samples:
|
550 |
| | sentence1 | sentence2 |
|
@@ -572,7 +790,7 @@ You can finetune this model on your own dataset.
|
|
572 |
#### quora_pairs
|
573 |
|
574 |
* Dataset: [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
|
575 |
-
* Size:
|
576 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
577 |
* Approximate statistics based on the first 1000 samples:
|
578 |
| | sentence1 | sentence2 |
|
@@ -600,7 +818,7 @@ You can finetune this model on your own dataset.
|
|
600 |
#### gooaq_pairs
|
601 |
|
602 |
* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
|
603 |
-
* Size:
|
604 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
605 |
* Approximate statistics based on the first 1000 samples:
|
606 |
| | sentence1 | sentence2 |
|
@@ -630,13 +848,13 @@ You can finetune this model on your own dataset.
|
|
630 |
#### nli-pairs
|
631 |
|
632 |
* Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
|
633 |
-
* Size:
|
634 |
* Columns: <code>anchor</code> and <code>positive</code>
|
635 |
* Approximate statistics based on the first 1000 samples:
|
636 |
| | anchor | positive |
|
637 |
|:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
|
638 |
| type | string | string |
|
639 |
-
| details | <ul><li>min: 5 tokens</li><li>mean: 17.
|
640 |
* Samples:
|
641 |
| anchor | positive |
|
642 |
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------|
|
@@ -658,13 +876,13 @@ You can finetune this model on your own dataset.
|
|
658 |
#### scitail-pairs-pos
|
659 |
|
660 |
* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
|
661 |
-
* Size:
|
662 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
|
663 |
* Approximate statistics based on the first 1000 samples:
|
664 |
-
| | sentence1 | sentence2
|
665 |
-
|
666 |
-
| type | string | string
|
667 |
-
| details | <ul><li>min: 5 tokens</li><li>mean: 22.
|
668 |
* Samples:
|
669 |
| sentence1 | sentence2 | label |
|
670 |
|:----------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------|
|
@@ -686,13 +904,13 @@ You can finetune this model on your own dataset.
|
|
686 |
#### qnli-contrastive
|
687 |
|
688 |
* Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
|
689 |
-
* Size:
|
690 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
|
691 |
* Approximate statistics based on the first 1000 samples:
|
692 |
| | sentence1 | sentence2 | label |
|
693 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
|
694 |
| type | string | string | int |
|
695 |
-
| details | <ul><li>min: 6 tokens</li><li>mean: 14.
|
696 |
* Samples:
|
697 |
| sentence1 | sentence2 | label |
|
698 |
|:--------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
|
@@ -716,17 +934,17 @@ You can finetune this model on your own dataset.
|
|
716 |
|
717 |
- `eval_strategy`: steps
|
718 |
- `per_device_train_batch_size`: 28
|
719 |
-
- `per_device_eval_batch_size`:
|
720 |
-
- `learning_rate`:
|
721 |
- `weight_decay`: 1e-06
|
722 |
-
- `num_train_epochs`:
|
723 |
- `lr_scheduler_type`: cosine_with_restarts
|
724 |
- `lr_scheduler_kwargs`: {'num_cycles': 3}
|
725 |
-
- `warmup_ratio`: 0.
|
726 |
- `save_safetensors`: False
|
727 |
- `fp16`: True
|
728 |
- `push_to_hub`: True
|
729 |
-
- `hub_model_id`: bobox/DeBERTaV3-small-
|
730 |
- `hub_strategy`: checkpoint
|
731 |
- `batch_sampler`: no_duplicates
|
732 |
|
@@ -738,22 +956,22 @@ You can finetune this model on your own dataset.
|
|
738 |
- `eval_strategy`: steps
|
739 |
- `prediction_loss_only`: True
|
740 |
- `per_device_train_batch_size`: 28
|
741 |
-
- `per_device_eval_batch_size`:
|
742 |
- `per_gpu_train_batch_size`: None
|
743 |
- `per_gpu_eval_batch_size`: None
|
744 |
- `gradient_accumulation_steps`: 1
|
745 |
- `eval_accumulation_steps`: None
|
746 |
-
- `learning_rate`:
|
747 |
- `weight_decay`: 1e-06
|
748 |
- `adam_beta1`: 0.9
|
749 |
- `adam_beta2`: 0.999
|
750 |
- `adam_epsilon`: 1e-08
|
751 |
- `max_grad_norm`: 1.0
|
752 |
-
- `num_train_epochs`:
|
753 |
- `max_steps`: -1
|
754 |
- `lr_scheduler_type`: cosine_with_restarts
|
755 |
- `lr_scheduler_kwargs`: {'num_cycles': 3}
|
756 |
-
- `warmup_ratio`: 0.
|
757 |
- `warmup_steps`: 0
|
758 |
- `log_level`: passive
|
759 |
- `log_level_replica`: warning
|
@@ -812,7 +1030,7 @@ You can finetune this model on your own dataset.
|
|
812 |
- `use_legacy_prediction_loop`: False
|
813 |
- `push_to_hub`: True
|
814 |
- `resume_from_checkpoint`: None
|
815 |
-
- `hub_model_id`: bobox/DeBERTaV3-small-
|
816 |
- `hub_strategy`: checkpoint
|
817 |
- `hub_private_repo`: False
|
818 |
- `hub_always_push`: False
|
@@ -844,6 +1062,133 @@ You can finetune this model on your own dataset.
|
|
844 |
|
845 |
</details>
|
846 |
|
|
847 |
### Framework Versions
|
848 |
- Python: 3.10.13
|
849 |
- Sentence Transformers: 3.0.1
|
|
|
7 |
- sentence-similarity
|
8 |
- feature-extraction
|
9 |
- generated_from_trainer
|
10 |
+
- dataset_size:78183
|
11 |
- loss:AdaptiveLayerLoss
|
12 |
- loss:CoSENTLoss
|
13 |
- loss:GISTEmbedLoss
|
|
|
30 |
- sentence-transformers/trivia-qa
|
31 |
- sentence-transformers/quora-duplicates
|
32 |
- sentence-transformers/gooaq
|
33 |
+
metrics:
|
34 |
+
- pearson_cosine
|
35 |
+
- spearman_cosine
|
36 |
+
- pearson_manhattan
|
37 |
+
- spearman_manhattan
|
38 |
+
- pearson_euclidean
|
39 |
+
- spearman_euclidean
|
40 |
+
- pearson_dot
|
41 |
+
- spearman_dot
|
42 |
+
- pearson_max
|
43 |
+
- spearman_max
|
44 |
widget:
|
45 |
+
- source_sentence: The X and Y chromosomes in human beings that determine the sex
|
46 |
+
of an individual.
|
47 |
sentences:
|
48 |
+
- A glacier leaves behind bare rock when it retreats.
|
49 |
+
- Prokaryotes are unicellular organisms that lack organelles surrounded by membranes.
|
50 |
+
- Mammalian sex determination is determined genetically by the presence of chromosomes
|
51 |
+
identified by the letters x and y.
|
52 |
+
- source_sentence: Police officer with riot shield stands in front of crowd.
|
53 |
sentences:
|
54 |
+
- A police officer stands in front of a crowd.
|
55 |
- A pair of people play video games together on a couch.
|
56 |
+
- People are outside digging a hole.
|
57 |
+
- source_sentence: A young girl sitting on a white comforter on a bed covered with
|
58 |
+
clothing, holding a yellow stuffed duck.
|
59 |
sentences:
|
60 |
+
- A man standing in a room is pointing up.
|
61 |
+
- A Little girl is enjoying cake outside.
|
62 |
+
- A yellow duck being held by a girl.
|
63 |
+
- source_sentence: A teenage girl in winter clothes slides down a decline in a red
|
64 |
+
sled.
|
65 |
sentences:
|
66 |
+
- A woman preparing vegetables.
|
67 |
+
- A girl is sliding on a red sled.
|
68 |
+
- A person is on a beach.
|
69 |
+
- source_sentence: How many hymns of Luther were included in the Achtliederbuch?
|
70 |
sentences:
|
71 |
+
- the ABC News building was renamed Peter Jennings Way in 2006 in honor of the recently
|
72 |
+
deceased longtime ABC News chief anchor and anchor of World News Tonight.
|
73 |
+
- In early 2009, Disney–ABC Television Group merged ABC Entertainment and ABC Studios
|
74 |
+
into a new division, ABC Entertainment Group, which would be responsible for both
|
75 |
+
its production and broadcasting operations.
|
76 |
+
- Luther's hymns were included in early Lutheran hymnals and spread the ideas of
|
77 |
+
the Reformation.
|
78 |
pipeline_tag: sentence-similarity
|
79 |
+
model-index:
|
80 |
+
- name: SentenceTransformer based on microsoft/deberta-v3-small
|
81 |
+
results:
|
82 |
+
- task:
|
83 |
+
type: semantic-similarity
|
84 |
+
name: Semantic Similarity
|
85 |
+
dataset:
|
86 |
+
name: sts test
|
87 |
+
type: sts-test
|
88 |
+
metrics:
|
89 |
+
- type: pearson_cosine
|
90 |
+
value: 0.4121931859939639
|
91 |
+
name: Pearson Cosine
|
92 |
+
- type: spearman_cosine
|
93 |
+
value: 0.4188435395565816
|
94 |
+
name: Spearman Cosine
|
95 |
+
- type: pearson_manhattan
|
96 |
+
value: 0.43722674169112186
|
97 |
+
name: Pearson Manhattan
|
98 |
+
- type: spearman_manhattan
|
99 |
+
value: 0.4419489193187135
|
100 |
+
name: Spearman Manhattan
|
101 |
+
- type: pearson_euclidean
|
102 |
+
value: 0.4165228130620452
|
103 |
+
name: Pearson Euclidean
|
104 |
+
- type: spearman_euclidean
|
105 |
+
value: 0.42369527784158983
|
106 |
+
name: Spearman Euclidean
|
107 |
+
- type: pearson_dot
|
108 |
+
value: 0.13511926964573803
|
109 |
+
name: Pearson Dot
|
110 |
+
- type: spearman_dot
|
111 |
+
value: 0.13030376975519165
|
112 |
+
name: Spearman Dot
|
113 |
+
- type: pearson_max
|
114 |
+
value: 0.43722674169112186
|
115 |
+
name: Pearson Max
|
116 |
+
- type: spearman_max
|
117 |
+
value: 0.4419489193187135
|
118 |
+
name: Spearman Max
|
119 |
+
- type: pearson_cosine
|
120 |
+
value: 0.7746195773286169
|
121 |
+
name: Pearson Cosine
|
122 |
+
- type: spearman_cosine
|
123 |
+
value: 0.7690423402274569
|
124 |
+
name: Spearman Cosine
|
125 |
+
- type: pearson_manhattan
|
126 |
+
value: 0.7641811345210845
|
127 |
+
name: Pearson Manhattan
|
128 |
+
- type: spearman_manhattan
|
129 |
+
value: 0.754454714808573
|
130 |
+
name: Spearman Manhattan
|
131 |
+
- type: pearson_euclidean
|
132 |
+
value: 0.7621768998872902
|
133 |
+
name: Pearson Euclidean
|
134 |
+
- type: spearman_euclidean
|
135 |
+
value: 0.7522944339564277
|
136 |
+
name: Spearman Euclidean
|
137 |
+
- type: pearson_dot
|
138 |
+
value: 0.643272843908074
|
139 |
+
name: Pearson Dot
|
140 |
+
- type: spearman_dot
|
141 |
+
value: 0.6187202562345202
|
142 |
+
name: Spearman Dot
|
143 |
+
- type: pearson_max
|
144 |
+
value: 0.7746195773286169
|
145 |
+
name: Pearson Max
|
146 |
+
- type: spearman_max
|
147 |
+
value: 0.7690423402274569
|
148 |
+
name: Spearman Max
|
149 |
+
- type: pearson_cosine
|
150 |
+
value: 0.7408543477349779
|
151 |
+
name: Pearson Cosine
|
152 |
+
- type: spearman_cosine
|
153 |
+
value: 0.7193195268794856
|
154 |
+
name: Spearman Cosine
|
155 |
+
- type: pearson_manhattan
|
156 |
+
value: 0.7347205138738226
|
157 |
+
name: Pearson Manhattan
|
158 |
+
- type: spearman_manhattan
|
159 |
+
value: 0.716277121285963
|
160 |
+
name: Spearman Manhattan
|
161 |
+
- type: pearson_euclidean
|
162 |
+
value: 0.7317357204840789
|
163 |
+
name: Pearson Euclidean
|
164 |
+
- type: spearman_euclidean
|
165 |
+
value: 0.7133569462956698
|
166 |
+
name: Spearman Euclidean
|
167 |
+
- type: pearson_dot
|
168 |
+
value: 0.5412116736741877
|
169 |
+
name: Pearson Dot
|
170 |
+
- type: spearman_dot
|
171 |
+
value: 0.5324862690078268
|
172 |
+
name: Spearman Dot
|
173 |
+
- type: pearson_max
|
174 |
+
value: 0.7408543477349779
|
175 |
+
name: Pearson Max
|
176 |
+
- type: spearman_max
|
177 |
+
value: 0.7193195268794856
|
178 |
+
name: Spearman Max
|
179 |
+
- type: pearson_cosine
|
180 |
+
value: 0.7408543477349779
|
181 |
+
name: Pearson Cosine
|
182 |
+
- type: spearman_cosine
|
183 |
+
value: 0.7193195268794856
|
184 |
+
name: Spearman Cosine
|
185 |
+
- type: pearson_manhattan
|
186 |
+
value: 0.7347205138738226
|
187 |
+
name: Pearson Manhattan
|
188 |
+
- type: spearman_manhattan
|
189 |
+
value: 0.716277121285963
|
190 |
+
name: Spearman Manhattan
|
191 |
+
- type: pearson_euclidean
|
192 |
+
value: 0.7317357204840789
|
193 |
+
name: Pearson Euclidean
|
194 |
+
- type: spearman_euclidean
|
195 |
+
value: 0.7133569462956698
|
196 |
+
name: Spearman Euclidean
|
197 |
+
- type: pearson_dot
|
198 |
+
value: 0.5412116736741877
|
199 |
+
name: Pearson Dot
|
200 |
+
- type: spearman_dot
|
201 |
+
value: 0.5324862690078268
|
202 |
+
name: Spearman Dot
|
203 |
+
- type: pearson_max
|
204 |
+
value: 0.7408543477349779
|
205 |
+
name: Pearson Max
|
206 |
+
- type: spearman_max
|
207 |
+
value: 0.7193195268794856
|
208 |
+
name: Spearman Max
|
209 |
---
|
210 |
|
211 |
# SentenceTransformer based on microsoft/deberta-v3-small
|
|
|
273 |
model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")
|
274 |
# Run inference
|
275 |
sentences = [
|
276 |
+
'How many hymns of Luther were included in the Achtliederbuch?',
|
277 |
+
"Luther's hymns were included in early Lutheran hymnals and spread the ideas of the Reformation.",
|
278 |
+
'the ABC News building was renamed Peter Jennings Way in 2006 in honor of the recently deceased longtime ABC News chief anchor and anchor of World News Tonight.',
|
279 |
]
|
280 |
embeddings = model.encode(sentences)
|
281 |
print(embeddings.shape)
|
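As a follow-up to the inference snippet above, a small self-contained sketch (using the `similarity` helper available in Sentence Transformers 3.x, the version listed under Framework Versions) that ranks the two candidate sentences against the question:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")
sentences = [
    "How many hymns of Luther were included in the Achtliederbuch?",
    "Luther's hymns were included in early Lutheran hymnals and spread the ideas of the Reformation.",
    "the ABC News building was renamed Peter Jennings Way in 2006 in honor of the recently "
    "deceased longtime ABC News chief anchor and anchor of World News Tonight.",
]
embeddings = model.encode(sentences)  # shape (3, 768) for this DeBERTa-v3-small base

# Cosine similarity of the question against the two candidate passages;
# the Luther sentence should score clearly higher than the ABC News one.
scores = model.similarity(embeddings[:1], embeddings[1:])
print(scores)
```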
|
|
311 |
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
|
312 |
-->
|
313 |
|
314 |
+
## Evaluation
|
315 |
+
|
316 |
+
### Metrics
|
317 |
+
|
318 |
+
#### Semantic Similarity
|
319 |
+
* Dataset: `sts-test`
|
320 |
+
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
321 |
+
|
322 |
+
| Metric | Value |
|
323 |
+
|:--------------------|:-----------|
|
324 |
+
| pearson_cosine | 0.4122 |
|
325 |
+
| **spearman_cosine** | **0.4188** |
|
326 |
+
| pearson_manhattan | 0.4372 |
|
327 |
+
| spearman_manhattan | 0.4419 |
|
328 |
+
| pearson_euclidean | 0.4165 |
|
329 |
+
| spearman_euclidean | 0.4237 |
|
330 |
+
| pearson_dot | 0.1351 |
|
331 |
+
| spearman_dot | 0.1303 |
|
332 |
+
| pearson_max | 0.4372 |
|
333 |
+
| spearman_max | 0.4419 |
|
334 |
+
|
335 |
+
#### Semantic Similarity
|
336 |
+
* Dataset: `sts-test`
|
337 |
+
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
338 |
+
|
339 |
+
| Metric | Value |
|
340 |
+
|:--------------------|:----------|
|
341 |
+
| pearson_cosine | 0.7746 |
|
342 |
+
| **spearman_cosine** | **0.769** |
|
343 |
+
| pearson_manhattan | 0.7642 |
|
344 |
+
| spearman_manhattan | 0.7545 |
|
345 |
+
| pearson_euclidean | 0.7622 |
|
346 |
+
| spearman_euclidean | 0.7523 |
|
347 |
+
| pearson_dot | 0.6433 |
|
348 |
+
| spearman_dot | 0.6187 |
|
349 |
+
| pearson_max | 0.7746 |
|
350 |
+
| spearman_max | 0.769 |
|
351 |
+
|
352 |
+
#### Semantic Similarity
|
353 |
+
* Dataset: `sts-test`
|
354 |
+
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
355 |
+
|
356 |
+
| Metric | Value |
|
357 |
+
|:--------------------|:-----------|
|
358 |
+
| pearson_cosine | 0.7409 |
|
359 |
+
| **spearman_cosine** | **0.7193** |
|
360 |
+
| pearson_manhattan | 0.7347 |
|
361 |
+
| spearman_manhattan | 0.7163 |
|
362 |
+
| pearson_euclidean | 0.7317 |
|
363 |
+
| spearman_euclidean | 0.7134 |
|
364 |
+
| pearson_dot | 0.5412 |
|
365 |
+
| spearman_dot | 0.5325 |
|
366 |
+
| pearson_max | 0.7409 |
|
367 |
+
| spearman_max | 0.7193 |
|
368 |
+
|
369 |
+
#### Semantic Similarity
|
370 |
+
* Dataset: `sts-test`
|
371 |
+
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
372 |
+
|
373 |
+
| Metric | Value |
|
374 |
+
|:--------------------|:-----------|
|
375 |
+
| pearson_cosine | 0.7409 |
|
376 |
+
| **spearman_cosine** | **0.7193** |
|
377 |
+
| pearson_manhattan | 0.7347 |
|
378 |
+
| spearman_manhattan | 0.7163 |
|
379 |
+
| pearson_euclidean | 0.7317 |
|
380 |
+
| spearman_euclidean | 0.7134 |
|
381 |
+
| pearson_dot | 0.5412 |
|
382 |
+
| spearman_dot | 0.5325 |
|
383 |
+
| pearson_max | 0.7409 |
|
384 |
+
| spearman_max | 0.7193 |
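For reference, a hedged sketch of how figures like those in the tables above can be recomputed with the evaluator named in this section. The `sentence-transformers/stsb` test split is an assumption; the card only identifies the evaluation data as `sts-test`:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")

# Assumed evaluation data: STS benchmark test split with sentence pairs and gold scores.
stsb = load_dataset("sentence-transformers/stsb", split="test")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    name="sts-test",
)
print(evaluator(model))  # Pearson/Spearman for cosine, Euclidean, Manhattan and dot
```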
|
385 |
+
|
386 |
<!--
|
387 |
## Bias, Risks and Limitations
|
388 |
|
|
|
402 |
#### nli-pairs
|
403 |
|
404 |
* Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
|
405 |
+
* Size: 6,500 training samples
|
406 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
407 |
* Approximate statistics based on the first 1000 samples:
|
408 |
| | sentence1 | sentence2 |
|
|
|
454 |
#### vitaminc-pairs
|
455 |
|
456 |
* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
|
457 |
+
* Size: 3,194 training samples
|
458 |
* Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
|
459 |
* Approximate statistics based on the first 1000 samples:
|
460 |
+
| | label | sentence1 | sentence2 |
|
461 |
+
|:--------|:-----------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
462 |
+
| type | int | string | string |
|
463 |
+
| details | <ul><li>1: 100.00%</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.76 tokens</li><li>max: 75 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 37.3 tokens</li><li>max: 502 tokens</li></ul> |
|
464 |
* Samples:
|
465 |
+
| label | sentence1 | sentence2 |
|
466 |
+
|:---------------|:------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
467 |
+
| <code>1</code> | <code>The film will be screened in 2200 theaters .</code> | <code>In the United States and Canada , pre-release tracking suggest the film will gross $ 7–8 million from 2,200 theaters in its opening weekend , trailing fellow newcomer 10 Cloverfield Lane ( $ 25–30 million projection ) , but similar t</code> |
|
468 |
+
| <code>1</code> | <code>Neighbors 2 : Sorority Rising ( film ) scored over 65 % on Rotten Tomatoes .</code> | <code>On Rotten Tomatoes , the film has a rating of 67 % , based on 105 reviews , with an average rating of 5.9/10 .</code> |
|
469 |
+
| <code>1</code> | <code>Averaged on more than 65 reviews , The Handmaiden scored 94 % .</code> | <code>On Rotten Tomatoes , the film has a rating of 94 % , based on 67 reviews , with an average rating of 8/10 .</code> |
|
470 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
471 |
```json
|
472 |
{
|
|
|
482 |
#### qnli-contrastive
|
483 |
|
484 |
* Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
|
485 |
+
* Size: 4,000 training samples
|
486 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
|
487 |
* Approximate statistics based on the first 1000 samples:
|
488 |
+
| | sentence1 | sentence2 | label |
|
489 |
+
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
|
490 |
+
| type | string | string | int |
|
491 |
+
| details | <ul><li>min: 6 tokens</li><li>mean: 13.64 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 34.57 tokens</li><li>max: 149 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
|
492 |
* Samples:
|
493 |
+
| sentence1 | sentence2 | label |
|
494 |
+
|:-----------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
|
495 |
+
| <code>What professors established the importance of Whitehead's work?</code> | <code>Professors such as Wieman, Charles Hartshorne, Bernard Loomer, Bernard Meland, and Daniel Day Williams made Whitehead's philosophy arguably the most important intellectual thread running through the Divinity School.</code> | <code>0</code> |
|
496 |
+
| <code>When did people start living on the edge of the desert?</code> | <code>It was long believed that the region had been this way since about 1600 BCE, after shifts in the Earth's axis increased temperatures and decreased precipitation.</code> | <code>0</code> |
|
497 |
+
| <code>What was the title of Gertrude Stein's 1906-1908 book?</code> | <code>Picasso in turn was an important influence on Stein's writing.</code> | <code>0</code> |
|
498 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
499 |
```json
|
500 |
{
|
|
|
510 |
#### scitail-pairs-qa
|
511 |
|
512 |
* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
|
513 |
+
* Size: 4,300 training samples
|
514 |
* Columns: <code>sentence2</code> and <code>sentence1</code>
|
515 |
* Approximate statistics based on the first 1000 samples:
|
516 |
+
| | sentence2 | sentence1 |
|
517 |
+
|:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
518 |
+
| type | string | string |
|
519 |
+
| details | <ul><li>min: 7 tokens</li><li>mean: 16.2 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 14.65 tokens</li><li>max: 33 tokens</li></ul> |
|
520 |
* Samples:
|
521 |
+
| sentence2 | sentence1 |
|
522 |
+
|:-------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------|
|
523 |
+
| <code>Ash that enters the air naturally as a result of a volcano eruption is classified as a primary pollutant.</code> | <code>Ash that enters the air naturally as a result of a volcano eruption is classified as what kind of pollutant?</code> |
|
524 |
+
| <code>Exposure to ultraviolet radiation can increase the amount of pigment in the skin and make it appear darker.</code> | <code>Exposure to what can increase the amount of pigment in the skin and make it appear darker?</code> |
|
525 |
+
| <code>A lysozyme destroys bacteria by digesting their cell walls.</code> | <code>How does lysozyme destroy bacteria?</code> |
|
526 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
527 |
```json
|
528 |
{
|
|
|
538 |
#### scitail-pairs-pos
|
539 |
|
540 |
* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
|
541 |
+
* Size: 2,200 training samples
|
542 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
543 |
* Approximate statistics based on the first 1000 samples:
|
544 |
+
| | sentence1 | sentence2 |
|
545 |
+
|:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
546 |
+
| type | string | string |
|
547 |
+
| details | <ul><li>min: 7 tokens</li><li>mean: 23.6 tokens</li><li>max: 74 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 15.23 tokens</li><li>max: 41 tokens</li></ul> |
|
548 |
* Samples:
|
549 |
+
| sentence1 | sentence2 |
|
550 |
+
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------|
|
551 |
+
| <code>An atom that gains electrons would be a negative ion.</code> | <code>Atoms that have gained electrons and become negatively charged are called negative ions.</code> |
|
552 |
+
| <code>Scientists will use data collected during the collisions to explore the particles known as quarks and gluons that make up protons and neutrons.</code> | <code>Protons and neutrons are made of quarks, which are fundamental particles of matter.</code> |
|
553 |
+
| <code>Watersheds and divides All of the land area whose water drains into a stream system is called the system's watershed.</code> | <code>All of the land drained by a river system is called its basin, or the "wet" term watershed</code> |
|
554 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
555 |
```json
|
556 |
{
|
|
|
566 |
#### xsum-pairs
|
567 |
|
568 |
* Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
|
569 |
+
* Size: 2,500 training samples
|
570 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
571 |
* Approximate statistics based on the first 1000 samples:
|
572 |
+
| | sentence1 | sentence2 |
|
573 |
+
|:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
574 |
+
| type | string | string |
|
575 |
+
| details | <ul><li>min: 2 tokens</li><li>mean: 350.46 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 27.13 tokens</li><li>max: 70 tokens</li></ul> |
|
576 |
* Samples:
|
577 |
+
| sentence1 | sentence2 |
|
578 |
+
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
579 |
+
| <code>An eyewitness told BBC Persian that the crowds were sharply divided between hardliners and moderates, but it was clear many people had responded to a call from former President Mohammad Khatami to attend the funeral as a show of support for the opposition reform movement.<br>Some were chanting opposition slogans, and others carried placards emphasising Mr Rafsanjani's links to the moderate and reformist camps.<br>"Long live Khatami, Long Live Rouhani. Hashemi, your soul is at peace!" said one banner.<br>"The circle became too closed for the centre," said another, using a quotation from Persian poetry to underline the growing distance in recent years between Mr Rafsanjani and Iran's hardline political establishment.<br>At one stage state television played loud music over its live broadcast of the event in order to drown out opposition slogans being chanted by the crowd.<br>As the official funeral eulogies were relayed to the crowds on the streets, they responded with calls of support for former President Khatami, and opposition leader Mir Hossein Mousavi, and shouts of: "You have the loudspeakers, we have the voice! Shame on you, Shame on State TV!"<br>On Iranian social media the funeral has been the number one topic with many opposition supporters using the hashtag #weallgathered to indicate their support and sympathy.<br>People have been posting photos and videos emphasising the number of opposition supporters out on the streets and showing the opposition slogans which state TV has been trying to obscure.<br>But government supporters have also taken to Twitter to play down the opposition showing at the funeral, accusing them of political opportunism.<br>"A huge army came out of love of the Supreme Leader," wrote a cleric called Sheikh Reza. "While a few foot soldiers came with their cameras to show off."<br>Another conversation engaging many on Twitter involved the wording of the prayers used at the funeral.<br>Did the Supreme Leader Ayatollah Ali Khamenei deliberately leave out a section praising the goodness of the deceased, some opposition supporters asked. And was this a comment on the political tensions between the two?<br>"No," responded another Twitter user, cleric Abbas Zolghadri. "The words of the prayer can be changed. There are no strict rules."<br>He followed this with a poignant photo of an empty grave - "Hashemi's final resting place" was the caption, summing up the sense of loss felt by Iranians of many different political persuasions despite the deep and bitter divisions.</code> | <code>Tehran has seen some of the biggest crowds on the streets since the 2009 "Green Movement" opposition demonstrations, as an estimated 2.5 million people gathered to bid farewell to Akbar Hashemi Rafsanjani, the man universally known as "Hashemi".</code> |
|
580 |
+
| <code>Mark Evans is retracing the same route across the Rub Al Khali, also known as the "Empty Quarter", taken by Bristol pioneer Bertram Thomas in 1930.<br>The 54-year-old Shropshire-born explorer is leading a three-man team to walk the 800 mile (1,300 km) journey from Salalah, Oman to Doha, Qatar.<br>The trek is expected to take 60 days.<br>The Rub Al Khali desert is considered one of the hottest, driest and most inhospitable places on earth.<br>Nearly two decades after Thomas completed his trek, British explorer and writer Sir Wilfred Thesiger crossed the Empty Quarter - mapping it in detail along the way.<br>60 days<br>To cross the Rub' Al Khali desert<br>* From Salalah in Oman to Doha, Qatar<br>* Walking with camels for 1,300km<br>* Area nearly three times the size of the UK<br>Completed by explorer Bertram Thomas in 1930<br>Bertram Thomas, who hailed from Pill, near Bristol, received telegrams of congratulation from both King George V and Sultan Taimur, then ruler of Oman.<br>He went on to lecture all over the world about the journey and to write a book called Arabia Felix.<br>Unlike Mr Evans, Thomas did not obtain permission for his expedition.<br>He said: "The biggest challenges for Thomas were warring tribes, lack of water in the waterholes and his total dependence on his Omani companion Sheikh Saleh to negotiate their way through the desert.<br>"The biggest challenge for those who wanted to make the crossing in recent decades has been obtaining government permissions to walk through this desolate and unknown territory."</code> | <code>An explorer has embarked on a challenge to become only the third British person in history to cross the largest sand desert in the world.</code> |
|
581 |
+
| <code>An Olympic gold medallist, he was also three-time world heavyweight champion and took part in some of the most memorable fights in boxing history.<br>He had a professional career spanning 21 years and BBC Sport takes a look at his 61 fights in more detail.</code> | <code>Boxing legend Muhammad Ali, who died at the age of 74, became a sporting icon during his career.</code> |
|
582 |
* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
|
583 |
```json
|
584 |
{
|
|
|
594 |
#### compression-pairs
|
595 |
|
596 |
* Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
|
597 |
+
* Size: 4,000 training samples
|
598 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
599 |
* Approximate statistics based on the first 1000 samples:
|
600 |
| | sentence1 | sentence2 |
|
|
|
612 |
{
|
613 |
"loss": "MultipleNegativesSymmetricRankingLoss",
|
614 |
"n_layers_per_step": -1,
|
615 |
+
"last_layer_weight": 1.5,
|
616 |
"prior_layers_weight": 0.1,
|
617 |
"kl_div_weight": 0.5,
|
618 |
"kl_temperature": 1
|
|
|
622 |
#### sciq_pairs
|
623 |
|
624 |
* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
|
625 |
+
* Size: 6,500 training samples
|
626 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
627 |
* Approximate statistics based on the first 1000 samples:
|
628 |
| | sentence1 | sentence2 |
|
|
|
650 |
#### qasc_pairs
|
651 |
|
652 |
* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
|
653 |
+
* Size: 6,500 training samples
|
654 |
* Columns: <code>id</code>, <code>sentence1</code>, and <code>sentence2</code>
|
655 |
* Approximate statistics based on the first 1000 samples:
|
656 |
| | id | sentence1 | sentence2 |
|
|
|
706 |
#### msmarco_pairs
|
707 |
|
708 |
* Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
|
709 |
+
* Size: 6,500 training samples
|
710 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
711 |
* Approximate statistics based on the first 1000 samples:
|
712 |
| | sentence1 | sentence2 |
|
|
|
734 |
#### nq_pairs
|
735 |
|
736 |
* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
|
737 |
+
* Size: 6,500 training samples
|
738 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
739 |
* Approximate statistics based on the first 1000 samples:
|
740 |
| | sentence1 | sentence2 |
|
|
|
762 |
#### trivia_pairs
|
763 |
|
764 |
* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
|
765 |
+
* Size: 6,500 training samples
|
766 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
767 |
* Approximate statistics based on the first 1000 samples:
|
768 |
| | sentence1 | sentence2 |
|
|
|
790 |
#### quora_pairs
|
791 |
|
792 |
* Dataset: [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
|
793 |
+
* Size: 4,000 training samples
|
794 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
795 |
* Approximate statistics based on the first 1000 samples:
|
796 |
| | sentence1 | sentence2 |
|
|
|
818 |
#### gooaq_pairs
|
819 |
|
820 |
* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
|
821 |
+
* Size: 6,500 training samples
|
822 |
* Columns: <code>sentence1</code> and <code>sentence2</code>
|
823 |
* Approximate statistics based on the first 1000 samples:
|
824 |
| | sentence1 | sentence2 |
|
|
|
848 |
#### nli-pairs
|
849 |
|
850 |
* Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
|
851 |
+
* Size: 750 evaluation samples
|
852 |
* Columns: <code>anchor</code> and <code>positive</code>
|
853 |
* Approximate statistics based on the first 1000 samples:
|
854 |
| | anchor | positive |
|
855 |
|:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
|
856 |
| type | string | string |
|
857 |
+
| details | <ul><li>min: 5 tokens</li><li>mean: 17.61 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.71 tokens</li><li>max: 29 tokens</li></ul> |
|
858 |
* Samples:
|
859 |
| anchor | positive |
|
860 |
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------|
|
|
|
876 |
#### scitail-pairs-pos
|
877 |
|
878 |
* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
|
879 |
+
* Size: 750 evaluation samples
|
880 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
|
881 |
* Approximate statistics based on the first 1000 samples:
|
882 |
+
| | sentence1 | sentence2 | label |
|
883 |
+
|:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
|
884 |
+
| type | string | string | int |
|
885 |
+
| details | <ul><li>min: 5 tokens</li><li>mean: 22.43 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 15.3 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>0: ~50.00%</li><li>1: ~50.00%</li></ul> |
|
886 |
* Samples:
|
887 |
| sentence1 | sentence2 | label |
|
888 |
|:----------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------|
|
|
|
904 |
#### qnli-contrastive
|
905 |
|
906 |
* Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
|
907 |
+
* Size: 750 evaluation samples
|
908 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
|
909 |
* Approximate statistics based on the first 1000 samples:
|
910 |
| | sentence1 | sentence2 | label |
|
911 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
|
912 |
| type | string | string | int |
|
913 |
+
| details | <ul><li>min: 6 tokens</li><li>mean: 14.15 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 36.98 tokens</li><li>max: 225 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
|
914 |
* Samples:
|
915 |
| sentence1 | sentence2 | label |
|
916 |
|:--------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
|
|
|
934 |
|
935 |
- `eval_strategy`: steps
|
936 |
- `per_device_train_batch_size`: 28
|
937 |
+
- `per_device_eval_batch_size`: 18
|
938 |
+
- `learning_rate`: 2e-05
|
939 |
- `weight_decay`: 1e-06
|
940 |
+
- `num_train_epochs`: 2
|
941 |
- `lr_scheduler_type`: cosine_with_restarts
|
942 |
- `lr_scheduler_kwargs`: {'num_cycles': 3}
|
943 |
+
- `warmup_ratio`: 0.25
|
944 |
- `save_safetensors`: False
|
945 |
- `fp16`: True
|
946 |
- `push_to_hub`: True
|
947 |
+
- `hub_model_id`: bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-2-checkpoints-tmp
|
948 |
- `hub_strategy`: checkpoint
|
949 |
- `batch_sampler`: no_duplicates
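The non-default hyperparameters above correspond roughly to the following training-arguments sketch (illustrative only, not the author's exact script; `output_dir` is an assumption, and all other arguments keep their library defaults):

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # assumed; not listed in the card
    eval_strategy="steps",
    per_device_train_batch_size=28,
    per_device_eval_batch_size=18,
    learning_rate=2e-5,
    weight_decay=1e-6,
    num_train_epochs=2,
    lr_scheduler_type="cosine_with_restarts",
    lr_scheduler_kwargs={"num_cycles": 3},
    warmup_ratio=0.25,
    save_safetensors=False,
    fp16=True,
    push_to_hub=True,
    hub_model_id="bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-2-checkpoints-tmp",
    hub_strategy="checkpoint",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```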
|
950 |
|
|
|
956 |
- `eval_strategy`: steps
|
957 |
- `prediction_loss_only`: True
|
958 |
- `per_device_train_batch_size`: 28
|
959 |
+
- `per_device_eval_batch_size`: 18
|
960 |
- `per_gpu_train_batch_size`: None
|
961 |
- `per_gpu_eval_batch_size`: None
|
962 |
- `gradient_accumulation_steps`: 1
|
963 |
- `eval_accumulation_steps`: None
|
964 |
+
- `learning_rate`: 2e-05
|
965 |
- `weight_decay`: 1e-06
|
966 |
- `adam_beta1`: 0.9
|
967 |
- `adam_beta2`: 0.999
|
968 |
- `adam_epsilon`: 1e-08
|
969 |
- `max_grad_norm`: 1.0
|
970 |
+
- `num_train_epochs`: 2
|
971 |
- `max_steps`: -1
|
972 |
- `lr_scheduler_type`: cosine_with_restarts
|
973 |
- `lr_scheduler_kwargs`: {'num_cycles': 3}
|
974 |
+
- `warmup_ratio`: 0.25
|
975 |
- `warmup_steps`: 0
|
976 |
- `log_level`: passive
|
977 |
- `log_level_replica`: warning
|
|
|
1030 |
- `use_legacy_prediction_loop`: False
|
1031 |
- `push_to_hub`: True
|
1032 |
- `resume_from_checkpoint`: None
|
1033 |
+
- `hub_model_id`: bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-2-checkpoints-tmp
|
1034 |
- `hub_strategy`: checkpoint
|
1035 |
- `hub_private_repo`: False
|
1036 |
- `hub_always_push`: False
|
|

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch | Step | Training Loss | nli-pairs loss | qnli-contrastive loss | scitail-pairs-pos loss | sts-test_spearman_cosine |
|:------:|:----:|:-------------:|:--------------:|:---------------------:|:----------------------:|:------------------------:|
| 0 | 0 | - | - | - | - | 0.4188 |
| 0.0253 | 71 | 9.7048 | - | - | - | - |
| 0.0503 | 141 | - | 7.9860 | 8.4771 | 6.6165 | - |
| 0.0507 | 142 | 8.6743 | - | - | - | - |
| 0.0760 | 213 | 8.101 | - | - | - | - |
| 0.1006 | 282 | - | 6.8505 | 7.5583 | 4.4099 | - |
| 0.1014 | 284 | 7.5594 | - | - | - | - |
| 0.1267 | 355 | 6.3548 | - | - | - | - |
| 0.1510 | 423 | - | 5.2238 | 6.2964 | 2.3430 | - |
| 0.1520 | 426 | 5.869 | - | - | - | - |
| 0.1774 | 497 | 5.1134 | - | - | - | - |
| 0.2013 | 564 | - | 4.5785 | 5.6786 | 1.8733 | - |
| 0.2027 | 568 | 5.1262 | - | - | - | - |
| 0.2281 | 639 | 3.7625 | - | - | - | - |
| 0.2516 | 705 | - | 3.9531 | 5.1247 | 1.6374 | - |
| 0.2534 | 710 | 4.5256 | - | - | - | - |
| 0.2787 | 781 | 3.8572 | - | - | - | - |
| 0.3019 | 846 | - | 3.5362 | 4.5487 | 1.5215 | - |
| 0.3041 | 852 | 3.9294 | - | - | - | - |
| 0.3294 | 923 | 3.281 | - | - | - | - |
| 0.3522 | 987 | - | 3.1562 | 3.7942 | 1.4236 | - |
| 0.3547 | 994 | 3.2531 | - | - | - | - |
| 0.3801 | 1065 | 3.9305 | - | - | - | - |
| 0.4026 | 1128 | - | 2.7059 | 3.4370 | 1.2689 | - |
| 0.4054 | 1136 | 3.0324 | - | - | - | - |
| 0.4308 | 1207 | 3.3544 | - | - | - | - |
| 0.4529 | 1269 | - | 2.5396 | 3.0366 | 1.2415 | - |
| 0.4561 | 1278 | 3.2331 | - | - | - | - |
| 0.4814 | 1349 | 3.1913 | - | - | - | - |
| 0.5032 | 1410 | - | 2.2846 | 2.7076 | 1.1422 | - |
| 0.5068 | 1420 | 2.7389 | - | - | - | - |
| 0.5321 | 1491 | 2.9541 | - | - | - | - |
| 0.5535 | 1551 | - | 2.1732 | 2.3780 | 1.2127 | - |
| 0.5575 | 1562 | 3.0911 | - | - | - | - |
| 0.5828 | 1633 | 2.932 | - | - | - | - |
| 0.6039 | 1692 | - | 2.0257 | 1.9252 | 1.1056 | - |
| 0.6081 | 1704 | 3.082 | - | - | - | - |
| 0.6335 | 1775 | 3.0328 | - | - | - | - |
| 0.6542 | 1833 | - | 1.9588 | 2.0366 | 1.1187 | - |
| 0.6588 | 1846 | 2.9508 | - | - | - | - |
| 0.6842 | 1917 | 2.7445 | - | - | - | - |
| 0.7045 | 1974 | - | 1.8310 | 1.9980 | 1.0991 | - |
| 0.7095 | 1988 | 2.8922 | - | - | - | - |
| 0.7348 | 2059 | 2.7352 | - | - | - | - |
| 0.7548 | 2115 | - | 1.7650 | 1.5015 | 1.1103 | - |
| 0.7602 | 2130 | 3.2009 | - | - | - | - |
| 0.7855 | 2201 | 2.6261 | - | - | - | - |
| 0.8051 | 2256 | - | 1.6932 | 1.6964 | 1.0409 | - |
| 0.8108 | 2272 | 2.6623 | - | - | - | - |
| 0.8362 | 2343 | 2.8281 | - | - | - | - |
| 0.8555 | 2397 | - | 1.6844 | 1.7854 | 1.0300 | - |
| 0.8615 | 2414 | 2.3096 | - | - | - | - |
| 0.8869 | 2485 | 2.4088 | - | - | - | - |
| 0.9058 | 2538 | - | 1.6698 | 1.8310 | 1.0275 | - |
| 0.9122 | 2556 | 2.6051 | - | - | - | - |
| 0.9375 | 2627 | 2.972 | - | - | - | - |
| 0.9561 | 2679 | - | 1.6643 | 1.8173 | 1.0215 | - |
| 0.9629 | 2698 | 2.4207 | - | - | - | - |
| 0.9882 | 2769 | 2.2772 | - | - | - | - |
| 1.0064 | 2820 | - | 1.7130 | 1.7650 | 1.0496 | - |
| 1.0136 | 2840 | 2.6348 | - | - | - | - |
| 1.0389 | 2911 | 2.8271 | - | - | - | - |
| 1.0567 | 2961 | - | 1.6939 | 2.1074 | 0.9858 | - |
| 1.0642 | 2982 | 2.5215 | - | - | - | - |
| 1.0896 | 3053 | 2.7442 | - | - | - | - |
| 1.1071 | 3102 | - | 1.6633 | 1.5590 | 0.9903 | - |
| 1.1149 | 3124 | 2.6155 | - | - | - | - |
| 1.1403 | 3195 | 2.7053 | - | - | - | - |
| 1.1574 | 3243 | - | 1.6242 | 1.6429 | 0.9740 | - |
| 1.1656 | 3266 | 2.9191 | - | - | - | - |
| 1.1909 | 3337 | 2.1112 | - | - | - | - |
| 1.2077 | 3384 | - | 1.6535 | 1.6226 | 0.9516 | - |
| 1.2163 | 3408 | 2.3519 | - | - | - | - |
| 1.2416 | 3479 | 1.9416 | - | - | - | - |
| 1.2580 | 3525 | - | 1.6103 | 1.6530 | 0.9357 | - |
| 1.2670 | 3550 | 2.0859 | - | - | - | - |
| 1.2923 | 3621 | 2.0109 | - | - | - | - |
| 1.3084 | 3666 | - | 1.5773 | 1.4672 | 0.9155 | - |
| 1.3176 | 3692 | 2.366 | - | - | - | - |
| 1.3430 | 3763 | 1.5532 | - | - | - | - |
| 1.3587 | 3807 | - | 1.5514 | 1.4451 | 0.8979 | - |
| 1.3683 | 3834 | 1.9982 | - | - | - | - |
| 1.3936 | 3905 | 2.4375 | - | - | - | - |
| 1.4090 | 3948 | - | 1.5254 | 1.4050 | 0.8834 | - |
| 1.4190 | 3976 | 1.7548 | - | - | - | - |
| 1.4443 | 4047 | 2.2272 | - | - | - | - |
| 1.4593 | 4089 | - | 1.5186 | 1.3720 | 0.8835 | - |
| 1.4697 | 4118 | 2.2145 | - | - | - | - |
| 1.4950 | 4189 | 1.8696 | - | - | - | - |
| 1.5096 | 4230 | - | 1.5696 | 1.0682 | 0.9336 | - |
| 1.5203 | 4260 | 1.4926 | - | - | - | - |
| 1.5457 | 4331 | 2.1193 | - | - | - | - |
| 1.5600 | 4371 | - | 1.5469 | 0.8180 | 0.9663 | - |
| 1.5710 | 4402 | 2.0298 | - | - | - | - |
| 1.5964 | 4473 | 1.9959 | - | - | - | - |
| 1.6103 | 4512 | - | 1.4656 | 1.1725 | 0.8815 | - |
| 1.6217 | 4544 | 2.3452 | - | - | - | - |
| 1.6470 | 4615 | 1.9529 | - | - | - | - |
| 1.6606 | 4653 | - | 1.4709 | 1.1081 | 0.9079 | - |
| 1.6724 | 4686 | 1.7932 | - | - | - | - |
| 1.6977 | 4757 | 2.1881 | - | - | - | - |
| 1.7109 | 4794 | - | 1.4526 | 0.9851 | 0.9167 | - |
| 1.7231 | 4828 | 2.1128 | - | - | - | - |
| 1.7484 | 4899 | 2.4772 | - | - | - | - |
| 1.7612 | 4935 | - | 1.4204 | 0.8683 | 0.8896 | - |
| 1.7737 | 4970 | 2.4336 | - | - | - | - |
| 1.7991 | 5041 | 1.9101 | - | - | - | - |
| 1.8116 | 5076 | - | 1.3821 | 1.0420 | 0.8538 | - |
| 1.8244 | 5112 | 2.3882 | - | - | - | - |
| 1.8498 | 5183 | 2.2165 | - | - | - | - |
| 1.8619 | 5217 | - | 1.3747 | 1.0753 | 0.8580 | - |
| 1.8751 | 5254 | 1.6554 | - | - | - | - |
| 1.9004 | 5325 | 2.3828 | - | - | - | - |
| 1.9122 | 5358 | - | 1.3637 | 1.0699 | 0.8557 | - |
| 1.9258 | 5396 | 2.3499 | - | - | - | - |
| 1.9511 | 5467 | 2.3972 | - | - | - | - |
| 1.9625 | 5499 | - | 1.3583 | 1.0596 | 0.8536 | - |
| 1.9764 | 5538 | 1.931 | - | - | - | - |
| 2.0 | 5604 | - | 1.3586 | 1.0555 | 0.8543 | 0.7193 |

</details>

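The `sts-test_spearman_cosine` column above can in principle be recomputed with an `EmbeddingSimilarityEvaluator`. The minimal sketch below assumes the score comes from the test split of `sentence-transformers/stsb`, which this card does not state explicitly.

```python
# Hypothetical sketch: recompute the sts-test Spearman-cosine score.
# The dataset choice (sentence-transformers/stsb, test split) is an assumption.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")

stsb = load_dataset("sentence-transformers/stsb", split="test")
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    main_similarity=SimilarityFunction.COSINE,
    name="sts-test",
)
print(evaluator(model))  # sentence-transformers 3.x returns a dict of metrics
```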
### Framework Versions
- Python: 3.10.13
- Sentence Transformers: 3.0.1
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:52f8c3b52930fac8db5a5fe984c3799b56c40fd6ae38521081187798685d9cc5
+size 480181136
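The updated pointer records the SHA-256 digest and byte size of the new `pytorch_model.bin` blob. A downloaded copy can be checked against those values with a sketch like the following; the local file path is a placeholder.

```python
# Hypothetical check: verify a downloaded pytorch_model.bin against the
# oid/size recorded in the Git LFS pointer shown above.
import hashlib
from pathlib import Path

EXPECTED_OID = "52f8c3b52930fac8db5a5fe984c3799b56c40fd6ae38521081187798685d9cc5"
EXPECTED_SIZE = 480181136

path = Path("pytorch_model.bin")  # placeholder local path
digest = hashlib.sha256()
with path.open("rb") as fh:
    for chunk in iter(lambda: fh.read(1 << 20), b""):
        digest.update(chunk)

print("size ok:", path.stat().st_size == EXPECTED_SIZE)
print("oid ok:", digest.hexdigest() == EXPECTED_OID)
```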