SentenceTransformer

This is a sentence-transformers model trained on the bge-full-data dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: bge-full-data

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
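
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token embeddings. The following is a minimal plain-Python sketch of that idea, not the library's implementation: the function name mean_pool and the toy 2-dimensional vectors are illustrative assumptions (the real model works on 768-dimensional tensors), and padded positions are assumed to be excluded via the attention mask.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; masked (padding) positions are ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padding vector does not affect the result because its mask entry is 0.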

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("NohTow/ModernBERT-base-DPR-fullneg-gte-0.0002")
# Run inference
sentences = [
    'Who won 23 World Rally Championships, two in particular with the Lancia Delta Group A rally car?',
    "Lancia Delta Group A The Lancia Delta Group A is a Group A rally car built for the Martini Lancia by Lancia to compete in the World Rally Championship. It is based upon the Lancia Delta road car and replaced the Lancia Delta S4. The car was introduced for the 1987 World Rally Championship season and dominated the World Rally Championship, scoring 46 WRC victories overall and winning the constructors' championship a record six times in a row from 1987 to 1992, in addition to drivers' championship titles for Juha Kankkunen (1987 and 1991) and Miki Biasion (1988 and 1989), making Lancia the most successful marque in the history of the WRC and the Delta the most successful car.",
    'Luis Moya Luis Rodríguez Moya, better known as Luis Moya (born 23 September 1960 in La Coruña, Spain) is a now-retired rally co-driver, synonymous with driver Carlos Sainz. He is the third most successful co-driver in the history of the World Rally Championship (WRC), after Daniel Elena and Timo Rautiainen',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
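
In the snippet above, the first sentence is a query and the other two are passages, and the card lists cosine similarity as the similarity function. As a rough illustration of what that score means for retrieval, here is a self-contained sketch that ranks passages for a query by cosine similarity. The helper names cosine_similarity and rank_documents and the 3-dimensional toy vectors are assumptions made for this example; in practice the vectors would come from model.encode.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Hypothetical stand-ins for embeddings of one query and three passages.
query = [1.0, 0.0, 1.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_documents(query, docs)
print(ranking)
```

Cosine similarity ignores vector length, which is why the second toy passage, a scaled copy of the query, ranks first.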

Evaluation

Metrics

Information Retrieval

  • Datasets: NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
  • Evaluated with InformationRetrievalEvaluator
Metric NanoClimateFEVER NanoDBPedia NanoFEVER NanoFiQA2018 NanoHotpotQA NanoMSMARCO NanoNFCorpus NanoNQ NanoQuoraRetrieval NanoSCIDOCS NanoArguAna NanoSciFact NanoTouche2020
cosine_accuracy@1 0.22 0.7 0.88 0.46 0.82 0.34 0.36 0.46 0.94 0.48 0.26 0.34 0.4898
cosine_accuracy@3 0.52 0.74 0.96 0.62 0.94 0.6 0.5 0.66 0.98 0.66 0.64 0.48 0.8367
cosine_accuracy@5 0.6 0.8 1.0 0.68 0.94 0.72 0.56 0.7 0.98 0.74 0.8 0.54 0.898
cosine_accuracy@10 0.64 0.86 1.0 0.74 0.96 0.82 0.62 0.78 1.0 0.86 0.9 0.6 0.9796
cosine_precision@1 0.22 0.7 0.88 0.46 0.82 0.34 0.36 0.46 0.94 0.48 0.26 0.34 0.4898
cosine_precision@3 0.2067 0.48 0.3333 0.2867 0.3867 0.2 0.2933 0.22 0.4067 0.3333 0.2133 0.18 0.5034
cosine_precision@5 0.144 0.432 0.208 0.224 0.248 0.144 0.296 0.144 0.252 0.276 0.16 0.128 0.4653
cosine_precision@10 0.084 0.376 0.108 0.134 0.132 0.082 0.22 0.084 0.136 0.202 0.09 0.07 0.3612
cosine_recall@1 0.0883 0.0726 0.8267 0.2445 0.41 0.34 0.0156 0.45 0.8173 0.1017 0.26 0.305 0.0355
cosine_recall@3 0.2667 0.1134 0.9233 0.4038 0.58 0.6 0.0349 0.61 0.9453 0.2067 0.64 0.47 0.1075
cosine_recall@5 0.3083 0.1586 0.9533 0.489 0.62 0.72 0.0641 0.66 0.956 0.2847 0.8 0.54 0.1652
cosine_recall@10 0.3567 0.2345 0.9733 0.5964 0.66 0.82 0.0797 0.75 0.9933 0.4157 0.9 0.6 0.243
cosine_ndcg@10 0.284 0.4733 0.9203 0.4901 0.67 0.5747 0.2547 0.6061 0.9594 0.3972 0.5856 0.4572 0.418
cosine_mrr@10 0.3747 0.7389 0.9267 0.5513 0.8795 0.4967 0.4444 0.5691 0.9625 0.5928 0.4839 0.4177 0.6742
cosine_map@100 0.2232 0.3348 0.8908 0.4201 0.5984 0.505 0.0923 0.5645 0.9423 0.3043 0.4893 0.4156 0.308
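
For readers unfamiliar with the metric names in the table, the sketch below computes accuracy@k, precision@k, recall@k, MRR@k and NDCG@k for a single query, assuming binary relevance. It is an illustrative re-implementation under that assumption, not the code of InformationRetrievalEvaluator; the function names and the document ids are hypothetical. The reported figures are averages of such per-query values.

```python
import math
from typing import Sequence, Set


def accuracy_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)


def mrr_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant document within the top k, else 0.0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """DCG of the ranking divided by the DCG of an ideal ranking (binary gains)."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1) if doc in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal > 0 else 0.0


# Hypothetical ranking for one query with two relevant documents.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(accuracy_at_k(ranked, relevant, 3), precision_at_k(ranked, relevant, 3),
      recall_at_k(ranked, relevant, 3), mrr_at_k(ranked, relevant, 10),
      ndcg_at_k(ranked, relevant, 10))
```

With k = 1, precision and accuracy coincide under these definitions, which matches the identical cosine_accuracy@1 and cosine_precision@1 rows in the table.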

Nano BEIR

Metric Value
cosine_accuracy@1 0.5192
cosine_accuracy@3 0.7028
cosine_accuracy@5 0.766
cosine_accuracy@10 0.8277
cosine_precision@1 0.5192
cosine_precision@3 0.311
cosine_precision@5 0.2401
cosine_precision@10 0.1599
cosine_recall@1 0.3052
cosine_recall@3 0.454
cosine_recall@5 0.5169
cosine_recall@10 0.5864
cosine_ndcg@10 0.5454
cosine_mrr@10 0.624
cosine_map@100 0.4684
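
The Nano BEIR figures appear to be the unweighted mean of the 13 per-dataset values. A quick check for cosine_ndcg@10, using the numbers copied from the per-dataset table above (the variable names are chosen for this example):

```python
# Per-dataset cosine_ndcg@10 values, copied from the Information Retrieval table.
ndcg_at_10 = {
    "NanoClimateFEVER": 0.284,
    "NanoDBPedia": 0.4733,
    "NanoFEVER": 0.9203,
    "NanoFiQA2018": 0.4901,
    "NanoHotpotQA": 0.67,
    "NanoMSMARCO": 0.5747,
    "NanoNFCorpus": 0.2547,
    "NanoNQ": 0.6061,
    "NanoQuoraRetrieval": 0.9594,
    "NanoSCIDOCS": 0.3972,
    "NanoArguAna": 0.5856,
    "NanoSciFact": 0.4572,
    "NanoTouche2020": 0.418,
}

# Unweighted mean across datasets.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))  # 0.5454
```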

Training Details

Training Dataset

bge-full-data

  • Dataset: bge-full-data at 78f5c99
  • Size: 1,770,649 training samples
  • Columns: anchor, positive, negative_0, negative_1, negative_2, negative_3, and negative_4
  • Approximate statistics based on the first 1000 samples:
    column      type    min       mean           max
    anchor      string  4 tokens  20.15 tokens   512 tokens
    positive    string  3 tokens  173.18 tokens  512 tokens
    negative_0  string  5 tokens  170.06 tokens  512 tokens
    negative_1  string  4 tokens  167.88 tokens  512 tokens
    negative_2  string  6 tokens  167.95 tokens  512 tokens
    negative_3  string  6 tokens  166.32 tokens  512 tokens
    negative_4  string  5 tokens  167.63 tokens  512 tokens
  • Samples:
    anchor positive negative_0 negative_1 negative_2 negative_3 negative_4
    What happens if you eat raw chicken? What are the dangers of eating raw chicken? Does all raw chicken have salmonella? How safe is to eat chicken during pregnancy? What meats are safe to eat raw? What are some natural obligations of chicken? Is it safe to eat raw egg?
    how long does it take for a wren egg to hatch How often does a mother Wren sit on her nest? I don't know for sure about how long Wrens usually spend on the nest at one sitting.. (Sorry couldn't resist the joke) However, the eggs usually hatch in 13-18 days, so if there were no hatchlings when that time elapsed, then you'd know for sure that she hadn't been behaving normally. - When you are trying to hatch Tennessee red quail eggs, it will take approximately 23 days. You should perform lock down on the egg at 20 days. This is a period of time whe … n there should be no disturbances because hatching is likely to begin.urkey eggs usually take 21 to 28 days to hatch depending on what they are incubated in like an incubator or by a hen. How long does it take an egg to hatch? For an average Eagle it would have a time for about 32-36 days, but the average time for an Eagle egg to hatch is about 35 days. 28 people found this useful. - When you are trying to hatch Tennessee red quail eggs, it will take approximately 23 days. You should perform lock down on the egg at 20 days. This is a period of time whe … n there should be no disturbances because hatching is likely to begin.urkey eggs usually take 21 to 28 days to hatch depending on what they are incubated in like an incubator or by a hen. It also depends on how fertile it is and how it is cared … for. - Actually this may vary depending on the kind of bird finch, the eggs hatch in between 12 - 16 days or 3 weeks.The nestlings fledge in 18 - 19 days.ctually this may vary depending on the kind of bird finch, the eggs hatch in between 12 - 16 days or 3 weeks. - Welcome, and thanks for visiting the virtual home of the Whitestown Fire Department. Whether you’re stopping by to obtain information on our department, place a comment, track our progress and events, or just looking at the great pictures of our top notch personnel in action, we hope that you find what you’re after. 
Please feel free to provide feedback or contact us for any questions you may have.
    can you have schizophrenia and bipolar Can you have both bipolar disorder and schizophrenia? Health Mental Health Can you have both bipolar disorder and schizophrenia? I'm 19 and was diagnosed with Bipolar Disorder almost 2 years ago. I also have some symptoms of schizophrenia such as auditory hallucinations and occasional visual ones as well and occasional paranoia. Ok the paranoia is pretty frequent. So yea, Can you have both of them? I know some of the symptoms can be... show more Follow 6 answers Answers Relevance Rating Newest Oldest Best Answer: yes you can, but some people with bipolar disorder have hallucinations and delusions from the bipolar disorder. only a psychiatrist could diagnose you i guess. Source (s):er nurse Zach · 9 years ago0 0 Comment Asker's rating Yes, one can have both bipolar disorder and schizophrenia, as the cause is one and the same - a spirit (ghost). Not only are the mood swings imparted by the associated spirit, but the alleged hallucinations are as well. The voices that those diagnosed as h... Dual Diagnosis: Understanding Sex Addiction With Bipolar Disorder Dual Diagnosis: Understanding Sex Addiction With Bipolar Disorder February 5, 2015 Dual Diagnosis Bipolar disorder manifests itself in one college student’s “need” to sexually expose himself on campus. Marty was diagnosed with bipolar 1 disorder in the spring of his junior year in college. The symptoms had emerged during adolescence, but it wasn’t until a particularly startling manic episode that Marty’s doctor knew his depression was more than unipolar (i.e., clinical depression by itself). The gifted art student had painted his naked body in elaborate geometric patterns and shown up at the fountain in front of his university’s grand administrative building during the middle of a sunny afternoon. He proceeded to dramatically quote Michel Foucault’s Madness and Civilization, even as he was carried away by campus security. 
The combination of SSRIs and mood stabilizers prescribed to Marty for the treatment of bipolar disor... Understanding Schizoaffective Disorder Medication Understanding Schizoaffective Disorder Medication Because schizoaffective disorder has symptoms of both psychosis and a mood disorder, ✱ doctors often prescribe different medicines to treat different symptoms of the condition. For example, they may prescribe: An antipsychotic, which helps symptoms like delusions and hallucinations A mood-stabilizing medicine, which can help level out “highs” and “lows”An antidepressant, which can help feelings of sadness, hopelessness, and difficulty with sleep and concentration One medicine for schizoaffective disorder's symptoms INVEGA SUSTENNA ® treats the symptoms of schizoaffective disorder (psychosis and mood), so it may be possible for you to manage symptoms with one medicine if your doctor feels it’s right for you. And that means one less pill to think about every day. Approved for the treatment of schizophrenia and schizoaffective disorder.✱ Please discuss your symptoms with your healthcare pro... Paranoia and schizophrenia: What you need to know Newsletter MNT - Hourly Medical News Since 2003Search Log in Newsletter MNT - Hourly Medical News Since 2003Search Login Paranoia and schizophrenia: What you need to know Last updated Thu 25 May 2017By Yvette Brazier Reviewed by Timothy J. Legg, Ph D, CRNPOverview Symptoms Causes Diagnosis Treatment Complications A person who has a condition on the schizophrenia spectrum may experience delusions and what is commonly known as paranoia. These delusions may give rise to fears that others are plotting against the individual. Everyone can have a paranoid thought from time to time. On a rough day, we may find ourselves saying "Oh boy, the whole world is out to get me!" But we recognize that this is not the case. People with paranoia often have an extensive network of paranoid thoughts and ideas. 
This can result in a disproportionate amount of time spent thinking up ways for the individual to protect themselves from their perceived persecutors... Same Genes Suspected in Both Depression and Bipolar Illness Same Genes Suspected in Both Depression and Bipolar Illness Increased Risk May Stem From Variation in Gene On/Off Switch January 28, 2010 • Science Update Protein produced by PBRM1 gene Researchers, for the first time, have pinpointed a genetic hotspot that confers risk for both bipolar disorder and depression. People with either of these mood disorders were significantly more likely to have risk versions of genes at this site than healthy controls. One of the genes, which codes for part of a cell's machinery that tells genes when to turn on and off, was also found to be over-expressed in the executive hub of bipolar patients' brains, making it a prime suspect. The results add to mounting evidence that major mental disorders overlap at the molecular level. "People who carry the risk versions may differ in some dimension of brain development that may increase risk for mood disorders later in life," explained Francis Mc Mahon, M... Schizophrenia Definition and Characteristics Schizophrenia Schizophrenia Definition and Characteristics Symptoms, Treatments and Risk Factors By Marcia Purse
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
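
To make the two loss parameters concrete, here is a plain-Python sketch of a multiple-negatives ranking objective: each anchor is scored against every positive and every hard negative in the batch with cosine similarity, the scores are multiplied by scale, and cross-entropy is taken with the anchor's own positive as the target. This is an assumption-laden illustration, not the library's code: the names cos_sim and mnr_loss and the 2-dimensional vectors are invented for the example, and the gradient caching that gives the "Cached" variant its name (it processes the batch in chunks to save memory) is not modeled.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cos_sim(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mnr_loss(anchors: List[Vector], positives: List[Vector],
             hard_negatives: List[List[Vector]], scale: float = 20.0) -> float:
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    # Candidates: every positive in the batch, then every hard negative.
    candidates = list(positives)
    for negatives in hard_negatives:
        candidates.extend(negatives)

    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (hypothetical vectors): positives match their anchors exactly.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[[-1.0, 0.0]], [[0.0, -1.0]]]
good = mnr_loss(anchors, positives, negatives)
# Swapping the positives makes each anchor prefer the wrong candidate.
bad = mnr_loss(anchors, positives[::-1], negatives)
print(good, bad)
```

A larger scale sharpens the softmax over candidates, so small differences in cosine similarity translate into larger differences in loss.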
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 2048
  • per_device_eval_batch_size: 2048
  • learning_rate: 0.0002
  • num_train_epochs: 2
  • warmup_ratio: 0.05
  • bf16: True
  • batch_sampler: no_duplicates
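
The hyperparameters above imply a rough step budget and learning-rate curve. The sketch below works through that arithmetic under explicit assumptions: NUM_DEVICES = 8 is not stated anywhere in this card (it is chosen because it is consistent with the roughly 108 steps per epoch visible in the training logs), incomplete batches are assumed to be dropped (dataloader_drop_last: True), and the schedule is a generic linear warmup followed by linear decay, matching lr_scheduler_type: linear. The function names are illustrative.

```python
import math


def steps_per_epoch(num_samples: int, per_device_batch_size: int,
                    num_devices: int, drop_last: bool = True) -> int:
    """Optimizer steps per epoch for a given global batch size."""
    global_batch = per_device_batch_size * num_devices
    if drop_last:
        return num_samples // global_batch
    return math.ceil(num_samples / global_batch)


def linear_schedule_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


NUM_DEVICES = 8  # assumption: not stated in the card

per_epoch = steps_per_epoch(1_770_649, 2048, NUM_DEVICES)
total = per_epoch * 2                # num_train_epochs: 2
warmup = math.ceil(total * 0.05)     # warmup_ratio: 0.05
print(per_epoch, total, warmup)
print(linear_schedule_lr(warmup, total, warmup, 0.0002))
```

The training log ends at step 214, slightly below this nominal total; the sketch does not try to model that difference.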

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 2048
  • per_device_eval_batch_size: 2048
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0002
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 5
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss NanoClimateFEVER_cosine_ndcg@10 NanoDBPedia_cosine_ndcg@10 NanoFEVER_cosine_ndcg@10 NanoFiQA2018_cosine_ndcg@10 NanoHotpotQA_cosine_ndcg@10 NanoMSMARCO_cosine_ndcg@10 NanoNFCorpus_cosine_ndcg@10 NanoNQ_cosine_ndcg@10 NanoQuoraRetrieval_cosine_ndcg@10 NanoSCIDOCS_cosine_ndcg@10 NanoArguAna_cosine_ndcg@10 NanoSciFact_cosine_ndcg@10 NanoTouche2020_cosine_ndcg@10 NanoBEIR_mean_cosine_ndcg@10
0.0185 2 8.9197 - - - - - - - - - - - - - -
0.0370 4 8.4814 - - - - - - - - - - - - - -
0.0556 6 6.6919 - - - - - - - - - - - - - -
0.0741 8 5.2493 - - - - - - - - - - - - - -
0.0926 10 4.2792 - - - - - - - - - - - - - -
0.1111 12 3.4554 0.2385 0.3867 0.7209 0.3194 0.5207 0.4438 0.1702 0.3732 0.8791 0.2758 0.4377 0.4026 0.4623 0.4331
0.1296 14 3.0437 - - - - - - - - - - - - - -
0.1481 16 2.6133 - - - - - - - - - - - - - -
0.1667 18 2.3395 - - - - - - - - - - - - - -
0.1852 20 2.1826 - - - - - - - - - - - - - -
0.2037 22 2.0498 - - - - - - - - - - - - - -
0.2222 24 1.9743 0.2706 0.4493 0.8104 0.4201 0.6036 0.5542 0.2249 0.5859 0.9221 0.3091 0.5671 0.5562 0.4864 0.5200
0.2407 26 1.9111 - - - - - - - - - - - - - -
0.2593 28 1.8534 - - - - - - - - - - - - - -
0.2778 30 1.8137 - - - - - - - - - - - - - -
0.2963 32 1.7587 - - - - - - - - - - - - - -
0.3148 34 1.7124 - - - - - - - - - - - - - -
0.3333 36 1.6841 0.2945 0.4652 0.8333 0.4352 0.6189 0.5619 0.2512 0.5977 0.9403 0.3322 0.5502 0.5778 0.4596 0.5321
0.3519 38 1.6765 - - - - - - - - - - - - - -
0.3704 40 1.6314 - - - - - - - - - - - - - -
0.3889 42 1.5989 - - - - - - - - - - - - - -
0.4074 44 1.592 - - - - - - - - - - - - - -
0.4259 46 1.572 - - - - - - - - - - - - - -
0.4444 48 1.5525 0.3045 0.4626 0.8526 0.4507 0.6275 0.5617 0.2575 0.5676 0.9406 0.3661 0.5666 0.5693 0.4231 0.5346
0.4630 50 1.51 - - - - - - - - - - - - - -
0.4815 52 1.5156 - - - - - - - - - - - - - -
0.5 54 1.5076 - - - - - - - - - - - - - -
0.5185 56 1.4781 - - - - - - - - - - - - - -
0.5370 58 1.4833 - - - - - - - - - - - - - -
0.5556 60 1.4576 0.3042 0.4727 0.8456 0.4578 0.6338 0.5599 0.2513 0.5883 0.9370 0.3792 0.5656 0.5229 0.4431 0.5355
0.5741 62 1.4402 - - - - - - - - - - - - - -
0.5926 64 1.438 - - - - - - - - - - - - - -
0.6111 66 1.4504 - - - - - - - - - - - - - -
0.6296 68 1.4142 - - - - - - - - - - - - - -
0.6481 70 1.4141 - - - - - - - - - - - - - -
0.6667 72 1.3917 0.3225 0.4697 0.8632 0.4529 0.6474 0.5575 0.2341 0.5942 0.9464 0.3846 0.5467 0.4924 0.4124 0.5326
0.6852 74 1.4108 - - - - - - - - - - - - - -
0.7037 76 1.4 - - - - - - - - - - - - - -
0.7222 78 1.385 - - - - - - - - - - - - - -
0.7407 80 1.3946 - - - - - - - - - - - - - -
0.7593 82 1.3762 - - - - - - - - - - - - - -
0.7778 84 1.3606 0.3325 0.4747 0.8730 0.4891 0.6511 0.5941 0.2530 0.5835 0.9452 0.3776 0.5490 0.4680 0.4447 0.5412
0.7963 86 1.3615 - - - - - - - - - - - - - -
0.8148 88 1.3811 - - - - - - - - - - - - - -
0.8333 90 1.3462 - - - - - - - - - - - - - -
0.8519 92 1.3617 - - - - - - - - - - - - - -
0.8704 94 1.3345 - - - - - - - - - - - - - -
0.8889 96 1.3291 0.3249 0.4780 0.8791 0.4925 0.6518 0.6018 0.2678 0.5981 0.9451 0.3799 0.5474 0.4423 0.4340 0.5418
0.9074 98 1.3253 - - - - - - - - - - - - - -
0.9259 100 1.3375 - - - - - - - - - - - - - -
0.9444 102 1.3177 - - - - - - - - - - - - - -
0.9630 104 1.3318 - - - - - - - - - - - - - -
0.9815 106 1.297 - - - - - - - - - - - - - -
1.0093 108 1.3128 0.3211 0.4761 0.8869 0.4904 0.6531 0.5906 0.2660 0.6035 0.9473 0.3810 0.5749 0.4420 0.4286 0.5432
1.0278 110 1.3088 - - - - - - - - - - - - - -
1.0463 112 1.3071 - - - - - - - - - - - - - -
1.0648 114 1.2936 - - - - - - - - - - - - - -
1.0833 116 1.2839 - - - - - - - - - - - - - -
1.1019 118 1.2693 - - - - - - - - - - - - - -
1.1204 120 1.291 0.3022 0.4793 0.8822 0.5117 0.6691 0.5708 0.2637 0.6140 0.9521 0.3913 0.5773 0.4487 0.4281 0.5454
1.1389 122 1.2636 - - - - - - - - - - - - - -
1.1574 124 1.2427 - - - - - - - - - - - - - -
1.1759 126 1.2167 - - - - - - - - - - - - - -
1.1944 128 1.202 - - - - - - - - - - - - - -
1.2130 130 1.1931 - - - - - - - - - - - - - -
1.2315 132 1.178 0.2842 0.4731 0.8755 0.5114 0.6814 0.5611 0.2731 0.6122 0.9477 0.3926 0.5723 0.4647 0.4441 0.5457
1.25 134 1.1955 - - - - - - - - - - - - - -
1.2685 136 1.18 - - - - - - - - - - - - - -
1.2870 138 1.1771 - - - - - - - - - - - - - -
1.3056 140 1.173 - - - - - - - - - - - - - -
1.3241 142 1.141 - - - - - - - - - - - - - -
1.3426 144 1.1531 0.2816 0.4822 0.9067 0.5164 0.6609 0.5758 0.2713 0.6295 0.9596 0.4018 0.5862 0.4615 0.4309 0.5511
1.3611 146 1.1608 - - - - - - - - - - - - - -
1.3796 148 1.1489 - - - - - - - - - - - - - -
1.3981 150 1.1531 - - - - - - - - - - - - - -
1.4167 152 1.1391 - - - - - - - - - - - - - -
1.4352 154 1.1405 - - - - - - - - - - - - - -
1.4537 156 1.1336 0.3180 0.4810 0.8891 0.5077 0.6655 0.5609 0.2797 0.5979 0.9557 0.3988 0.6011 0.5093 0.4176 0.5525
1.4722 158 1.1165 - - - - - - - - - - - - - -
1.4907 160 1.1316 - - - - - - - - - - - - - -
1.5093 162 1.1328 - - - - - - - - - - - - - -
1.5278 164 1.1229 - - - - - - - - - - - - - -
1.5463 166 1.1312 - - - - - - - - - - - - - -
1.5648 168 1.1112 0.2801 0.4865 0.9104 0.5040 0.6631 0.5666 0.2847 0.6059 0.9599 0.4003 0.5906 0.4927 0.4312 0.5520
1.5833 170 1.1304 - - - - - - - - - - - - - -
1.6019 172 1.1257 - - - - - - - - - - - - - -
1.6204 174 1.139 - - - - - - - - - - - - - -
1.6389 176 1.1116 - - - - - - - - - - - - - -
1.6574 178 1.1161 - - - - - - - - - - - - - -
1.6759 180 1.1024 0.2991 0.4822 0.9009 0.4886 0.6652 0.5659 0.2577 0.6147 0.9597 0.4051 0.5747 0.4585 0.4207 0.5456
1.6944 182 1.1239 - - - - - - - - - - - - - -
1.7130 184 1.1266 - - - - - - - - - - - - - -
1.7315 186 1.1154 - - - - - - - - - - - - - -
1.75 188 1.1382 - - - - - - - - - - - - - -
1.7685 190 1.102 - - - - - - - - - - - - - -
1.7870 192 1.1046 0.3107 0.4764 0.9040 0.4828 0.6680 0.5747 0.2625 0.5969 0.9567 0.3948 0.5801 0.4641 0.4313 0.5464
1.8056 194 1.1241 - - - - - - - - - - - - - -
1.8241 196 1.1266 - - - - - - - - - - - - - -
1.8426 198 1.1257 - - - - - - - - - - - - - -
1.8611 200 1.1148 - - - - - - - - - - - - - -
1.8796 202 1.1133 - - - - - - - - - - - - - -
1.8981 204 1.1149 0.2840 0.4733 0.9203 0.4901 0.6700 0.5747 0.2547 0.6061 0.9594 0.3972 0.5856 0.4572 0.4180 0.5454
1.9167 206 1.1122 - - - - - - - - - - - - - -
1.9352 208 1.1259 - - - - - - - - - - - - - -
1.9537 210 1.1215 - - - - - - - - - - - - - -
1.9722 212 1.1047 - - - - - - - - - - - - - -
1.9907 214 1.1166 - - - - - - - - - - - - - -

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0.dev0
  • PyTorch: 2.6.0.dev20241112+cu121
  • Accelerate: 1.2.1
  • Datasets: 2.21.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model size: 149M params (Safetensors, tensor type F32)