---
base_model: sentence-transformers/all-mpnet-base-v2
datasets: []
language: []
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:17093
- loss:CosineSimilarityLoss
widget:
- source_sentence: In the realm of genetics , it is far better to be safe than sorry .
  sentences:
  - Marijuana use harms the brain, and legalization will increase mental health problems.
  - We are god now !
  - Likewise , the proposal that addictive drugs should be legalized , regulated and opened to " free market dynamics " is immediately belied by the recognition that the drug market for an addict is no longer a free market – it is clear that they will pay any price when needing their drug .
- source_sentence: The worldwide anti-nuclear power movement has provided enormous stimulation to the Australian movement , and the decline in nuclear power expansion since the late 1970s - due substantially to worldwide citizen opposition - has been a great setback for Australian uranium mining interests .
  sentences:
  - Just as the state has the authority ( and duty ) to act justly in allocating scarce resources , in meeting minimal needs of its ( deserving ) citizens , in defending its citizens from violence and crime , and in not waging unjust wars ; so too does it have the authority , flowing from its mission to promote justice and the good of its people , to punish the criminal .
  - The long lead times for construction that invalidate nuclear power as a way of mitigating climate change was a point recognized in 2009 by the body whose mission is to promote the use of nuclear power , the International Atomic Energy Agency ( IAEA ) .
  - Gun control laws would reduce the societal costs associated with gun violence.
- source_sentence: Requiring uniforms enhances school security by permitting identification of non-students who try to enter the campus .
  sentences:
  - Many students who are against school uniforms argue that they lose their self identity when they lose their right to express themselves through fashion .
  - If reproductive cloning is perfected , a quadriplegic can also choose to have himself cloned , so someone can take his place .
  - A higher minimum wage might also decrease turnover and thus keep training costs down , supporters say .
- source_sentence: Minimum wage has long been a minimum standard of living .
  sentences:
  - A minimum wage job is suppose to be an entry level stepping stone – not a career goal .
  - It is argued that just as it would be permissible to " unplug " and thereby cause the death of the person who is using one 's kidneys , so it is permissible to abort the fetus ( who similarly , it is said , has no right to use one 's body 's life-support functions against one 's will ) .
  - Abortion reduces welfare costs to taxpayers .
- source_sentence: Fanatics of the pro – life argument are sometimes so focused on the fetus that they put no value to the mother ’s life and do not even consider the viability of the fetus .
  sentences:
  - Life is life , whether it s outside the womb or not .
  - Legalization of marijuana is phasing out black markets and taking money away from drug cartels, organized crime, and street gangs.
  - 'Response 2 : A child is not replaceable .'
model-index:
- name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test
      type: sts-test
    metrics:
    - type: pearson_cosine
      value: 0.7294675022492696
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7234943835496113
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7104391963353577
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7118078150763045
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7212412855224142
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7234943835496113
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7294674862347428
      name: Pearson Dot
    - type: spearman_dot
      value: 0.7234943835496113
      name: Spearman Dot
    - type: pearson_max
      value: 0.7294675022492696
      name: Pearson Max
    - type: spearman_max
      value: 0.7234943835496113
      name: Spearman Max
    - type: pearson_cosine
      value: 0.7146126101962849
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.6886131469202397
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7069653659670995
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.6837201725651982
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7115078495768724
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.6886131469202397
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7146126206763159
      name: Pearson Dot
    - type: spearman_dot
      value: 0.6886131469202397
      name: Spearman Dot
    - type: pearson_max
      value: 0.7146126206763159
      name: Pearson Max
    - type: spearman_max
      value: 0.6886131469202397
      name: Spearman Max
---

# SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2).
It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("armaniii/all-mpnet-base-v2-augmentation-indomain-bm25-sts")

# Run inference
sentences = [
    'Fanatics of the pro – life argument are sometimes so focused on the fetus that they put no value to the mother ’s life and do not even consider the viability of the fetus .',
    'Life is life , whether it s outside the womb or not .',
    'Legalization of marijuana is phasing out black markets and taking money away from drug cartels, organized crime, and street gangs.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```

## Evaluation

### Metrics

#### Semantic Similarity

* Dataset: `sts-test`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7295     |
| **spearman_cosine** | **0.7235** |
| pearson_manhattan   | 0.7104     |
| spearman_manhattan  | 0.7118     |
| pearson_euclidean   | 0.7212     |
| spearman_euclidean  | 0.7235     |
| pearson_dot         | 0.7295     |
| spearman_dot        | 0.7235     |
| pearson_max         | 0.7295     |
| spearman_max        | 0.7235     |

#### Semantic Similarity

* Dataset: `sts-test`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7146     |
| **spearman_cosine** | **0.6886** |
| pearson_manhattan   | 0.7070     |
| spearman_manhattan  | 0.6837     |
| pearson_euclidean   | 0.7115     |
| spearman_euclidean  | 0.6886     |
| pearson_dot         | 0.7146     |
| spearman_dot        | 0.6886     |
| pearson_max         | 0.7146     |
| spearman_max        | 0.6886     |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 17,093 training samples
* Columns: sentence1, sentence2, and score
* Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|:--------|:----------|:----------|:------|
| type | string | string | float |
| details | | | |

* Samples:

| sentence1 | sentence2 | score |
|:----------|:----------|:------|
| It is true that a Colorado study found a post-legalization increase in youths being treated for marijuana exposure . | In Colorado , recent figures correlate with the years since marijuana legalization to show a dramatic decrease in overall highway fatalities – and a two-fold increase in the frequency of marijuana-positive drivers in fatal auto crashes . | 0.4642857142857143 |
| The idea of a school uniform is that students wear the uniform at school , but do not wear the uniform , say , at a disco or other events outside school . | If it means that the schoolrooms will be more orderly , more disciplined , and that our young people will learn to evaluate themselves by what they are on the inside instead of what they 're wearing on the outside , then our public schools should be able to require their students to wear school uniforms . " | 0.5714285714285714 |
| The resulting embryonic stem cells could then theoretically be grown into adult cells to replace the ailing person 's mutated cells . | However , there is a more serious , less cartoonish objection to turning procreation into manufacturing . | 0.4464285714285714 |

* Loss: [CosineSimilarityLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```

### Evaluation Dataset

#### Unnamed Dataset

* Size: 340 evaluation samples
* Columns: sentence1, sentence2, and score
* Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|:--------|:----------|:----------|:------|
| type | string | string | float |
| details | | | |

* Samples:

| sentence1 | sentence2 | score |
|:----------|:----------|:------|
| [ quoting himself from Furman v. Georgia , 408 U.S. 238 , 257 ( 1972 ) ] As such it is a penalty that ' subjects the individual to a fate forbidden by the principle of civilized treatment guaranteed by the [ Clause ] . ' | It provides a deterrent for prisoners already serving a life sentence . | 0.3214285714285714 |
| Of those savings , $ 25.7 billion would accrue to state and local governments , while $ 15.6 billion would accrue to the federal government . | Jaime Smith , deputy communications director for the governor ’s office , said , “ The legalization initiative was not driven by a desire for a revenue , but it has provided a small assist for our state budget . ” | 0.5357142857142857 |
| If the uterus is designed to sustain an unborn child ’s life , do n’t unborn children have a right to receive nutrition and shelter through the one organ designed to provide them with that ordinary care ? | We as parents are supposed to protect our children at all costs whether they are in the womb or not . | 0.7678571428571428 |

* Loss: [CosineSimilarityLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `warmup_ratio`: 0.1
- `bf16`: True

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
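
The hyperparameters above imply a specific step count and learning-rate curve. The sketch below works through that arithmetic in plain Python; it is an illustration, not the trainer's own code. It assumes a single training device (the card lists `local_rank`: 0 but does not state the device count) and that `lr_scheduler_type`: linear means linear warmup followed by linear decay to zero. The function names `total_training_steps` and `linear_warmup_lr` are hypothetical helpers introduced here.

```python
import math

# Values copied from this card.
NUM_TRAIN_SAMPLES = 17_093
BATCH_SIZE = 16          # per_device_train_batch_size
NUM_EPOCHS = 3           # num_train_epochs
PEAK_LR = 5e-5           # learning_rate
WARMUP_RATIO = 0.1       # warmup_ratio


def total_training_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps, assuming one device, gradient_accumulation_steps=1,
    and dataloader_drop_last=False (the last partial batch is kept)."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


def linear_warmup_lr(step: int, total_steps: int, peak_lr: float, warmup_ratio: float) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total = total_training_steps(NUM_TRAIN_SAMPLES, BATCH_SIZE, NUM_EPOCHS)
warmup = math.ceil(total * WARMUP_RATIO)
print(total)   # 3207
print(warmup)  # 321

for step in (0, 100, warmup, 1600, total):
    print(step, linear_warmup_lr(step, total, PEAK_LR, WARMUP_RATIO))
```

Under these assumptions the computed total of 3207 steps agrees with the final step reported in the training logs.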

### Training Logs

| Epoch  | Step | Training Loss | Validation Loss | sts-test_spearman_cosine |
|:------:|:----:|:-------------:|:---------------:|:------------------------:|
| 0.0935 | 100  | 0.0151        | 0.0098          | 0.7013                   |
| 0.1871 | 200  | 0.0069        | 0.0112          | 0.6857                   |
| 0.2806 | 300  | 0.0058        | 0.0106          | 0.6860                   |
| 0.3742 | 400  | 0.0059        | 0.0102          | 0.6915                   |
| 0.4677 | 500  | 0.0057        | 0.0097          | 0.6903                   |
| 0.5613 | 600  | 0.0049        | 0.0100          | 0.6797                   |
| 0.6548 | 700  | 0.0055        | 0.0101          | 0.6766                   |
| 0.7484 | 800  | 0.0049        | 0.0116          | 0.6529                   |
| 0.8419 | 900  | 0.0049        | 0.0105          | 0.6572                   |
| 0.9355 | 1000 | 0.0051        | 0.0115          | 0.6842                   |
| 1.0290 | 1100 | 0.0038        | 0.0094          | 0.7000                   |
| 1.1225 | 1200 | 0.0029        | 0.0091          | 0.7027                   |
| 1.2161 | 1300 | 0.0026        | 0.0093          | 0.7016                   |
| 1.3096 | 1400 | 0.0027        | 0.0088          | 0.7192                   |
| 1.4032 | 1500 | 0.0027        | 0.0097          | 0.7065                   |
| 1.4967 | 1600 | 0.0028        | 0.0091          | 0.7011                   |
| 1.5903 | 1700 | 0.0027        | 0.0095          | 0.7186                   |
| 1.6838 | 1800 | 0.0026        | 0.0087          | 0.7277                   |
| 1.7774 | 1900 | 0.0024        | 0.0085          | 0.7227                   |
| 1.8709 | 2000 | 0.0025        | 0.0086          | 0.7179                   |
| 1.9645 | 2100 | 0.0022        | 0.0086          | 0.7195                   |
| 2.0580 | 2200 | 0.0017        | 0.0088          | 0.7183                   |
| 2.1515 | 2300 | 0.0014        | 0.0088          | 0.7229                   |
| 2.2451 | 2400 | 0.0014        | 0.0086          | 0.7200                   |
| 2.3386 | 2500 | 0.0013        | 0.0088          | 0.7248                   |
| 2.4322 | 2600 | 0.0014        | 0.0085          | 0.7286                   |
| 2.5257 | 2700 | 0.0015        | 0.0085          | 0.7283                   |
| 2.6193 | 2800 | 0.0014        | 0.0085          | 0.7263                   |
| 2.7128 | 2900 | 0.0014        | 0.0085          | 0.7248                   |
| 2.8064 | 3000 | 0.0013        | 0.0087          | 0.7191                   |
| 2.8999 | 3100 | 0.0011        | 0.0086          | 0.7225                   |
| 2.9935 | 3200 | 0.0012        | 0.0085          | 0.7235                   |
| 3.0    | 3207 | -             | -               | 0.6886                   |

### Framework Versions

- Python: 3.9.2
- Sentence Transformers: 3.0.1
- Transformers: 4.43.1
- PyTorch: 2.3.1+cu121
- Accelerate: 0.34.2
- Datasets: 2.14.7
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```