new_model_3 / README.md
metadata
base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:26
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Answer: Users can contact Customer Care before confirmation to request a
      refund for offline
    sentences:
      - single order?
      - a booking?
      - MOU?
  - source_sentence: >-
      The Employee agrees to be employed on the terms and conditions set out in
      this Agreement.
    sentences:
      - What events constitute Force Majeure under this Agreement?
      - What are the specific terms and conditions of employment?
      - What is the scope of this Agreement?
  - source_sentence: >-
      The term of this Agreement shall continue until terminated by either party
      in accordance with
    sentences:
      - When does this Agreement terminate?
      - What is the term of the Agreement?
      - Can the Company make changes to the job title or duties of the Employee?
  - source_sentence: >-
      The initial job title of the Employee will be Relationship Manager. The
      initial job duties the
    sentences:
      - >-
        What remedies are available in case of a material breach of this
        Agreement?
      - >-
        What representations and warranties does the Employee make to the
        Company?
      - What are the initial job title and duties of the Employee?
  - source_sentence: >-
      The Company has employed the Employee to render services as described
      herein from the
    sentences:
      - What rules and policies must the Employee abide by?
      - What are the general obligations of the Employee?
      - When does the Company employ the Employee?
model-index:
  - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.6666666666666666
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.6666666666666666
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000004
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000002
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6666666666666666
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8769765845238192
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8333333333333334
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8333333333333334
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.6666666666666666
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.6666666666666666
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000004
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000002
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6666666666666666
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8333333333333334
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7777777777777777
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.7777777777777777
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.6666666666666666
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.6666666666666666
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000004
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000002
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6666666666666666
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8333333333333334
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7777777777777777
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.7777777777777777
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.6666666666666666
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.6666666666666666
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.6666666666666666
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2222222222222222
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000004
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000002
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6666666666666666
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.6666666666666666
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.8102255193577976
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.75
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.75
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.6666666666666666
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.6666666666666666
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.6666666666666666
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.6666666666666666
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.6666666666666666
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2222222222222222
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.13333333333333333
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.06666666666666667
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.6666666666666666
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.6666666666666666
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.6666666666666666
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.6666666666666666
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.6666666666666666
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6666666666666666
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.6969696969696969
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
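
The Pooling and Normalize modules listed above can be illustrated with a small sketch. It assumes a toy sequence of three tokens with 4-dimensional embeddings (made-up numbers, not model output) and shows the two steps the configuration names: take the [CLS] token vector (`pooling_mode_cls_token: True`), then scale it to unit length. The helper names `cls_pool` and `l2_normalize` are hypothetical and are not part of the library.

```python
import math


def cls_pool(token_embeddings):
    """Return the vector of the first ([CLS]) token of one sequence."""
    return token_embeddings[0]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Toy token embeddings (assumed values for illustration only).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 2.0, 0.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity, which matches the similarity function listed in the model description.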

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("vineet10/new_model_3")
# Run inference
sentences = [
    'The Company has employed the Employee to render services as described herein from the',
    'When does the Company employ the Employee?',
    'What are the general obligations of the Employee?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
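
The evaluation datasets in this card are named dim_768 through dim_64, which suggests the embeddings were also scored after being cut down to their leading dimensions. The card does not describe how that was done, so the following is only a sketch of the usual approach: keep the first `dim` values and re-normalize before comparing. The function names and the toy 4-dimensional vectors are assumptions for illustration. Recent Sentence Transformers releases also accept a `truncate_dim` argument when loading a model, which may be the more convenient route in practice.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy unit-length "embeddings" (made-up values, not model output).
full_a = [0.5, 0.5, 0.5, 0.5]
full_b = [0.5, 0.5, -0.5, 0.5]

small_a = truncate_and_normalize(full_a, 2)
small_b = truncate_and_normalize(full_b, 2)

print(dot(full_a, full_b))    # similarity using all 4 dimensions
print(dot(small_a, small_b))  # similarity using only the first 2 dimensions
```

The two printed values differ, which is the same effect the per-dimension metrics in the Evaluation section measure: shorter vectors are cheaper to store and compare but can rank results differently.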

Evaluation

Metrics

Information Retrieval (dataset: dim_768)

Metric Value
cosine_accuracy@1 0.6667
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6667
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6667
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.877
cosine_mrr@10 0.8333
cosine_map@100 0.8333

Information Retrieval (dataset: dim_512)

Metric Value
cosine_accuracy@1 0.6667
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6667
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6667
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8333
cosine_mrr@10 0.7778
cosine_map@100 0.7778

Information Retrieval (dataset: dim_256)

Metric Value
cosine_accuracy@1 0.6667
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6667
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6667
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8333
cosine_mrr@10 0.7778
cosine_map@100 0.7778

Information Retrieval (dataset: dim_128)

Metric Value
cosine_accuracy@1 0.6667
cosine_accuracy@3 0.6667
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.6667
cosine_precision@3 0.2222
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.6667
cosine_recall@3 0.6667
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.8102
cosine_mrr@10 0.75
cosine_map@100 0.75

Information Retrieval (dataset: dim_64)

Metric Value
cosine_accuracy@1 0.6667
cosine_accuracy@3 0.6667
cosine_accuracy@5 0.6667
cosine_accuracy@10 0.6667
cosine_precision@1 0.6667
cosine_precision@3 0.2222
cosine_precision@5 0.1333
cosine_precision@10 0.0667
cosine_recall@1 0.6667
cosine_recall@3 0.6667
cosine_recall@5 0.6667
cosine_recall@10 0.6667
cosine_ndcg@10 0.6667
cosine_mrr@10 0.6667
cosine_map@100 0.697
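
The metric values above can be reproduced with a short worked example. The card does not state the size of the evaluation set; the numbers are, however, consistent with three queries that each have exactly one relevant passage, and the ranks below are inferred from the reported values rather than taken from the source. Under that assumption, accuracy@k and recall@k are the share of queries whose relevant passage appears in the top k, precision@k divides that share by k, and MRR and NDCG reduce to averages of 1/rank and 1/log2(rank + 1). The function name `ir_metrics` is hypothetical.

```python
import math


def ir_metrics(ranks, k):
    """Metrics at cutoff k, assuming one relevant passage per query.

    `ranks` holds the 1-based rank of the relevant passage for each query.
    """
    n = len(ranks)
    hits = [r for r in ranks if r <= k]
    accuracy = len(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,
        "recall": accuracy,
        "mrr": sum(1.0 / r for r in hits) / n,
        "ndcg": sum(1.0 / math.log2(r + 1) for r in hits) / n,
    }


# Ranks inferred from the reported metrics (an assumption, not from the card).
INFERRED_RANKS = {
    "dim_768": [1, 1, 2],
    "dim_512": [1, 1, 3],
    "dim_256": [1, 1, 3],
    "dim_128": [1, 1, 4],
    "dim_64": [1, 1, 11],
}

for name, ranks in INFERRED_RANKS.items():
    at_10 = ir_metrics(ranks, 10)
    # With a single relevant passage, MAP@100 equals MRR@100.
    map_100 = ir_metrics(ranks, 100)["mrr"]
    print(name, round(at_10["mrr"], 4), round(at_10["ndcg"], 4), round(map_100, 4))
```

With only three queries, every metric moves in steps of one third, so differences between the dimension settings reflect the rank of a single passage and should be read with caution.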

Training Details

Training Dataset

Unnamed Dataset

  • Size: 26 training samples
  • Columns: context and question
  • Approximate statistics based on the first 1000 samples:
    • context (string): min 2 tokens, mean 19.15 tokens, max 28 tokens
    • question (string): min 4 tokens, mean 11.35 tokens, max 18 tokens
  • Samples:
    • context: The Employee agrees to diligently, honestly, and to the best of their abilities, perform all
      question: What are the general obligations of the Employee?
    • context: The Company has employed the Employee to render services as described herein from the
      question: When does the Company employ the Employee?
    • context: Answer: Users can report delays to Customer Care and expect an automatic refund within
      question: order?
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
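
As a rough illustration of what MultipleNegativesRankingLoss computes with these parameters, the sketch below treats each context as an anchor, its paired question as the positive, and every other question in the batch as a negative. Cosine similarities are multiplied by the scale (20.0) and passed through a cross-entropy whose correct class is the matching pair. This is a plain-Python approximation written for readability, not the library's implementation, and the 2-dimensional vectors are made up.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over scaled similarities with in-batch negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        # The positive for anchor i sits at index i.
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy embeddings (assumed values): each anchor matches its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]

loss_aligned = mnr_loss(anchors, positives)
loss_swapped = mnr_loss(anchors, positives[::-1])
print(loss_aligned, loss_swapped)
```

The aligned batch gives a loss near zero and the swapped batch a loss near the scale value, which shows how the loss rewards pairs whose similarity exceeds that of the other items in the batch.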
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
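
The non-default values listed above could be passed to the trainer roughly as follows. This is a sketch under assumptions: the dictionary simply restates the list, the `build_training_args` helper and its `output_dir` argument are hypothetical, and the original training script is not part of this card. The import is deferred so the dictionary can be inspected without the library installed.

```python
# Non-default values copied from the hyperparameter list.
NON_DEFAULT_HYPERPARAMETERS = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 1,
    "warmup_ratio": 0.1,
    "fp16": True,  # mixed precision; typically requires a CUDA GPU
    "batch_sampler": "no_duplicates",  # avoids repeated texts within a batch
}


def build_training_args(output_dir):
    """Create training arguments from the values above (hypothetical helper)."""
    # Imported lazily so this module loads even without sentence-transformers.
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(
        output_dir=output_dir,
        **NON_DEFAULT_HYPERPARAMETERS,
    )
```

The `no_duplicates` sampler matters for MultipleNegativesRankingLoss because a duplicated text inside a batch would be treated as a negative for its own pair.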

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0 0 0.75 0.7778 0.7778 0.6970 0.8333

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}