---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:4893
  - loss:TripletLoss
base_model: distilbert/distilroberta-base
widget:
  - source_sentence: >-
      Leave me alone! Have you gone daft? Mister Spock needs me! Let go! That
      will be quite enough. Thank you, doctor.; Please, release her.[SEP]What's
      this all about?
    sentences:
      - ' You know, the lab here, they have a paid intern position. It''s usually given to one of the kids from the universities but, if you want, I could pRobably get you an interview. There''s some entry lEvel stuff, some gofer work. But you''d also have access to a lot of cool things.'
      - >-
        She was doing as I requested, Mister Scott. A Vulcan form of
        self-healing.
      - >-
        Thasians have been referred to in our records as having the power to
        transmute objects or render substances invisible. It has generally been
        regarded as legend, but Charlie does seems to possess this same power.
  - source_sentence: >-
      Why would you do this? Because the needs of the one ...outweigh the needs
      of the many. I have been ...and ever shall be ...your friend. Yes! Yes,
      Spock. The ship. ...Out of danger?[SEP]You saved the ship, ...You saved us
      all. Don't you remember?
    sentences:
      - ' My wife had taken a sleeping pill and gone to bed. It was Christmas Eve. Kyle popped corn in the fireplace. He Managed to knock loose some tinder. Wrapping paper caught on fire. Spread so fast. I got Kyle outta there. When I went back in for... [Chokes, takes a beat, then.]'
      - >-
        In two days, you'll have your own hands, Thalassa. Mechanically
        efficient and quite human-looking. Android robot hands, of course. Hands
        without feeling. Enjoy the taste of life while you can.
      - Jim, ...your name is Jim.
  - source_sentence: >-
      Captain, if something hasn't worked out and therefore has no scientific
      fact Shall we leave it up to the doctor? Since you brought me down here
      for advice, Captain One of the advantages of being a Captain, Doctor, is
      being able to ask for advice without necessarily having to take it. I
      think I'll have to award that round to the Captain, Helen. You're fighting
      over your weight. All right, let's take a look.[SEP]I'm not a criminal! I
      do not require neural neutraliser.
    sentences:
      - Neural neutraliser. Can you explain that, Doctor Van Gelder?
      - ' And the disorientation?'
      - I'm aware of these facts. Please get on with the job. Computer.
  - source_sentence: >-
      We're picking up an object, sir. Much larger, coming toward us. Coming.
      Exceptionally strong contact. Not visual yet. Distant spectrograph.
      Metallic, similar to cube. Much greater energy reading. There, sir. Half
      speed. Prepare for evasive action.[SEP]Reducing to warp two, sir.
    sentences:
      - Tractor beam, Captain. Something's grabbed us, hard.
      - Exactly.
      - ' There''s a blockage in the urinary tract. Simple terms, your baby can''t pee. His bladder is swollen and it''s crushing his lungs.'
  - source_sentence: >-
      My father says you have been my friend. ...You came back for me. You would
      have done the same for me. Why would you do this? Because the needs of the
      one ...outweigh the needs of the many. I have been ...and ever shall be
      ...your friend.[SEP]Yes! Yes, Spock.
    sentences:
      - But a defensible entrance, Captain.
      - ' No, blood tests were all normal. And he clotted in six minutes.'
      - The ship. ...Out of danger?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on distilbert/distilroberta-base
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: evaluator enc
          type: evaluator_enc
        metrics:
          - type: cosine_accuracy
            value: 0.9989781379699707
            name: Cosine Accuracy
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: evaluator val
          type: evaluator_val
        metrics:
          - type: cosine_accuracy
            value: 0.9872685074806213
            name: Cosine Accuracy
---

# SentenceTransformer based on distilbert/distilroberta-base

This is a sentence-transformers model finetuned from distilbert/distilroberta-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

  • Model Type: Sentence Transformer
  • Base model: distilbert/distilroberta-base
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
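
These properties can be checked directly on the loaded model; a quick sketch:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("greatakela/gnlp_hw1_encoder")
print(model.max_seq_length)                      # 128
print(model.get_sentence_embedding_dimension())  # 768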

### Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
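
The Pooling module above averages token embeddings, weighted by the attention mask, to produce one 768-dimensional vector per input. A minimal sketch of what that step does, assuming (as for most Sentence Transformers checkpoints) the underlying transformer also loads with plain transformers:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the underlying RobertaModel and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained("greatakela/gnlp_hw1_encoder")
encoder = AutoModel.from_pretrained("greatakela/gnlp_hw1_encoder")

batch = tokenizer(
    ["The ship. ...Out of danger?"],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # [batch, seq, 768]

# Mean pooling: sum the embeddings of real (non-padding) tokens,
# then divide by the number of real tokens.
mask = batch["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(sentence_embedding.shape)  # torch.Size([1, 768])
```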

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("greatakela/gnlp_hw1_encoder")
# Run inference
sentences = [
    'My father says you have been my friend. ...You came back for me. You would have done the same for me. Why would you do this? Because the needs of the one ...outweigh the needs of the many. I have been ...and ever shall be ...your friend.[SEP]Yes! Yes, Spock.',
    'The ship. ...Out of danger?',
    ' No, blood tests were all normal. And he clotted in six minutes.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
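
The training anchors are dialogue histories with the most recent utterance appended after a [SEP] marker, so one natural application is ranking candidate replies against a context. A minimal sketch (the context and candidates below are lines taken from the examples above):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("greatakela/gnlp_hw1_encoder")

# Context in the same "history[SEP]last utterance" format used during training.
context = "I have been ...and ever shall be ...your friend.[SEP]Yes! Yes, Spock."
candidates = [
    "The ship. ...Out of danger?",
    "But a defensible entrance, Captain.",
    " No, blood tests were all normal. And he clotted in six minutes.",
]

context_emb = model.encode([context])
candidate_embs = model.encode(candidates)

# model.similarity uses this model's configured similarity function (cosine).
scores = model.similarity(context_emb, candidate_embs)[0]
best = int(scores.argmax())
print(f"best reply: {candidates[best]!r} (score={float(scores[best]):.3f})")
```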

## Evaluation

### Metrics

#### Triplet

| Metric          | evaluator_enc | evaluator_val |
|:----------------|--------------:|--------------:|
| cosine_accuracy |         0.999 |        0.9873 |
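
cosine_accuracy is the fraction of held-out triplets for which the anchor embedding is more cosine-similar to its positive than to its negative; the library's TripletEvaluator reports this statistic. A minimal sketch of the computation, assuming parallel lists of anchor/positive/negative strings:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("greatakela/gnlp_hw1_encoder")

def triplet_cosine_accuracy(anchors, positives, negatives):
    # Normalized embeddings make the dot product equal to cosine similarity.
    a = model.encode(anchors, normalize_embeddings=True)
    p = model.encode(positives, normalize_embeddings=True)
    n = model.encode(negatives, normalize_embeddings=True)
    pos_sim = (a * p).sum(axis=1)
    neg_sim = (a * n).sum(axis=1)
    # A triplet counts as correct when the positive outscores the negative.
    return float(np.mean(pos_sim > neg_sim))
```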

## Training Details

### Training Dataset

#### Unnamed Dataset

  • Size: 4,893 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:

  |         | sentence_0                                         | sentence_1                                        | sentence_2                                         |
  |:--------|:---------------------------------------------------|:--------------------------------------------------|:---------------------------------------------------|
  | type    | string                                             | string                                            | string                                             |
  | details | min: 2 tokens, mean: 83.38 tokens, max: 128 tokens | min: 4 tokens, mean: 18.38 tokens, max: 91 tokens | min: 4 tokens, mean: 18.48 tokens, max: 102 tokens |

  • Samples:

  | sentence_0 | sentence_1 | sentence_2 |
  |:-----------|:-----------|:-----------|
  | The usage is correct. The creator was simply testing your memory banks. There was much damage in the accident. Mister Singh. Come here a moment. This unit will see to your needs. Sir? I'll be back in a moment. Gentlemen, come with me.[SEP]You're on to something, Spock. What is it? | I've correlated all the available information on the Nomad probe, and I'm convinced that this object is indeed that probe. | DIC would explain both the! |
  | Mister Spock, how many people are on Memory Alpha? It varies with the number of scholars, researchers, and scientists from the various Federation planets who are using the computer complex. Captain, we are within orbit range. Lock into orbit. Aye, sir.[SEP]It is leaving Memory Alpha, Captain. | Sensors give no readings of generated energy from Memory Alpha, Captain. | Weird huh? |
  | We're guiding around most of the time ripples now. Mister Spock? All plotted but one, Captain. Coming up on it now. Seems to be fairly heavy displacement. Bones! Get back to your positions. The hypo, Captain.[SEP]It was set for cordrazine. | Empty. | Actually he's only in the Navy when they sang, In The Navy. The rest of the time he's just in generic fatigues. [House stares at him.] What? You brought it up! [House starts to walk out.] You didn't flush. |

  • Loss: TripletLoss with these parameters:

  ```json
  {
      "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
      "triplet_margin": 5
  }
  ```
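
With Euclidean distance d and margin m = 5, the loss per triplet (a, p, n) is max(d(a, p) - d(a, n) + m, 0): anchors are pulled toward their positives and pushed away from their negatives until the gap exceeds the margin. A sketch of constructing the same loss with the library:

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("distilbert/distilroberta-base")

# Same configuration as this card: Euclidean distance, margin 5.
loss = losses.TripletLoss(
    model=model,
    distance_metric=losses.TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)
```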

### Training Hyperparameters

#### Non-Default Hyperparameters

  • eval_strategy: steps
  • multi_dataset_batch_sampler: round_robin

#### All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
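
Taken together, a minimal sketch of the training run these settings describe, using the Sentence Transformers v3 trainer API. The toy dataset is a hypothetical stand-in for the unnamed training data, and eval_steps=300 is an assumption inferred from the evaluation steps in the logs below:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("distilbert/distilroberta-base")

# Hypothetical stand-in for the unnamed (anchor, positive, negative) dataset.
train_dataset = Dataset.from_dict({
    "sentence_0": ["some dialogue history[SEP]a last utterance"],
    "sentence_1": ["the reply that actually follows"],
    "sentence_2": ["an unrelated line"],
})

loss = losses.TripletLoss(
    model, distance_metric=losses.TripletDistanceMetric.EUCLIDEAN, triplet_margin=5
)

args = SentenceTransformerTrainingArguments(
    output_dir="gnlp_hw1_encoder",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=5e-5,
    eval_strategy="steps",
    eval_steps=300,  # assumption, inferred from the training logs
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; the card used separate evaluators
    loss=loss,
)
trainer.train()
```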

### Training Logs

| Epoch  | Step | Training Loss | evaluator_enc_cosine_accuracy | evaluator_val_cosine_accuracy |
|:-------|-----:|--------------:|------------------------------:|------------------------------:|
| -1     | -1   | -             | 0.5866                        | -                              |
| 0.4902 | 300  | -             | 0.9875                        | -                              |
| 0.8170 | 500  | 1.085         | -                             | -                              |
| 0.9804 | 600  | -             | 0.9935                        | -                              |
| 1.0    | 612  | -             | 0.9937                        | -                              |
| 1.4706 | 900  | -             | 0.9967                        | -                              |
| 1.6340 | 1000 | 0.1573        | -                             | -                              |
| 1.9608 | 1200 | -             | 0.9980                        | -                              |
| 2.0    | 1224 | -             | 0.9980                        | -                              |
| 2.4510 | 1500 | 0.0733        | 0.9990                        | -                              |
| 2.9412 | 1800 | -             | 0.9990                        | -                              |
| 3.0    | 1836 | -             | 0.9990                        | -                              |
| -1     | -1   | -             | -                             | 0.9873                         |

### Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### TripletLoss

```bibtex
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```