SentenceTransformer based on nomic-ai/modernbert-embed-base

This is a sentence-transformers model fine-tuned from nomic-ai/modernbert-embed-base on the touch-rugby-modernbert-pairs dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/modernbert-embed-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Training Dataset: touch-rugby-modernbert-pairs

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
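
The Pooling and Normalize stages listed above can be illustrated with a small, library-free sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration suggests; the function names `mean_pool` and `l2_normalize` and the toy vectors are hypothetical and are not part of the model's code.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input: three 2-dimensional token vectors; the last one is padding.
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
# [0.6, 0.8]
```

In the real model the token vectors are 768-dimensional outputs of ModernBertModel; the toy numbers only show the arithmetic.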

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Trelis/modernbert-embed-base-touch-rugby-ft-v2")
# Run inference
sentences = [
    'Who indicates to commence play at the start of a Touch Rugby match?',
    '6.2\tThe Team coach(s) and Team officials may move from one position to the other \nbut shall do so without delay.While in a position at the end of the Field of Play, \nthe Team coach(s) or Team official must remain no closer than five (5) metres \nfrom the Dead Ball Line and must not coach or communicate (verbal or non-\nverbal) with either Team or the Referees.7\u2002 Commencement and Recommencement of Play  \n7.1\tTeam captains are to toss a coin in the presence of the Referee(s) with the \nwinning captain’s Team having the choice of the direction the Team wishes \nto run in the first half; the choice of Interchange Areas for the duration of the \nmatch, including any extra time; and the choice of which team will commence \nthe match in Possession.7.2\tA player of the Attacking Team is to commence the match with a Tap at the \ncentre of the Halfway Line following the indication to commence play from the \nReferee.',
    'See Appendix 1.Forced Interchange\nWhen a player is required to undertake a compulsory Interchange for \nan Infringement ruled more serious than a Penalty but less serious \nthan a Permanent Interchange, Sin Bin or Dismissal.Forward\nA position or direction towards the Dead Ball Line beyond the Team’s \nAttacking Try Line.Full Time\nThe expiration of the second period of time allowed for play.Half\nThe player who takes Possession following a Rollball.Half Time\nThe break in play between the two halves of a match.Imminent\nAbout to occur, it is almost certain to occur.Infringement\nThe action of a player contrary to the Rules of the game.In-Goal Area\nThe area in the Field of Play bounded by the Sidelines, the Try Lines \nand the Dead Ball Lines.There are two (2), one (1) at each end of the \nField of Play.See Appendix 1.Interchange\nThe act of an on-field player leaving the Field of Play to be replaced \nby an off-field player entering the Field of Play.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
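
For semantic search, the question embedding can be compared against each chunk embedding and the chunks sorted by score. The following is a minimal sketch, assuming the embeddings are already unit length (the architecture ends with Normalize()), so that a dot product equals cosine similarity. The helper `rank_chunks` and the toy vectors are hypothetical stand-ins for real model output.

```python
def rank_chunks(query_embedding, chunk_embeddings, top_k=3):
    """Return (chunk_index, score) pairs sorted by descending dot product."""
    scores = [
        sum(q * c for q, c in zip(query_embedding, chunk))
        for chunk in chunk_embeddings
    ]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]

# Toy unit-length vectors standing in for the output of model.encode(...).
query = [1.0, 0.0]
chunks = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]
print(rank_chunks(query, chunks, top_k=2))
# [(2, 1.0), (1, 0.6)]
```

With the snippet above, one would pass `embeddings[0]` as the query and `embeddings[1:]` as the chunks.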

Training Details

Training Dataset

touch-rugby-modernbert-pairs

  • Dataset: touch-rugby-modernbert-pairs at 7cb0ae2
  • Size: 305 training samples
  • Columns: question and related_chunk
  • Approximate statistics based on the first 305 samples:
              question        related_chunk
    type      string          string
    min       10 tokens       147 tokens
    mean      18.68 tokens    231.42 tokens
    max       36 tokens       319 tokens
  • Samples:
    question related_chunk
    question: When may Onside players of the Defending Team move forward if the Half is not within one metre of the Rollball?
    related_chunk: 13.10 A player ceases to be the Half once the ball is passed to another player.13.11 Defending players are not to interfere with the performance of the Rollball or the
    Half.Ruling = A Penalty to the Attacking Team at a point ten (10) metres directly Forward of the
    Infringement.13.12 Players of the Defending Team must not move Forward of the Onside position
    until the Half has made contact with the ball, unless directed to do so by the
    Referee or in accordance with 13.12.1.13.12.1 When the Half is not within one (1) metre of the Rollball, Onside players
    of the Defending Team may move Forward as soon as the player
    performing the Rollball releases the ball.If the Half is not in position and
    a defending player moves Forward and makes contact with the ball, a
    Change of Possession results.
    question: Besides awarding tries, what other scoring-related task does the Referee perform?
    related_chunk: An approach may only be made during a break in play or at
    the discretion of the Referee.FIT Playing Rules - 5th Edition
    18
    COPYRIGHT © Touch Football Australia 2020
    HALFWAY LINE
    SIN BIN AREAS
    IN-GOAL AREA
    TRY LINE
    7 M ZONE
    DEAD BALL LINE
    PERIMETER
    INTERCHANGE
    AREA
    20M
    10M
    10M
    1M
    5M
    7 M
    7 M
    7 M
    7 M
    50M
    3M
    70M
    INTERCHANGE
    AREA
    Appendix 1 – Field of Play
    FIT Playing Rules - 5th Edition
    COPYRIGHT © Touch Football Australia 2020
    19
    FEDERATION OF INTERNATIONAL TOUCH
    question: What happens if a team has fewer than four players on the field during a match?
    related_chunk: FIT Playing Rules - 5th Edition
    COPYRIGHT © Touch Football Australia 2020
    7
    7.6 A Tap may not be taken until at least four (4) defending players are in an Onside
    position or unless directed to so by the Referee.Where the number of players
    on the field from the Defending Team falls below four (4), all players must be in
    an Onside position for a Tap to be taken unless directed to do so by the Referee.Ruling = The Player will be directed to return to the Mark and to take the Tap again.7.7 The Tap to commence or recommence play must be performed without delay.Ruling = A Penalty to the non-offending team at the centre of the Halfway line.8  Match Duration

    8.1 A match is 40 minutes in duration, consisting of two (2) x 20 minute halves with
    a Half Time break.8.1.1 There is no time off for injury during a match.8.2 Local competition and tournament conditions may vary the duration of a match.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
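
    To make the loss parameters concrete, here is a rough sketch of what MultipleNegativesRankingLoss computes for one batch: every other chunk in the batch serves as a negative for a given question, cosine similarities are multiplied by `scale`, and cross-entropy is taken with the matching chunk as the target. This is a plain-Python illustration under those assumptions, not the library's implementation; the function names and toy vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(question_embs, chunk_embs, scale=20.0):
    """Mean cross-entropy where chunk i is the correct answer for question i."""
    losses = []
    for i, question in enumerate(question_embs):
        logits = [scale * cosine(question, chunk) for chunk in chunk_embs]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

questions = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each question matches its own chunk
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each question matches the wrong chunk
print(multiple_negatives_ranking_loss(questions, aligned))  # close to 0
print(multiple_negatives_ranking_loss(questions, swapped))  # close to 20
```

    When all similarities in a batch are equal, this quantity reduces to the natural log of the batch size, which gives a reference point for reading the loss values.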
    

Evaluation Dataset

touch-rugby-modernbert-pairs

  • Dataset: touch-rugby-modernbert-pairs at 7cb0ae2
  • Size: 305 evaluation samples
  • Columns: question and related_chunk
  • Approximate statistics based on the first 305 samples:
              question        related_chunk
    type      string          string
    min       11 tokens       173 tokens
    mean      18.06 tokens    228.39 tokens
    max       32 tokens       260 tokens
  • Samples:
    question related_chunk
    question: What is the definition of the 'Defending Team' in Touch Rugby Rules 5th Edition?
    related_chunk: Except as permitted under the
    Copyright Act, these Rules must not be reproduced by any process, electronic or otherwise, without the written
    permission of Touch Football Australia.Attacking Try Line
    The line on or over which a player has to place the ball to
    score a Try.Attacking Team
    The Team which has or is gaining Possession.Behind
    A position or direction towards a Team’s Defending Try Line.Change of Possession
    The act of moving control of the ball from one Team to the other.Dead/Dead Ball
    When the ball is out of play including the period following a Try and
    until the match is recommenced and when the ball goes to ground
    and/or outside the boundaries of the Field of Play prior to the
    subsequent Rollball.Dead Ball Line
    The end boundaries of the Field of Play.There is one at each end of
    the Field of Play.See Appendix 1.Defending Try Line
    The line which a Team has to defend to prevent a Try.Defending Team
    The Team without or which is losing Possession.
    question: What is the minimum number of players required on the field for a touch rugby match to begin or continue?
    related_chunk: FIT Playing Rules - 5th Edition
    COPYRIGHT © Touch Football Australia 2020
    7
    7.6 A Tap may not be taken until at least four (4) defending players are in an Onside
    position or unless directed to so by the Referee.Where the number of players
    on the field from the Defending Team falls below four (4), all players must be in
    an Onside position for a Tap to be taken unless directed to do so by the Referee.Ruling = The Player will be directed to return to the Mark and to take the Tap again.7.7 The Tap to commence or recommence play must be performed without delay.Ruling = A Penalty to the non-offending team at the centre of the Halfway line.8  Match Duration

    8.1 A match is 40 minutes in duration, consisting of two (2) x 20 minute halves with
    a Half Time break.8.1.1 There is no time off for injury during a match.8.2 Local competition and tournament conditions may vary the duration of a match.
    question: What are the possible outcomes of a Referee's Ruling?
    related_chunk: See Appendix 1.Forced Interchange
    When a player is required to undertake a compulsory Interchange for
    an Infringement ruled more serious than a Penalty but less serious
    than a Permanent Interchange, Sin Bin or Dismissal.Forward
    A position or direction towards the Dead Ball Line beyond the Team’s
    Attacking Try Line.Full Time
    The expiration of the second period of time allowed for play.Half
    The player who takes Possession following a Rollball.Half Time
    The break in play between the two halves of a match.Imminent
    About to occur, it is almost certain to occur.Infringement
    The action of a player contrary to the Rules of the game.In-Goal Area
    The area in the Field of Play bounded by the Sidelines, the Try Lines
    and the Dead Ball Lines.There are two (2), one (1) at each end of the
    Field of Play.See Appendix 1.Interchange
    The act of an on-field player leaving the Field of Play to be replaced
    by an off-field player entering the Field of Play.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 5e-06
  • num_train_epochs: 1
  • lr_scheduler_type: constant
  • warmup_ratio: 0.3
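
The non-default values above could be passed to the Sentence Transformers trainer configuration roughly as follows. This is a sketch rather than the original training script: `output_dir` is a hypothetical path supplied by the caller, and the helper `build_training_args` is illustrative only. As far as I am aware, the plain `constant` schedule in Transformers does not apply warmup, so `warmup_ratio` may have had no effect here; treat that as an assumption to verify.

```python
# Non-default values copied from the list above.
NON_DEFAULT_HYPERPARAMETERS = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "learning_rate": 5e-06,
    "num_train_epochs": 1,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.3,
}

def build_training_args(output_dir):
    """Create training arguments; requires sentence-transformers to be installed."""
    # Imported lazily so the dictionary above can be inspected without the library.
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(
        output_dir=output_dir,  # hypothetical path chosen by the caller
        **NON_DEFAULT_HYPERPARAMETERS,
    )
```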

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: constant
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.3
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss
0.2222  2     2.8177         2.5945
0.4444  4     2.9155         2.5693
0.6667  6     2.9114         2.5402
0.8889  8     2.7999         2.5098

Framework Versions

  • Python: 3.12.4
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0
  • PyTorch: 2.5.1
  • Accelerate: 1.3.0
  • Datasets: 2.17.1
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 149M params (Safetensors, tensor type F32)