SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
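
These properties can be read back from the loaded model. A quick check (a minimal sketch, using the Hub id shown under Usage below):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("magnifi/bge-small-en-v1.5-ft-orc-test")
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 384
print(model.similarity_fn_name)                  # cosine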

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
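
The three modules are a BERT encoder, CLS-token pooling, and L2 normalization. As a rough illustration of what this pipeline computes, the sketch below reproduces the same steps with plain transformers; it loads the base checkpoint purely for illustration, while actual inference should load this fine-tuned model as shown under Usage.

from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

# Illustration only: the base checkpoint stands in for the fine-tuned weights.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
encoder = AutoModel.from_pretrained("BAAI/bge-small-en-v1.5")

batch = tokenizer(
    ["how has my portfolio performed since inception"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state    # (batch, seq_len, 384)
cls_embedding = token_embeddings[:, 0]                       # CLS-token pooling
sentence_embedding = F.normalize(cls_embedding, p=2, dim=1)  # Normalize(): unit-length 384-d vectors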

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("magnifi/bge-small-en-v1.5-ft-orc-test")
# Run inference
sentences = [
    'how has my portfolio performed since inception',
    '[{"get_portfolio([\'quantity\', \'averageCost\', \'marketValue\'],True,None)": "portfolio"}, {"calculate(\'portfolio\',[\'quantity\', \'averageCost\'],\'multiply\',\'cost_basis\')": "portfolio"}, {"calculate(\'portfolio\',[\'marketValue\', \'cost_basis\'],\'difference\',\'profit\')": "profit_port"}, {"aggregate(\'portfolio\',\'ticker\',\'profit\',\'sum\',None)": "profit_port"}]',
    '[{"get_portfolio(None,True,None)": "portfolio"}, {"factor_contribution(\'portfolio\',\'<DATES>\',\'sector\',\'sector information technology\',\'portfolio\')": "portfolio"}]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
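
Because the training pairs map natural-language questions to tool-call plans, a typical retrieval pattern is to embed a corpus of candidate plans once and look up the nearest plan for each incoming query. A minimal sketch with util.semantic_search follows; the corpus strings are just the example plans from the snippet above.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("magnifi/bge-small-en-v1.5-ft-orc-test")

# Candidate tool-call plans (the example strings from above).
corpus = [
    '[{"get_portfolio([\'quantity\', \'averageCost\', \'marketValue\'],True,None)": "portfolio"}, {"calculate(\'portfolio\',[\'quantity\', \'averageCost\'],\'multiply\',\'cost_basis\')": "portfolio"}, {"calculate(\'portfolio\',[\'marketValue\', \'cost_basis\'],\'difference\',\'profit\')": "profit_port"}, {"aggregate(\'portfolio\',\'ticker\',\'profit\',\'sum\',None)": "profit_port"}]',
    '[{"get_portfolio(None,True,None)": "portfolio"}, {"factor_contribution(\'portfolio\',\'<DATES>\',\'sector\',\'sector information technology\',\'portfolio\')": "portfolio"}]',
]

query_embedding = model.encode("how has my portfolio performed since inception", convert_to_tensor=True)
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Top-1 nearest plan by cosine similarity (the embeddings are L2-normalized).
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)[0]
print(corpus[hits[0]["corpus_id"]], hits[0]["score"])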

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.6656
cosine_accuracy@3 0.9164
cosine_accuracy@5 0.9565
cosine_accuracy@10 0.9799
cosine_precision@1 0.6656
cosine_precision@3 0.3055
cosine_precision@5 0.1913
cosine_precision@10 0.098
cosine_recall@1 0.0185
cosine_recall@3 0.0255
cosine_recall@5 0.0266
cosine_recall@10 0.0272
cosine_ndcg@10 0.1839
cosine_mrr@10 0.7871
cosine_map@100 0.0219
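
These metric names match the output of the sentence-transformers InformationRetrievalEvaluator. Below is a sketch of how such an evaluation is wired up; the query, corpus, and relevance dictionaries are placeholders, since the actual evaluation split is not included in this card.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("magnifi/bge-small-en-v1.5-ft-orc-test")

# Placeholder data: the real evaluation split is not published here.
queries = {"q1": "how has my portfolio performed since inception"}
corpus = {"d1": '[{"get_portfolio(None,True,None)": "portfolio"}]'}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="orc-test",  # hypothetical evaluator name
)
results = evaluator(model)  # accuracy@k, precision@k, recall@k, NDCG@10, MRR@10, MAP@100
print(results)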

Training Details

Training Dataset

Unnamed Dataset

  • Size: 978 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 978 samples:
    • sentence_0: string; min 4, mean 13.54, max 27 tokens
    • sentence_1: string; min 20, mean 87.0, max 280 tokens
  • Samples:
    • sentence_0: how are my holdings doing [DATES]?
      sentence_1: [{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
    • sentence_0: how much did I earn [DATES]
      sentence_1: [{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
    • sentence_0: how am i doing [DATES]?
      sentence_1: [{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute('portfolio',['gains'],'')": "portfolio"}, {"sort('portfolio','gains','desc')": "portfolio"}]
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

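With MultipleNegativesRankingLoss, each (question, plan) pair in a batch is a positive pair and every other plan in the same batch acts as an in-batch negative. The sketch below hand-computes that objective for a hypothetical two-pair batch with the reported settings (cosine similarity, scale 20.0); actual training uses losses.MultipleNegativesRankingLoss directly, as in the training sketch further below.

import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")  # base model, for illustration

# Hypothetical two-pair batch of (question, tool-call plan) examples.
questions = ["how am i doing [DATES]?", "how has my portfolio performed since inception"]
plans = [
    '[{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]',
    '[{"get_portfolio([\'quantity\', \'averageCost\', \'marketValue\'],True,None)": "portfolio"}, {"calculate(\'portfolio\',[\'quantity\', \'averageCost\'],\'multiply\',\'cost_basis\')": "portfolio"}, {"calculate(\'portfolio\',[\'marketValue\', \'cost_basis\'],\'difference\',\'profit\')": "profit_port"}, {"aggregate(\'portfolio\',\'ticker\',\'profit\',\'sum\',None)": "profit_port"}]',
]

q = model.encode(questions, convert_to_tensor=True)        # anchors
p = model.encode(plans, convert_to_tensor=True)            # positives (and each other's in-batch negatives)

scores = model.similarity(q, p) * 20.0                     # cosine similarities scaled by 20.0
labels = torch.arange(len(questions), device=scores.device)  # the i-th plan is the positive for the i-th question
loss_value = F.cross_entropy(scores, labels)               # cross-entropy over the in-batch candidates
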
Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 6
  • multi_dataset_batch_sampler: round_robin
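
These non-default values map directly onto SentenceTransformerTrainingArguments. Below is a self-contained training sketch under the reported settings; the output directory and the tiny stand-in datasets are hypothetical placeholders for the 978-pair training set described above.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
    util,
)

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Hypothetical stand-ins for the (sentence_0, sentence_1) pairs described above.
train_dataset = Dataset.from_dict({
    "sentence_0": ["how am i doing [DATES]?"],
    "sentence_1": ['[{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]'],
})
eval_dataset = Dataset.from_dict({
    "sentence_0": ["how much did I earn [DATES]"],
    "sentence_1": ['[{"get_portfolio(None, True, None)": "portfolio"}, {"get_attribute(\'portfolio\',[\'gains\'],\'\')": "portfolio"}, {"sort(\'portfolio\',\'gains\',\'desc\')": "portfolio"}]'],
})

# Loss as reported: MultipleNegativesRankingLoss with scale=20.0 and cosine similarity.
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-en-v1.5-ft",   # hypothetical output path
    eval_strategy="steps",
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    num_train_epochs=6,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()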

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.0204 2 - 0.0719
0.0408 4 - 0.0729
0.0612 6 - 0.0741
0.0816 8 - 0.0767
0.1020 10 - 0.0807
0.1224 12 - 0.0846
0.1429 14 - 0.0876
0.1633 16 - 0.0927
0.1837 18 - 0.0978
0.2041 20 - 0.1019
0.2245 22 - 0.1064
0.2449 24 - 0.1091
0.2653 26 - 0.1116
0.2857 28 - 0.1143
0.3061 30 - 0.1174
0.3265 32 - 0.1185
0.3469 34 - 0.1225
0.3673 36 - 0.1254
0.3878 38 - 0.1298
0.4082 40 - 0.1330
0.4286 42 - 0.1341
0.4490 44 - 0.1356
0.4694 46 - 0.1388
0.4898 48 - 0.1421
0.5102 50 - 0.1424
0.5306 52 - 0.1439
0.5510 54 - 0.1464
0.5714 56 - 0.1478
0.5918 58 - 0.1489
0.6122 60 - 0.1502
0.6327 62 - 0.1511
0.6531 64 - 0.1506
0.6735 66 - 0.1511
0.6939 68 - 0.1516
0.7143 70 - 0.1530
0.7347 72 - 0.1523
0.7551 74 - 0.1538
0.7755 76 - 0.1542
0.7959 78 - 0.1555
0.8163 80 - 0.1549
0.8367 82 - 0.1549
0.8571 84 - 0.1544
0.8776 86 - 0.1545
0.8980 88 - 0.1543
0.9184 90 - 0.1548
0.9388 92 - 0.1556
0.9592 94 - 0.1569
0.9796 96 - 0.1579
1.0 98 - 0.1585
1.0204 100 - 0.1588
1.0408 102 - 0.1588
1.0612 104 - 0.1593
1.0816 106 - 0.1606
1.1020 108 - 0.1609
1.1224 110 - 0.1610
1.1429 112 - 0.1602
1.1633 114 - 0.1606
1.1837 116 - 0.1611
1.2041 118 - 0.1611
1.2245 120 - 0.1617
1.2449 122 - 0.1622
1.2653 124 - 0.1620
1.2857 126 - 0.1629
1.3061 128 - 0.1630
1.3265 130 - 0.1634
1.3469 132 - 0.1638
1.3673 134 - 0.1643
1.3878 136 - 0.1650
1.4082 138 - 0.1661
1.4286 140 - 0.1660
1.4490 142 - 0.1667
1.4694 144 - 0.1678
1.4898 146 - 0.1675
1.5102 148 - 0.1675
1.5306 150 - 0.1683
1.5510 152 - 0.1684
1.5714 154 - 0.1683
1.5918 156 - 0.1686
1.6122 158 - 0.1692
1.6327 160 - 0.1694
1.6531 162 - 0.1688
1.6735 164 - 0.1688
1.6939 166 - 0.1690
1.7143 168 - 0.1689
1.7347 170 - 0.1686
1.7551 172 - 0.1688
1.7755 174 - 0.1689
1.7959 176 - 0.1691
1.8163 178 - 0.1693
1.8367 180 - 0.1695
1.8571 182 - 0.1704
1.8776 184 - 0.1701
1.8980 186 - 0.1709
1.9184 188 - 0.1712
1.9388 190 - 0.1713
1.9592 192 - 0.1719
1.9796 194 - 0.1720
2.0 196 - 0.1720
2.0204 198 - 0.1719
2.0408 200 - 0.1722
2.0612 202 - 0.1722
2.0816 204 - 0.1726
2.1020 206 - 0.1729
2.1224 208 - 0.1735
2.1429 210 - 0.1739
2.1633 212 - 0.1738
2.1837 214 - 0.1744
2.2041 216 - 0.1746
2.2245 218 - 0.1743
2.2449 220 - 0.1745
2.2653 222 - 0.1745
2.2857 224 - 0.1743
2.3061 226 - 0.1737
2.3265 228 - 0.1739
2.3469 230 - 0.1734
2.3673 232 - 0.1728
2.3878 234 - 0.1720
2.4082 236 - 0.1721
2.4286 238 - 0.1727
2.4490 240 - 0.1738
2.4694 242 - 0.1735
2.4898 244 - 0.1733
2.5102 246 - 0.1736
2.5306 248 - 0.1735
2.5510 250 - 0.1741
2.5714 252 - 0.1742
2.5918 254 - 0.1747
2.6122 256 - 0.1755
2.6327 258 - 0.1756
2.6531 260 - 0.1759
2.6735 262 - 0.1761
2.6939 264 - 0.1762
2.7143 266 - 0.1759
2.7347 268 - 0.1763
2.7551 270 - 0.1756
2.7755 272 - 0.1753
2.7959 274 - 0.1756
2.8163 276 - 0.1758
2.8367 278 - 0.1760
2.8571 280 - 0.1759
2.8776 282 - 0.1752
2.8980 284 - 0.1757
2.9184 286 - 0.1755
2.9388 288 - 0.1753
2.9592 290 - 0.1751
2.9796 292 - 0.1763
3.0 294 - 0.1767
3.0204 296 - 0.1760
3.0408 298 - 0.1757
3.0612 300 - 0.1756
3.0816 302 - 0.1755
3.1020 304 - 0.1753
3.1224 306 - 0.1752
3.1429 308 - 0.1754
3.1633 310 - 0.1750
3.1837 312 - 0.1741
3.2041 314 - 0.1741
3.2245 316 - 0.1744
3.2449 318 - 0.1748
3.2653 320 - 0.1747
3.2857 322 - 0.1747
3.3061 324 - 0.1751
3.3265 326 - 0.1754
3.3469 328 - 0.1752
3.3673 330 - 0.1754
3.3878 332 - 0.1755
3.4082 334 - 0.1765
3.4286 336 - 0.1768
3.4490 338 - 0.1771
3.4694 340 - 0.1775
3.4898 342 - 0.1766
3.5102 344 - 0.1766
3.5306 346 - 0.1773
3.5510 348 - 0.1775
3.5714 350 - 0.1778
3.5918 352 - 0.1779
3.6122 354 - 0.1776
3.6327 356 - 0.1775
3.6531 358 - 0.1769
3.6735 360 - 0.1773
3.6939 362 - 0.1771
3.7143 364 - 0.1773
3.7347 366 - 0.1773
3.7551 368 - 0.1775
3.7755 370 - 0.1775
3.7959 372 - 0.1775
3.8163 374 - 0.1774
3.8367 376 - 0.1771
3.8571 378 - 0.1770
3.8776 380 - 0.1767
3.8980 382 - 0.1772
3.9184 384 - 0.1781
3.9388 386 - 0.1783
3.9592 388 - 0.1778
3.9796 390 - 0.1778
4.0 392 - 0.1779
4.0204 394 - 0.1778
4.0408 396 - 0.1779
4.0612 398 - 0.1780
4.0816 400 - 0.1784
4.1020 402 - 0.1786
4.1224 404 - 0.1795
4.1429 406 - 0.1799
4.1633 408 - 0.1806
4.1837 410 - 0.1806
4.2041 412 - 0.1806
4.2245 414 - 0.1806
4.2449 416 - 0.1805
4.2653 418 - 0.1805
4.2857 420 - 0.1808
4.3061 422 - 0.1805
4.3265 424 - 0.1805
4.3469 426 - 0.1808
4.3673 428 - 0.1805
4.3878 430 - 0.1805
4.4082 432 - 0.1805
4.4286 434 - 0.1806
4.4490 436 - 0.1806
4.4694 438 - 0.1810
4.4898 440 - 0.1811
4.5102 442 - 0.1807
4.5306 444 - 0.1806
4.5510 446 - 0.1805
4.5714 448 - 0.1807
4.5918 450 - 0.1806
4.6122 452 - 0.1804
4.6327 454 - 0.1804
4.6531 456 - 0.1802
4.6735 458 - 0.1801
4.6939 460 - 0.1804
4.7143 462 - 0.1811
4.7347 464 - 0.1811
4.7551 466 - 0.1810
4.7755 468 - 0.1807
4.7959 470 - 0.1810
4.8163 472 - 0.1810
4.8367 474 - 0.1810
4.8571 476 - 0.1808
4.8776 478 - 0.1810
4.8980 480 - 0.1808
4.9184 482 - 0.1809
4.9388 484 - 0.1809
4.9592 486 - 0.1814
4.9796 488 - 0.1814
5.0 490 - 0.1813
5.0204 492 - 0.1811
5.0408 494 - 0.1812
5.0612 496 - 0.1814
5.0816 498 - 0.1812
5.1020 500 0.3771 0.1815
5.1224 502 - 0.1817
5.1429 504 - 0.1818
5.1633 506 - 0.1819
5.1837 508 - 0.1819
5.2041 510 - 0.1820
5.2245 512 - 0.1818
5.2449 514 - 0.1821
5.2653 516 - 0.1821
5.2857 518 - 0.1821
5.3061 520 - 0.1825
5.3265 522 - 0.1825
5.3469 524 - 0.1825
5.3673 526 - 0.1822
5.3878 528 - 0.1822
5.4082 530 - 0.1822
5.4286 532 - 0.1828
5.4490 534 - 0.1830
5.4694 536 - 0.1827
5.4898 538 - 0.1827
5.5102 540 - 0.1830
5.5306 542 - 0.1833
5.5510 544 - 0.1833
5.5714 546 - 0.1835
5.5918 548 - 0.1835
5.6122 550 - 0.1835
5.6327 552 - 0.1837
5.6531 554 - 0.1837
5.6735 556 - 0.1837
5.6939 558 - 0.1837
5.7143 560 - 0.1836
5.7347 562 - 0.1836
5.7551 564 - 0.1836
5.7755 566 - 0.1839

Framework Versions

  • Python: 3.12.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.50.0
  • PyTorch: 2.6.0
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}