BGE base En v1.5 Phase 5
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
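The pipeline above takes the hidden state of the first ([CLS]) token as the sentence vector and then L2-normalizes it. A minimal plain-Python sketch of those two steps follows, using random numbers in place of real BertModel output; the function name `cls_pool_and_normalize` is illustrative and not part of the library.

```python
import math
import random

def cls_pool_and_normalize(token_embeddings):
    """Mimic CLS pooling followed by L2 normalization.

    token_embeddings: list of sentences, each a list of token vectors
    (a stand-in for the transformer's last hidden states).
    """
    pooled = []
    for tokens in token_embeddings:
        cls_vector = tokens[0]  # the first position holds the [CLS] token
        norm = math.sqrt(sum(x * x for x in cls_vector))
        norm = max(norm, 1e-12)  # guard against division by zero
        pooled.append([x / norm for x in cls_vector])
    return pooled

random.seed(0)
# 2 sentences, 5 tokens each, 768 dimensions per token (random placeholder values).
fake_hidden_states = [
    [[random.gauss(0.0, 1.0) for _ in range(768)] for _ in range(5)]
    for _ in range(2)
]
sentence_vectors = cls_pool_and_normalize(fake_hidden_states)
print(len(sentence_vectors), len(sentence_vectors[0]))
```

Because every output vector has unit length, the cosine similarity of two embeddings reduces to their dot product.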
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("RishuD7/bge-base-en-v1.5-65-keys-phase-5-exp_v1")
# Run inference
sentences = [
    'SECTION THREE RENT A. Base Rent. Tenant shall pay the following Base Rent, without reduction or set-off, during the Term: (i) Initial Term. For the one (1) year period commencing on the Commencement Date the Base Rent shall be Eleven Thousand Seven Hundred and Four Dollars and Fifty Cents ($11,704.50) per month, payable in advance on the first (1") day of each month (the "due date"), provided however, that Rent for the first (1") month shall be due and payable immediately upon Tenant\'s execution of this Lease. During each subsequent one (I) year period during the Initial Term, the monthly Base Rent shall be increased by three percent (3%). Rent received after the tenth (IOlh) day of the month it is due shall result in an 2 additional late charge equal to five percent (5%) of the monthly Base Rent that is late, and such late charge shall be due and payable as Additional Rent with the Base Rent payment for the following calendar month, or if the Initial Term has expired, within fifteen (15) days of Landlord\'s written demand for payment. (ii) Extension Term. For the one (1) year period commencing on the first (1 ") day of the First Extension Term, the Base Rent shall be Twenty-One Thousand One Hundred Thirty-Nine Dollars and Sixty-Three Cents ($21,139.63) per month, payable in advance on the first (1 ") day of each month (the "due date"). During each subsequent one (1) year period during the First Extension Term and the Second Extension Term, the monthly Base Rent shall be increased by three percent (3%). Rent received after the tenth (1oth) day of the month it is due shall result in an additional late charge equal to five percent (5%) of the monthly Base Rent that is late, and such late charge shall be due and payable as Additional Rent with the Base Rent payment for the following calendar month, or if the applicable Extension Term has expired, within fifteen (15) days of Landlord\'s written demand for payment. '
    '(iii) Holdover Tenancy.\nAny holdover tenancy shall be month to month, and can be terminated by either party upon thirty (30) days advanced written notice. The monthly Base Rent during the period of any holdover tenancy shall be an amount equal to two (2) times the monthly Base Rent provided in (i) or (ii) above for the most recently completed month of the Initial Term or any applicable Extension Term, payable in advance on the first (1 ") day of each month. Rent received after the tenth (10th) day of the month it is due shall result in an additional late charge equal to five percent (5%) of the monthly Base Rent that is late, and such late charge shall be due and payable as Additional Rent with the Base Rent payment for the following calendar month, or at Landlord\'s option, which Landlord may exercise at any time, within fifteen (15) days of Landlord\'s written demand for payment. Tenant shall be responsible for paying all items of Additional Rent, as specified below (and described elsewhere in this Lease), during the period of any holdover tenancy. (iv) Rent Exhibit. The Base Rent during the Initial Term and Extension Terms is set forth on Exhibit A attached hereto and incorporated herein by this reference. B. Additional Rent. The following items shall be paid by Tenant as Additional Rent during the entire Term, and during the period of any holdover tenancy. In the event this Lease is terminated in a manner permitted by this Lease, or in the event of a holdover tenancy, annual items of Additional Rent shall be pro-rated, and Tenant shall only be responsible for its proportionate share of such Additional Rent for periods prior to the termination of the Lease andlor any holdover tenancy. (i) Late Rent. The late charges specified in A. above for rental payments that are late shall be an item of Additional Rent due with the rental payment for the calendar month immediately following the month in which the late payment was incurred.',
    'Late Payment Trigger Period (days)',
    'Late Payment Trigger Period Details',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
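The similarity scores can be used to rank short key labels against a lease passage. The sketch below shows only the ranking step, in plain Python, on small made-up vectors standing in for `model.encode` output. The vectors, the choice of labels, and the helper `rank_by_cosine` are illustrative assumptions, not values produced by this model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_cosine(query_vec, candidates):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine(query_vec, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors; real embeddings from this model have 768 dimensions.
passage_vec = [0.9, 0.1, 0.0]
key_vecs = {
    "Late Payment Trigger Period (days)": [0.8, 0.2, 0.1],
    "Lessee Legal Name": [0.0, 0.1, 0.9],
}
ranking = rank_by_cosine(passage_vec, key_vecs)
print(ranking[0][0])
```

With real embeddings, the same ranking could be read off one row of the matrix returned by `model.similarity`.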
Evaluation
Metrics
Information Retrieval
- Dataset: dim_768
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.0056 |
cosine_accuracy@3 | 0.0152 |
cosine_accuracy@5 | 0.028 |
cosine_accuracy@10 | 0.0583 |
cosine_precision@1 | 0.0056 |
cosine_precision@3 | 0.0051 |
cosine_precision@5 | 0.0056 |
cosine_precision@10 | 0.0058 |
cosine_recall@1 | 0.0056 |
cosine_recall@3 | 0.0152 |
cosine_recall@5 | 0.028 |
cosine_recall@10 | 0.0583 |
cosine_ndcg@10 | 0.0261 |
cosine_mrr@10 | 0.0166 |
cosine_map@100 | 0.0297 |
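The table is consistent with each query having exactly one relevant document: accuracy@k and recall@k coincide, and precision@k equals accuracy@k divided by k (for example, 0.0583 / 10 ≈ 0.0058). The sketch below shows how such per-query metrics are commonly computed under that single-relevant-document assumption. The function name and the toy rankings are hypothetical, and this is not the InformationRetrievalEvaluator source.

```python
def retrieval_metrics(ranked_ids, relevant_id, k):
    """Metrics for one query that has a single relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    accuracy = 1.0 if hit else 0.0
    precision = accuracy / k  # at most one relevant item among k results
    recall = accuracy         # only one relevant item exists in total
    reciprocal_rank = 1.0 / (top_k.index(relevant_id) + 1) if hit else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "rr": reciprocal_rank,
    }

# Two toy queries: the relevant document is ranked 2nd for the first query
# and is absent from the results of the second.
per_query = [
    retrieval_metrics(["d3", "d1", "d7"], "d1", k=3),
    retrieval_metrics(["d4", "d5", "d6"], "d2", k=3),
]
averaged = {
    name: sum(q[name] for q in per_query) / len(per_query)
    for name in per_query[0]
}
print(averaged)
```

Averaging the reciprocal rank over queries gives MRR@k; the other entries are averaged the same way.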
Training Details
Training Dataset
Unnamed Dataset
- Size: 7,315 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:
Column | Type | Min tokens | Mean tokens | Max tokens |
---|---|---|---|---|
positive | string | 63 | 345.31 | 512 |
anchor | string | 4 | 6.58 | 10 |
- Samples:
positive | anchor |
---|---|
· 12/14/2005 11.28 IFAX Chwkfaxlslimanstander.com "* Kelly Ramsden lg]001/013 . THIS LEASE, dated for reference the 1st day of December, 2004, is . BETWEEN: . OXFORD DEVELOPMENTS LTD. a company duly incorporated under the laws of the Province of British Columbia under nwnber 640355 and having it> registered and records office at 201-45793 Luckakuck Way, Chilliwack, B.C. . V2R5P9 (hereinafter called the "Landlord") . OF THE FIRST PART . AND: . HUB INTERNATIONAL BARTON LIMITED having an office at 45710 Airport Road, Chilliwack, B.C. V2P 6Z9 (hcrcinatlcr called the "Tenant") . ... | Lessee Legal Name |
Tenant shall pay to Landlord, as a fee (the "Termination Fee"), an amount equal to the Unamortized Portion (as hereinafter defined) of the following amounts, plus interest thereon at the rate of 7.5% per annum, compounded monthly (collectively, in the aggregate "Transaction Costs"): (1) brokerage commissions incurred by Landlord, and (2) Landlord's reasonable attorney's fees, in each case in connection with entering into this Lease. Tenant shall pay fifty percent (50%) of the Termination Fee to Landlord within thirty (30) days following Tenant's delivery of Tenant's termination notice, and the remaining fifty percent (50%) of the Termination Fee shall be paid by Tenant on or before July 1, 2024 | Early Termination Costs for Lessee |
Any expenses related to subsequent revisions will be the expense of Tenant Landlord(s):__________ 16 of 20 PA Tenant(s):__________ C. Termination Option Tenant shall have the right to cancel the lease (the “Termination Option”) effective on February 28, 2019 (the “Termination Date”). Tenant shall provide Landlord notice of its intent to exercise this Termination Option no later than May 31, 2018, and shall pay Landlord by the Termination Date a termination fee equal to the unamortized portion of the leasing commission and Tenant Improvement Expenses (as defined in Exhibit B) incurred by Landlord as a result of this lease transaction, plus a “remarketing fee” of $5,000.00. IN WITNESS WHEREOF, the parties have executed this Lease as of the date hereof.. LANDLORD: S and S Crossroads, LLC By: ... | Early Termination Notice |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
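MultipleNegativesRankingLoss treats each anchor's paired positive as the correct class and every other positive in the batch as a negative, then applies cross-entropy to the cosine similarities multiplied by `scale`. The following plain-Python sketch illustrates that computation on toy 2-dimensional vectors. It is not the library implementation, and the helper names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where row i's target is column i (in-batch negatives)."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, pos) for pos in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # positive i points the same way as anchor i
swapped = [[0.1, 0.9], [0.9, 0.1]]  # positive i matches the other anchor instead
print(mnr_loss(anchors, aligned) < mnr_loss(anchors, swapped))
```

The loss is close to zero when each anchor is most similar to its own positive and grows when another item in the batch scores higher.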
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 30
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- tf32: False
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
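These settings imply an effective batch of 32 × 16 = 512 pairs per optimizer step. Assuming a single device and the usual Trainer step accounting (both are assumptions, not stated in this card), 7,315 samples give 229 batches and 14 optimizer steps per epoch, so 30 epochs come to 420 steps, which matches the last step in the training log. The sketch below reproduces that arithmetic and the general shape of a cosine schedule with linear warmup; `lr_at` is an illustrative helper, not a library function.

```python
import math

num_samples = 7315
per_device_batch = 32
grad_accum = 16
epochs = 30
base_lr = 2e-5
warmup_ratio = 0.1

# Assumes one device, so the per-device batch size is the dataloader batch size.
batches_per_epoch = math.ceil(num_samples / per_device_batch)
steps_per_epoch = batches_per_epoch // grad_accum
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Linear warmup to base_lr, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(total_steps, warmup_steps)
```

Under the same assumptions, step 10 corresponds to 10 × 16 / 229 ≈ 0.6987 epochs, the value in the first row of the training log.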
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 30
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: False
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 |
---|---|---|---|
0.6987 | 10 | 2.5547 | - |
1.3974 | 20 | 1.0737 | - |
2.0961 | 30 | 0.0724 | - |
2.7948 | 40 | 0.0 | - |
3.4934 | 50 | 0.0 | - |
3.7729 | 54 | - | 0.0239 |
1.3537 | 60 | 0.9932 | - |
2.0524 | 70 | 1.193 | - |
2.7511 | 80 | 0.0518 | - |
3.4498 | 90 | 0.0009 | - |
4.1485 | 100 | 0.0 | - |
4.7773 | 109 | - | 0.0228 |
2.0087 | 110 | 0.0154 | - |
2.7074 | 120 | 1.0959 | - |
3.4061 | 130 | 0.2585 | - |
4.1048 | 140 | 0.0006 | - |
4.8035 | 150 | 0.0 | - |
5.5022 | 160 | 0.0 | - |
5.7817 | 164 | - | 0.0274 |
3.3624 | 170 | 0.5192 | - |
4.0611 | 180 | 0.5537 | - |
4.7598 | 190 | 0.0037 | - |
5.4585 | 200 | 0.0 | - |
6.1572 | 210 | 0.0 | - |
6.786 | 219 | - | 0.0283 |
4.0175 | 220 | 0.0219 | - |
4.7162 | 230 | 0.756 | - |
5.4148 | 240 | 0.156 | - |
6.1135 | 250 | 0.0002 | - |
6.8122 | 260 | 0.0 | - |
7.5109 | 270 | 0.0 | - |
7.7904 | 274 | - | 0.0264 |
5.3712 | 280 | 0.4501 | - |
6.0699 | 290 | 0.4103 | - |
6.7686 | 300 | 0.0009 | - |
7.4672 | 310 | 0.0 | - |
8.1659 | 320 | 0.0 | - |
8.7948 | 329 | - | 0.0280 |
6.0262 | 330 | 0.0287 | - |
6.7249 | 340 | 0.6199 | - |
7.4236 | 350 | 0.1078 | - |
8.1223 | 360 | 0.0001 | - |
8.8210 | 370 | 0.0 | - |
9.5197 | 380 | 0.0 | - |
9.7991 | 384 | - | 0.0263 |
7.3799 | 390 | 0.3923 | - |
8.0786 | 400 | 0.3161 | - |
8.7773 | 410 | 0.0006 | - |
9.4760 | 420 | 0.0 | 0.0261 |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 1.1.1
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}