SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-long
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m-long on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m-long
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 tokens
- Similarity Function: Cosine Similarity
- Training Dataset:
- csv
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("jebish7/snowflake-arctic-embed-m-long_MNR_1")
# Run inference
sentences = [
'How should a Relevant Person ensure and demonstrate compliance with both UNSC Sanctions and U.A.E.-administered Sanctions, specifically Targeted Financial Sanctions, within the ADGM jurisdiction?',
'Where a Relevant Person seeks to rely on a Person in (1) it may only do so if and to the extent that:\n(a)\tit immediately obtains the necessary CDD information from the third party in (1);\n(b)\tit takes adequate steps to satisfy itself that certified copies of the documents used to undertake the relevant elements of CDD will be available from the third party on request without delay;\n(c)\tthe Person in (1)(b) to (d) is subject to regulation, including AML/TFS compliance requirements, by a Non-ADGM Financial Services Regulator or other competent authority in a country with AML/TFS regulations which are equivalent to the standards set out in the FATF Recommendations and it is supervised for compliance with such regulations;\n(d)\tthe Person in (1) has not relied on any exception from the requirement to conduct any relevant elements of CDD which the Relevant Person seeks to rely on; and\n(e)\tin relation to (2), the information is up to date.',
'REGULATORY REQUIREMENTS - SPOT COMMODITY ACTIVITIES\nRIEs operating an MTF or OTF using Accepted Spot Commodities\nAuthorised Persons that are operating an MTF or OTF wishing to also operate a RIE will be required to relinquish their FSP upon obtaining a Recognition Order (to operate the RIE). If licensed by the FSRA to carry out both Regulated Activities (e.g., operating an MTF and operating an RIE), the Recognition Order will include a stipulation to that effect pursuant to MIR Rule 3.4.1.\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Training Details
Training Dataset
csv
- Dataset: csv
- Size: 29,547 training samples
- Columns:
Question
andpositive
- Approximate statistics based on the first 1000 samples:
Question positive type string string details - min: 18 tokens
- mean: 34.91 tokens
- max: 83 tokens
- min: 13 tokens
- mean: 118.51 tokens
- max: 1090 tokens
- Samples:
Question positive Under which circumstances is a Mining Reporting Entity exempt from immediate disclosure of material information about its mining activities according to the FSRA guidelines?
INTERACTION OF CHAPTER 11 WITH OTHER RULE DISCLOSURE OBLIGATIONS. Prior to a Mining Reporting Entity having all the information available to it, the FSRA considers that whatever material information it may have about the mining activity will generally be insufficiently definite to warrant disclosure under the Rules. Therefore, provided the material information is and remains confidential, and the FSRA has not formed the view that the information ceases to remain confidential (e.g., where there are exceptions from disclosing the information), the material information is not immediately required to be disclosed under Rule 7.2.1. For more information, please refer to Chapter 7 of the Rules, and any relevant Guidance that the FSRA may publish from time in relation to the FSRA’s expectations as to how Reporting Entities are to comply with Chapter 7.
What specific IAASB standards or other standards acceptable to the Regulator are required for the audit of a Public Listed Company's financial statements?
Where an Authorised Person does not hold or control any Client Money as at the date on which the Authorised Person's audited statement of financial position was prepared, the Regulator expects that a nil balance be stated to comply with Rule 6.6.6.
How does the ADGM monitor compliance with the principles of effective dialogue with shareholders, and what are the consequences for companies that fail to establish such a dialogue?
Audit committee. The Board as a whole has responsibility for ensuring that a satisfactory dialogue with Shareholders takes place. Such dialogue should be based on the mutual understanding of objectives and provision of adequate information relating to the Reporting Entity including financial information, and how the business and affairs of the Reporting Entity are carried out.
- Loss:
MultipleNegativesRankingLoss
with these parameters:{ "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
per_device_train_batch_size
: 4learning_rate
: 2e-05num_train_epochs
: 1warmup_ratio
: 0.1batch_sampler
: no_duplicates
All Hyperparameters
Click to expand
overwrite_output_dir
: Falsedo_predict
: Falseeval_strategy
: noprediction_loss_only
: Trueper_device_train_batch_size
: 4per_device_eval_batch_size
: 8per_gpu_train_batch_size
: Noneper_gpu_eval_batch_size
: Nonegradient_accumulation_steps
: 1eval_accumulation_steps
: Nonetorch_empty_cache_steps
: Nonelearning_rate
: 2e-05weight_decay
: 0.0adam_beta1
: 0.9adam_beta2
: 0.999adam_epsilon
: 1e-08max_grad_norm
: 1.0num_train_epochs
: 1max_steps
: -1lr_scheduler_type
: linearlr_scheduler_kwargs
: {}warmup_ratio
: 0.1warmup_steps
: 0log_level
: passivelog_level_replica
: warninglog_on_each_node
: Truelogging_nan_inf_filter
: Truesave_safetensors
: Truesave_on_each_node
: Falsesave_only_model
: Falserestore_callback_states_from_checkpoint
: Falseno_cuda
: Falseuse_cpu
: Falseuse_mps_device
: Falseseed
: 42data_seed
: Nonejit_mode_eval
: Falseuse_ipex
: Falsebf16
: Falsefp16
: Falsefp16_opt_level
: O1half_precision_backend
: autobf16_full_eval
: Falsefp16_full_eval
: Falsetf32
: Nonelocal_rank
: 0ddp_backend
: Nonetpu_num_cores
: Nonetpu_metrics_debug
: Falsedebug
: []dataloader_drop_last
: Falsedataloader_num_workers
: 0dataloader_prefetch_factor
: Nonepast_index
: -1disable_tqdm
: Falseremove_unused_columns
: Truelabel_names
: Noneload_best_model_at_end
: Falseignore_data_skip
: Falsefsdp
: []fsdp_min_num_params
: 0fsdp_config
: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}fsdp_transformer_layer_cls_to_wrap
: Noneaccelerator_config
: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}deepspeed
: Nonelabel_smoothing_factor
: 0.0optim
: adamw_torchoptim_args
: Noneadafactor
: Falsegroup_by_length
: Falselength_column_name
: lengthddp_find_unused_parameters
: Noneddp_bucket_cap_mb
: Noneddp_broadcast_buffers
: Falsedataloader_pin_memory
: Truedataloader_persistent_workers
: Falseskip_memory_metrics
: Trueuse_legacy_prediction_loop
: Falsepush_to_hub
: Falseresume_from_checkpoint
: Nonehub_model_id
: Nonehub_strategy
: every_savehub_private_repo
: Falsehub_always_push
: Falsegradient_checkpointing
: Falsegradient_checkpointing_kwargs
: Noneinclude_inputs_for_metrics
: Falseeval_do_concat_batches
: Truefp16_backend
: autopush_to_hub_model_id
: Nonepush_to_hub_organization
: Nonemp_parameters
:auto_find_batch_size
: Falsefull_determinism
: Falsetorchdynamo
: Noneray_scope
: lastddp_timeout
: 1800torch_compile
: Falsetorch_compile_backend
: Nonetorch_compile_mode
: Nonedispatch_batches
: Nonesplit_batches
: Noneinclude_tokens_per_second
: Falseinclude_num_input_tokens_seen
: Falseneftune_noise_alpha
: Noneoptim_target_modules
: Nonebatch_eval_metrics
: Falseeval_on_start
: Falseuse_liger_kernel
: Falseeval_use_gather_object
: Falsebatch_sampler
: no_duplicatesmulti_dataset_batch_sampler
: proportional
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0271 | 100 | 0.6411 |
0.0541 | 200 | 0.3289 |
0.0812 | 300 | 0.2395 |
0.1083 | 400 | 0.2711 |
0.1354 | 500 | 0.2746 |
0.1624 | 600 | 0.2602 |
0.1895 | 700 | 0.285 |
0.2166 | 800 | 0.2965 |
0.2436 | 900 | 0.2772 |
0.2707 | 1000 | 0.3043 |
0.2978 | 1100 | 0.3059 |
0.3249 | 1200 | 0.316 |
0.3519 | 1300 | 0.2765 |
0.3790 | 1400 | 0.249 |
0.4061 | 1500 | 0.2601 |
0.4331 | 1600 | 0.2538 |
0.4602 | 1700 | 0.2443 |
0.4873 | 1800 | 0.2151 |
0.5143 | 1900 | 0.2335 |
0.5414 | 2000 | 0.2611 |
0.5685 | 2100 | 0.2557 |
0.5956 | 2200 | 0.2793 |
0.0694 | 100 | 0.2141 |
0.1389 | 200 | 0.273 |
0.2083 | 300 | 0.295 |
0.2778 | 400 | 0.2079 |
0.3472 | 500 | 0.2556 |
0.4167 | 600 | 0.252 |
0.4861 | 700 | 0.2142 |
0.5556 | 800 | 0.2181 |
0.625 | 900 | 0.2347 |
0.6944 | 1000 | 0.1754 |
0.7639 | 1100 | 0.2313 |
0.8333 | 1200 | 0.2104 |
0.9028 | 1300 | 0.2435 |
0.9722 | 1400 | 0.2399 |
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.1.1
- Transformers: 4.45.2
- PyTorch: 2.4.0
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.20.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
- Downloads last month
- 7
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.
Model tree for jebish7/snowflake_1
Base model
Snowflake/snowflake-arctic-embed-m-long