metadata
base_model: Snowflake/snowflake-arctic-embed-m
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:600
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: >-
How can organizations tailor their measurement of GAI risks based on
specific characteristics?
sentences:
- >-
3
the abuse, misuse, and unsafe repurposing by humans (adversarial or
not), and others result
from interactions between a human and an AI system.
•
Time scale: GAI risks may materialize abruptly or across extended
periods. Examples include
immediate (and/or prolonged) emotional harm and potential risks to
physical safety due to the
distribution of harmful deepfake images, or the long-term effect of
disinformation on societal
trust in public institutions.
- >-
12
CSAM. Even when trained on “clean” data, increasingly capable GAI models
can synthesize or produce
synthetic NCII and CSAM. Websites, mobile apps, and custom-built models
that generate synthetic NCII
have moved from niche internet forums to mainstream, automated, and
scaled online businesses.
Trustworthy AI Characteristics: Fair with Harmful Bias Managed, Safe,
Privacy Enhanced
2.12.
Value Chain and Component Integration
- >-
case context.
Organizations may choose to tailor how they measure GAI risks based on
these characteristics. They may
additionally wish to allocate risk management resources relative to the
severity and likelihood of
negative impacts, including where and how these risks manifest, and
their direct and material impacts
harms in the context of GAI use. Mitigations for model or system level
risks may differ from mitigations
for use-case or ecosystem level risks.
- source_sentence: >-
What methods are suggested for measuring the reliability of content
authentication techniques in the context of content provenance?
sentences:
- >-
updates.
Information Integrity; Data Privacy
MG-3.2-003
Document sources and types of training data and their origins, potential
biases
present in the data related to the GAI application and its content
provenance,
architecture, training process of the pre-trained model including
information on
hyperparameters, training duration, and any fine-tuning or
retrieval-augmented
generation processes applied.
Information Integrity; Harmful Bias
and Homogenization; Intellectual
Property
- >-
Security
MS-2.7-005
Measure reliability of content authentication methods, such as
watermarking,
cryptographic signatures, digital fingerprints, as well as access
controls,
conformity assessment, and model integrity verification, which can help
support
the effective implementation of content provenance techniques. Evaluate
the
rate of false positives and false negatives in content provenance, as
well as true
positives and true negatives for verification.
Information Integrity
MS-2.7-006
- >-
GV-1.6-003
In addition to general model, governance, and risk information, consider
the
following items in GAI system inventory entries: Data provenance
information
(e.g., source, signatures, versioning, watermarks); Known issues
reported from
internal bug tracking or external information sharing resources (e.g.,
AI incident
database, AVID, CVE, NVD, or OECD AI incident monitor); Human oversight
roles
and responsibilities; Special rights and considerations for intellectual
property,
- source_sentence: >-
What are the suggested actions an organization can take to manage GAI
risks?
sentences:
- >-
Information Integrity; Dangerous,
Violent, or Hateful Content; CBRN
Information or Capabilities
GV-1.3-007 Devise a plan to halt development or deployment of a GAI
system that poses
unacceptable negative risk.
CBRN Information and Capability;
Information Security; Information
Integrity
AI Actor Tasks: Governance and Oversight
GOVERN 1.4: The risk management process and its outcomes are established
through transparent policies, procedures, and other
- >-
match the statistical properties of real-world data without disclosing
personally
identifiable information or contributing to homogenization.
Data Privacy; Intellectual Property;
Information Integrity;
Confabulation; Harmful Bias and
Homogenization
AI Actor Tasks: AI Deployment, AI Impact Assessment, Governance and
Oversight, Operation and Monitoring
MANAGE 2.3: Procedures are followed to respond to and recover from a
previously unknown risk when it is identified.
Action ID
- >-
•
Suggested Action: Steps an organization or AI actor can take to manage
GAI risks.
•
GAI Risks: Tags linking suggested actions with relevant GAI risks.
•
AI Actor Tasks: Pertinent AI Actor Tasks for each subcategory. Not every
AI Actor Task listed will
apply to every suggested action in the subcategory (i.e., some apply to
AI development and
others apply to AI deployment).
The tables below begin with the AI RMF subcategory, shaded in blue,
followed by suggested actions.
- source_sentence: >-
How can harmful bias and homogenization be addressed in the context of
human-AI configuration?
sentences:
- >-
on GAI, apply general fairness metrics (e.g., demographic parity,
equalized odds,
equal opportunity, statistical hypothesis tests), to the pipeline or
business
outcome where appropriate; Custom, context-specific metrics developed in
collaboration with domain experts and affected communities; Measurements
of
the prevalence of denigration in generated content in deployment (e.g.,
sub-
sampling a fraction of traffic and manually annotating denigrating
content).
Harmful Bias and Homogenization;
- >-
MP-5.1-001 Apply TEVV practices for content provenance (e.g., probing a
system's synthetic
data generation capabilities for potential misuse or vulnerabilities.
Information Integrity; Information
Security
MP-5.1-002
Identify potential content provenance harms of GAI, such as
misinformation or
disinformation, deepfakes, including NCII, or tampered content.
Enumerate and
rank risks based on their likelihood and potential impact, and determine
how well
- >-
MS-1.3-002
Engage in internal and external evaluations, GAI red-teaming, impact
assessments, or other structured human feedback exercises in
consultation
with representative AI Actors with expertise and familiarity in the
context of
use, and/or who are representative of the populations associated with
the
context of use.
Human-AI Configuration; Harmful
Bias and Homogenization; CBRN
Information or Capabilities
MS-1.3-003
- source_sentence: >-
How can structured human feedback exercises, such as GAI red-teaming,
contribute to GAI risk measurement and management?
sentences:
- >-
rank risks based on their likelihood and potential impact, and determine
how well
provenance solutions address specific risks and/or harms.
Information Integrity; Dangerous,
Violent, or Hateful Content;
Obscene, Degrading, and/or
Abusive Content
MP-5.1-003
Consider disclosing use of GAI to end users in relevant contexts, while
considering
the objective of disclosure, the context of use, the likelihood and
magnitude of the
- >-
15
GV-1.3-004 Obtain input from stakeholder communities to identify
unacceptable use, in
accordance with activities in the AI RMF Map function.
CBRN Information or Capabilities;
Obscene, Degrading, and/or
Abusive Content; Harmful Bias
and Homogenization; Dangerous,
Violent, or Hateful Content
GV-1.3-005
Maintain an updated hierarchy of identified and expected GAI risks
connected to
contexts of GAI model advancement and use, potentially including
specialized risk
- >-
AI-generated content, for example by employing techniques like chaos
engineering and seeking stakeholder feedback.
Information Integrity
MS-1.1-008
Define use cases, contexts of use, capabilities, and negative impacts
where
structured human feedback exercises, e.g., GAI red-teaming, would be
most
beneficial for GAI risk measurement and management based on the context
of
use.
Harmful Bias and
Homogenization; CBRN
Information or Capabilities
MS-1.1-009
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.85
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.96
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.98
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 1
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.85
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.31999999999999995
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19599999999999995
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09999999999999998
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.85
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.96
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.98
name: Cosine Recall@5
- type: cosine_recall@10
value: 1
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9342942871848772
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9124166666666668
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9124166666666668
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.85
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.96
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.98
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 1
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.85
name: Dot Precision@1
- type: dot_precision@3
value: 0.31999999999999995
name: Dot Precision@3
- type: dot_precision@5
value: 0.19599999999999995
name: Dot Precision@5
- type: dot_precision@10
value: 0.09999999999999998
name: Dot Precision@10
- type: dot_recall@1
value: 0.85
name: Dot Recall@1
- type: dot_recall@3
value: 0.96
name: Dot Recall@3
- type: dot_recall@5
value: 0.98
name: Dot Recall@5
- type: dot_recall@10
value: 1
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.9342942871848772
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.9124166666666668
name: Dot Mrr@10
- type: dot_map@100
value: 0.9124166666666668
name: Dot Map@100
SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
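The Pooling module above uses the CLS token (pooling_mode_cls_token: True), and the final Normalize() step L2-normalizes the output. A minimal pure-Python sketch of that post-transformer step, using toy 4-dimensional token embeddings as stand-ins (the real model uses 768 dimensions):

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """CLS pooling: take the first token's vector, then L2-normalize it."""
    cls = token_embeddings[0]  # pooling_mode_cls_token=True -> first token
    norm = math.sqrt(sum(x * x for x in cls))
    return [x / norm for x in cls]

# Toy 3-token sequence with 4-dimensional embeddings (stand-in values).
tokens = [[3.0, 4.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
sentence_embedding = cls_pool_and_normalize(tokens)
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Mean, max, and weighted-mean pooling are all disabled in this model's config, so only the first token's representation contributes to the sentence embedding.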
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")
# Run inference
sentences = [
'How can structured human feedback exercises, such as GAI red-teaming, contribute to GAI risk measurement and management?',
'AI-generated content, for example by employing techniques like chaos \nengineering and seeking stakeholder feedback. \nInformation Integrity \nMS-1.1-008 \nDefine use cases, contexts of use, capabilities, and negative impacts where \nstructured human feedback exercises, e.g., GAI red-teaming, would be most \nbeneficial for GAI risk measurement and management based on the context of \nuse. \nHarmful Bias and \nHomogenization; CBRN \nInformation or Capabilities \nMS-1.1-009',
'15 \nGV-1.3-004 Obtain input from stakeholder communities to identify unacceptable use, in \naccordance with activities in the AI RMF Map function. \nCBRN Information or Capabilities; \nObscene, Degrading, and/or \nAbusive Content; Harmful Bias \nand Homogenization; Dangerous, \nViolent, or Hateful Content \nGV-1.3-005 \nMaintain an updated hierarchy of identified and expected GAI risks connected to \ncontexts of GAI model advancement and use, potentially including specialized risk',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
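Because the architecture ends with a Normalize() module, the embeddings are unit-length, so cosine similarity and plain dot product give identical scores (which is why the cosine_* and dot_* metrics below match exactly). A library-free sketch of that equivalence, with arbitrary stand-in vectors:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity: dot product divided by the vector norms.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

# After L2 normalization the denominators are 1, so cosine == dot.
a, b = normalize([1.0, 2.0, 3.0]), normalize([2.0, 1.0, 0.5])
print(abs(cosine(a, b) - dot(a, b)) < 1e-12)  # True
```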
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.85 |
cosine_accuracy@3 | 0.96 |
cosine_accuracy@5 | 0.98 |
cosine_accuracy@10 | 1.0 |
cosine_precision@1 | 0.85 |
cosine_precision@3 | 0.32 |
cosine_precision@5 | 0.196 |
cosine_precision@10 | 0.1 |
cosine_recall@1 | 0.85 |
cosine_recall@3 | 0.96 |
cosine_recall@5 | 0.98 |
cosine_recall@10 | 1.0 |
cosine_ndcg@10 | 0.9343 |
cosine_mrr@10 | 0.9124 |
cosine_map@100 | 0.9124 |
dot_accuracy@1 | 0.85 |
dot_accuracy@3 | 0.96 |
dot_accuracy@5 | 0.98 |
dot_accuracy@10 | 1.0 |
dot_precision@1 | 0.85 |
dot_precision@3 | 0.32 |
dot_precision@5 | 0.196 |
dot_precision@10 | 0.1 |
dot_recall@1 | 0.85 |
dot_recall@3 | 0.96 |
dot_recall@5 | 0.98 |
dot_recall@10 | 1.0 |
dot_ndcg@10 | 0.9343 |
dot_mrr@10 | 0.9124 |
dot_map@100 | 0.9124 |
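With a single relevant document per query, accuracy@k and recall@k coincide, and precision@k is just accuracy@k divided by k (e.g. 1.0 / 10 = 0.1 at k=10, as in the table above). A pure-Python sketch of these metrics over hypothetical first-hit ranks (not the actual evaluation data):

```python
# ranks[i] = 1-based position of the single relevant document for query i
# (hypothetical ranks for illustration only)
ranks = [1, 1, 2, 3, 1, 1, 4, 1, 1, 2]

def accuracy_at_k(ranks, k):
    # Fraction of queries whose relevant document appears in the top k.
    return sum(r <= k for r in ranks) / len(ranks)

def precision_at_k(ranks, k):
    # One relevant doc per query -> at most one hit among the top k.
    return accuracy_at_k(ranks, k) / k

def mrr_at_k(ranks, k):
    # Mean reciprocal rank, counting only hits within the top k.
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)

print(accuracy_at_k(ranks, 3))    # 0.9
print(precision_at_k(ranks, 10))  # 0.1 when every query is answered in the top 10
print(mrr_at_k(ranks, 10))
```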
Training Details
Training Dataset
Unnamed Dataset
- Size: 600 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 600 samples:
  | sentence_0 | sentence_1 |
 ---|---|---|
 type | string | string |
 details | min: 11 tokens, mean: 21.05 tokens, max: 34 tokens | min: 14 tokens, mean: 91.74 tokens, max: 335 tokens |
- Samples:
sentence_0 | sentence_1 |
---|---|
What is the title of the publication related to Artificial Intelligence Risk Management by NIST? | NIST Trustworthy and Responsible AI NIST AI 600-1 Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1 |
Where can the NIST AI 600-1 publication be accessed for free? | NIST Trustworthy and Responsible AI NIST AI 600-1 Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1 |
What is the title of the publication released by NIST in July 2024 regarding AI risk management? | NIST Trustworthy and Responsible AI NIST AI 600-1 Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1 July 2024 U.S. Department of Commerce Gina M. Raimondo, Secretary National Institute of Standards and Technology Laurie E. Locascio, NIST Director and Under Secretary of Commerce for Standards and Technology |
- Loss:
MatryoshkaLoss
with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
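MatryoshkaLoss applies the inner MultipleNegativesRankingLoss at each of the listed dimensionalities, so the trained embeddings stay usable when truncated to a shorter prefix. A minimal sketch of the usual truncate-and-renormalize step at inference time (the 6-dimensional vector is a stand-in for a real 768-dimensional embedding):

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then re-normalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.5, 0.5, 0.5, 0.5, 0.1, 0.1]  # stand-in for a 768-dim embedding
for dim in (4, 2):  # mirrors shrinking along matryoshka_dims [768, 512, ...]
    small = truncate_embedding(full, dim)
    print(dim, len(small), round(sum(x * x for x in small), 6))
```

Re-normalizing after truncation keeps cosine similarity and dot product equivalent for the shortened vectors, matching how the full-size embeddings behave.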
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | cosine_map@100 |
---|---|---|
1.0 | 38 | 0.9033 |
1.3158 | 50 | 0.9067 |
2.0 | 76 | 0.9124 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}