---
base_model: Snowflake/snowflake-arctic-embed-m
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@3
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@3
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@3
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@10
  - dot_mrr@10
  - dot_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:600
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      How can organizations tailor their measurement of GAI risks based on
      specific characteristics?
    sentences:
      - >-
        3 

        the abuse, misuse, and unsafe repurposing by humans (adversarial or
        not), and others result 

        from interactions between a human and an AI system.  

         

        Time scale: GAI risks may materialize abruptly or across extended
        periods. Examples include 

        immediate (and/or prolonged) emotional harm and potential risks to
        physical safety due to the 

        distribution of harmful deepfake images, or the long-term effect of
        disinformation on societal 

        trust in public institutions.
      - >-
        12 

        CSAM. Even when trained on “clean” data, increasingly capable GAI models
        can synthesize or produce 

        synthetic NCII and CSAM. Websites, mobile apps, and custom-built models
        that generate synthetic NCII 

        have moved from niche internet forums to mainstream, automated, and
        scaled online businesses.  

        Trustworthy AI Characteristics: Fair with Harmful Bias Managed, Safe,
        Privacy Enhanced 

        2.12. 

        Value Chain and Component Integration
      - >-
        case context. 

        Organizations may choose to tailor how they measure GAI risks based on
        these characteristics. They may 

        additionally wish to allocate risk management resources relative to the
        severity and likelihood of 

        negative impacts, including where and how these risks manifest, and
        their direct and material impacts 

        harms in the context of GAI use. Mitigations for model or system level
        risks may differ from mitigations 

        for use-case or ecosystem level risks.
  - source_sentence: >-
      What methods are suggested for measuring the reliability of content
      authentication techniques in the context of content provenance?
    sentences:
      - >-
        updates. 

        Information Integrity; Data Privacy 

        MG-3.2-003 

        Document sources and types of training data and their origins, potential
        biases 

        present in the data related to the GAI application and its content
        provenance, 

        architecture, training process of the pre-trained model including
        information on 

        hyperparameters, training duration, and any fine-tuning or
        retrieval-augmented 

        generation processes applied. 

        Information Integrity; Harmful Bias 

        and Homogenization; Intellectual 

        Property
      - >-
        Security 

        MS-2.7-005 

        Measure reliability of content authentication methods, such as
        watermarking, 

        cryptographic signatures, digital fingerprints, as well as access
        controls, 

        conformity assessment, and model integrity verification, which can help
        support 

        the effective implementation of content provenance techniques. Evaluate
        the 

        rate of false positives and false negatives in content provenance, as
        well as true 

        positives and true negatives for verification. 

        Information Integrity 

        MS-2.7-006
      - >-
        GV-1.6-003 

        In addition to general model, governance, and risk information, consider
        the 

        following items in GAI system inventory entries: Data provenance
        information 

        (e.g., source, signatures, versioning, watermarks); Known issues
        reported from 

        internal bug tracking or external information sharing resources (e.g.,
        AI incident 

        database, AVID, CVE, NVD, or OECD AI incident monitor); Human oversight
        roles 

        and responsibilities; Special rights and considerations for intellectual
        property,
  - source_sentence: >-
      What are the suggested actions an organization can take to manage GAI
      risks?
    sentences:
      - >-
        Information Integrity; Dangerous, 

        Violent, or Hateful Content; CBRN 

        Information or Capabilities 

        GV-1.3-007 Devise a plan to halt development or deployment of a GAI
        system that poses 

        unacceptable negative risk. 

        CBRN Information and Capability; 

        Information Security; Information 

        Integrity 

        AI Actor Tasks: Governance and Oversight 
         
        GOVERN 1.4: The risk management process and its outcomes are established
        through transparent policies, procedures, and other
      - >-
        match the statistical properties of real-world data without disclosing
        personally 

        identifiable information or contributing to homogenization. 

        Data Privacy; Intellectual Property; 

        Information Integrity; 

        Confabulation; Harmful Bias and 

        Homogenization 

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Governance and
        Oversight, Operation and Monitoring 
         
        MANAGE 2.3: Procedures are followed to respond to and recover from a
        previously unknown risk when it is identified. 

        Action ID
      - >-


        Suggested Action: Steps an organization or AI actor can take to manage
        GAI risks.  

         

        GAI Risks: Tags linking suggested actions with relevant GAI risks.  

         

        AI Actor Tasks: Pertinent AI Actor Tasks for each subcategory. Not every
        AI Actor Task listed will 

        apply to every suggested action in the subcategory (i.e., some apply to
        AI development and 

        others apply to AI deployment).  

        The tables below begin with the AI RMF subcategory, shaded in blue,
        followed by suggested actions.
  - source_sentence: >-
      How can harmful bias and homogenization be addressed in the context of
      human-AI configuration?
    sentences:
      - >-
        on GAI, apply general fairness metrics (e.g., demographic parity,
        equalized odds, 

        equal opportunity, statistical hypothesis tests), to the pipeline or
        business 

        outcome where appropriate; Custom, context-specific metrics developed in 

        collaboration with domain experts and affected communities; Measurements
        of 

        the prevalence of denigration in generated content in deployment (e.g.,
        sub-

        sampling a fraction of traffic and manually annotating denigrating
        content). 

        Harmful Bias and Homogenization;
      - >-
        MP-5.1-001 Apply TEVV practices for content provenance (e.g., probing a
        system's synthetic 

        data generation capabilities for potential misuse or vulnerabilities. 

        Information Integrity; Information 

        Security 

        MP-5.1-002 

        Identify potential content provenance harms of GAI, such as
        misinformation or 

        disinformation, deepfakes, including NCII, or tampered content.
        Enumerate and 

        rank risks based on their likelihood and potential impact, and determine
        how well
      - >-
        MS-1.3-002 

        Engage in internal and external evaluations, GAI red-teaming, impact 

        assessments, or other structured human feedback exercises in
        consultation 

        with representative AI Actors with expertise and familiarity in the
        context of 

        use, and/or who are representative of the populations associated with
        the 

        context of use. 

        Human-AI Configuration; Harmful 

        Bias and Homogenization; CBRN 

        Information or Capabilities 

        MS-1.3-003
  - source_sentence: >-
      How can structured human feedback exercises, such as GAI red-teaming,
      contribute to GAI risk measurement and management?
    sentences:
      - >-
        rank risks based on their likelihood and potential impact, and determine
        how well 

        provenance solutions address specific risks and/or harms. 

        Information Integrity; Dangerous, 

        Violent, or Hateful Content; 

        Obscene, Degrading, and/or 

        Abusive Content 

        MP-5.1-003 

        Consider disclosing use of GAI to end users in relevant contexts, while
        considering 

        the objective of disclosure, the context of use, the likelihood and
        magnitude of the
      - >-
        15 

        GV-1.3-004 Obtain input from stakeholder communities to identify
        unacceptable use, in 

        accordance with activities in the AI RMF Map function. 

        CBRN Information or Capabilities; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias 

        and Homogenization; Dangerous, 

        Violent, or Hateful Content 

        GV-1.3-005 

        Maintain an updated hierarchy of identified and expected GAI risks
        connected to 

        contexts of GAI model advancement and use, potentially including
        specialized risk
      - >-
        AI-generated content, for example by employing techniques like chaos 

        engineering and seeking stakeholder feedback. 

        Information Integrity 

        MS-1.1-008 

        Define use cases, contexts of use, capabilities, and negative impacts
        where 

        structured human feedback exercises, e.g., GAI red-teaming, would be
        most 

        beneficial for GAI risk measurement and management based on the context
        of 

        use. 

        Harmful Bias and 

        Homogenization; CBRN 

        Information or Capabilities 

        MS-1.1-009
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.85
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.96
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.98
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.85
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31999999999999995
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19599999999999995
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.85
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.96
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.98
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9342942871848772
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9124166666666668
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9124166666666668
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.85
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.96
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.98
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 1
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.85
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.31999999999999995
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.19599999999999995
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09999999999999998
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.85
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.96
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.98
            name: Dot Recall@5
          - type: dot_recall@10
            value: 1
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.9342942871848772
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.9124166666666668
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.9124166666666668
            name: Dot Map@100
---

# SentenceTransformer based on Snowflake/snowflake-arctic-embed-m

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-m
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
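
These properties can be read directly off the loaded model; a minimal sketch:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")
print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 768
print(model.similarity_fn_name)                  # cosine
```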

### Model Sources

  • Documentation: [Sentence Transformers Documentation](https://sbert.net)
  • Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")
# Run inference
sentences = [
    'How can structured human feedback exercises, such as GAI red-teaming, contribute to GAI risk measurement and management?',
    'AI-generated content, for example by employing techniques like chaos \nengineering and seeking stakeholder feedback. \nInformation Integrity \nMS-1.1-008 \nDefine use cases, contexts of use, capabilities, and negative impacts where \nstructured human feedback exercises, e.g., GAI red-teaming, would be most \nbeneficial for GAI risk measurement and management based on the context of \nuse. \nHarmful Bias and \nHomogenization; CBRN \nInformation or Capabilities \nMS-1.1-009',
    '15 \nGV-1.3-004 Obtain input from stakeholder communities to identify unacceptable use, in \naccordance with activities in the AI RMF Map function. \nCBRN Information or Capabilities; \nObscene, Degrading, and/or \nAbusive Content; Harmful Bias \nand Homogenization; Dangerous, \nViolent, or Hateful Content \nGV-1.3-005 \nMaintain an updated hierarchy of identified and expected GAI risks connected to \ncontexts of GAI model advancement and use, potentially including specialized risk',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
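
Because this model was trained with MatryoshkaLoss over the dimensions [768, 512, 256, 128, 64], embeddings can also be truncated to a smaller dimensionality for cheaper storage and search, at a modest quality trade-off. A minimal sketch using the `truncate_dim` argument available in recent Sentence Transformers versions:

```python
from sentence_transformers import SentenceTransformer

# Load the model so that encode() returns 256-dimensional embeddings,
# matching one of the Matryoshka dimensions used during training.
model = SentenceTransformer("Cheselle/finetuned-arctic-sentence", truncate_dim=256)

embeddings = model.encode([
    "How can organizations tailor their measurement of GAI risks?",
    "Organizations may choose to tailor how they measure GAI risks.",
])
print(embeddings.shape)
# (2, 256)

# Cosine similarity still works on the truncated embeddings
print(model.similarity(embeddings, embeddings).shape)
# torch.Size([2, 2])
```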

## Evaluation

### Metrics

#### Information Retrieval

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.85   |
| cosine_accuracy@3   | 0.96   |
| cosine_accuracy@5   | 0.98   |
| cosine_accuracy@10  | 1.0    |
| cosine_precision@1  | 0.85   |
| cosine_precision@3  | 0.32   |
| cosine_precision@5  | 0.196  |
| cosine_precision@10 | 0.1    |
| cosine_recall@1     | 0.85   |
| cosine_recall@3     | 0.96   |
| cosine_recall@5     | 0.98   |
| cosine_recall@10    | 1.0    |
| cosine_ndcg@10      | 0.9343 |
| cosine_mrr@10       | 0.9124 |
| cosine_map@100      | 0.9124 |
| dot_accuracy@1      | 0.85   |
| dot_accuracy@3      | 0.96   |
| dot_accuracy@5      | 0.98   |
| dot_accuracy@10     | 1.0    |
| dot_precision@1     | 0.85   |
| dot_precision@3     | 0.32   |
| dot_precision@5     | 0.196  |
| dot_precision@10    | 0.1    |
| dot_recall@1        | 0.85   |
| dot_recall@3        | 0.96   |
| dot_recall@5        | 0.98   |
| dot_recall@10       | 1.0    |
| dot_ndcg@10         | 0.9343 |
| dot_mrr@10          | 0.9124 |
| dot_map@100         | 0.9124 |
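
These metric names match the output of Sentence Transformers' `InformationRetrievalEvaluator`. The exact query/corpus split used for the numbers above is not included in this card, but the following sketch shows how comparable metrics can be computed on your own data (the queries, corpus, and relevance judgments below are hypothetical placeholders):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")

# Hypothetical toy data: query id -> text, document id -> text,
# and query id -> set of relevant document ids.
queries = {"q1": "What are the suggested actions an organization can take to manage GAI risks?"}
corpus = {
    "d1": "Suggested Action: Steps an organization or AI actor can take to manage GAI risks.",
    "d2": "This publication is available free of charge from the NIST website.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="example")
results = evaluator(model)
print(results)  # cosine_accuracy@k, cosine_precision@k, cosine_ndcg@10, cosine_map@100, ...
```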

## Training Details

### Training Dataset

#### Unnamed Dataset

  • Size: 600 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 600 samples:

    |         | sentence_0                                         | sentence_1                                          |
    |:--------|:---------------------------------------------------|:----------------------------------------------------|
    | type    | string                                             | string                                              |
    | details | min: 11 tokens, mean: 21.05 tokens, max: 34 tokens | min: 14 tokens, mean: 91.74 tokens, max: 335 tokens |
  • Samples:

    | sentence_0 | sentence_1 |
    |:-----------|:-----------|
    | <code>What is the title of the publication related to Artificial Intelligence Risk Management by NIST?</code> | <code>NIST Trustworthy and Responsible AI<br>NIST AI 600-1<br>Artificial Intelligence Risk Management<br>Framework: Generative Artificial<br>Intelligence Profile<br><br>This publication is available free of charge from:<br>https://doi.org/10.6028/NIST.AI.600-1</code> |
    | <code>Where can the NIST AI 600-1 publication be accessed for free?</code> | <code>NIST Trustworthy and Responsible AI<br>NIST AI 600-1<br>Artificial Intelligence Risk Management<br>Framework: Generative Artificial<br>Intelligence Profile<br><br>This publication is available free of charge from:<br>https://doi.org/10.6028/NIST.AI.600-1</code> |
    | <code>What is the title of the publication released by NIST in July 2024 regarding AI risk management?</code> | <code>NIST Trustworthy and Responsible AI<br>NIST AI 600-1<br>Artificial Intelligence Risk Management<br>Framework: Generative Artificial<br>Intelligence Profile<br><br>This publication is available free of charge from:<br>https://doi.org/10.6028/NIST.AI.600-1<br><br>July 2024<br><br>U.S. Department of Commerce<br>Gina M. Raimondo, Secretary<br>National Institute of Standards and Technology<br>Laurie E. Locascio, NIST Director and Under Secretary of Commerce for Standards and Technology</code> |
  • Loss: MatryoshkaLoss with these parameters:

    ```json
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    ```
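
In code, this loss configuration corresponds roughly to the following sketch (the exact training script is not part of this card):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# MultipleNegativesRankingLoss treats each (sentence_0, sentence_1) pair as a
# positive and every other sentence_1 in the batch as an in-batch negative.
inner_loss = MultipleNegativesRankingLoss(model)

# MatryoshkaLoss applies the inner loss at each truncated embedding size so the
# leading dimensions of the embedding remain useful on their own.
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```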

### Training Hyperparameters

#### Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin
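
A minimal training setup reflecting these non-default values might look like the following sketch (assuming the 600 question/passage pairs are available as a `datasets.Dataset` with `sentence_0` and `sentence_1` columns; the actual training script is not included in this card):

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# Placeholder single pair; the real dataset has 600 such rows.
train_dataset = Dataset.from_dict({
    "sentence_0": ["Where can the NIST AI 600-1 publication be accessed for free?"],
    "sentence_1": ["This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1"],
})

loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-arctic-sentence",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    # The original run also used eval_strategy="steps" and a round_robin
    # multi-dataset batch sampler; eval_strategy requires an eval dataset
    # or evaluator to be passed to the trainer.
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```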

#### All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

### Training Logs

| Epoch  | Step | cosine_map@100 |
|:-------|:-----|:---------------|
| 1.0    | 38   | 0.9033         |
| 1.3158 | 50   | 0.9067         |
| 2.0    | 76   | 0.9124         |

### Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```