metadata
base_model: Snowflake/snowflake-arctic-embed-m
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@3
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@3
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@3
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@10
  - dot_mrr@10
  - dot_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:600
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      How can organizations tailor their measurement of GAI risks based on
      specific characteristics?
    sentences:
      - >-
        3 

        the abuse, misuse, and unsafe repurposing by humans (adversarial or
        not), and others result 

        from interactions between a human and an AI system.  

         

        Time scale: GAI risks may materialize abruptly or across extended
        periods. Examples include 

        immediate (and/or prolonged) emotional harm and potential risks to
        physical safety due to the 

        distribution of harmful deepfake images, or the long-term effect of
        disinformation on societal 

        trust in public institutions.
      - >-
        12 

        CSAM. Even when trained on “clean” data, increasingly capable GAI models
        can synthesize or produce 

        synthetic NCII and CSAM. Websites, mobile apps, and custom-built models
        that generate synthetic NCII 

        have moved from niche internet forums to mainstream, automated, and
        scaled online businesses.  

        Trustworthy AI Characteristics: Fair with Harmful Bias Managed, Safe,
        Privacy Enhanced 

        2.12. 

        Value Chain and Component Integration
      - >-
        case context. 

        Organizations may choose to tailor how they measure GAI risks based on
        these characteristics. They may 

        additionally wish to allocate risk management resources relative to the
        severity and likelihood of 

        negative impacts, including where and how these risks manifest, and
        their direct and material impacts 

        harms in the context of GAI use. Mitigations for model or system level
        risks may differ from mitigations 

        for use-case or ecosystem level risks.
  - source_sentence: >-
      What methods are suggested for measuring the reliability of content
      authentication techniques in the context of content provenance?
    sentences:
      - >-
        updates. 

        Information Integrity; Data Privacy 

        MG-3.2-003 

        Document sources and types of training data and their origins, potential
        biases 

        present in the data related to the GAI application and its content
        provenance, 

        architecture, training process of the pre-trained model including
        information on 

        hyperparameters, training duration, and any fine-tuning or
        retrieval-augmented 

        generation processes applied. 

        Information Integrity; Harmful Bias 

        and Homogenization; Intellectual 

        Property
      - >-
        Security 

        MS-2.7-005 

        Measure reliability of content authentication methods, such as
        watermarking, 

        cryptographic signatures, digital fingerprints, as well as access
        controls, 

        conformity assessment, and model integrity verification, which can help
        support 

        the effective implementation of content provenance techniques. Evaluate
        the 

        rate of false positives and false negatives in content provenance, as
        well as true 

        positives and true negatives for verification. 

        Information Integrity 

        MS-2.7-006
      - >-
        GV-1.6-003 

        In addition to general model, governance, and risk information, consider
        the 

        following items in GAI system inventory entries: Data provenance
        information 

        (e.g., source, signatures, versioning, watermarks); Known issues
        reported from 

        internal bug tracking or external information sharing resources (e.g.,
        AI incident 

        database, AVID, CVE, NVD, or OECD AI incident monitor); Human oversight
        roles 

        and responsibilities; Special rights and considerations for intellectual
        property,
  - source_sentence: >-
      What are the suggested actions an organization can take to manage GAI
      risks?
    sentences:
      - >-
        Information Integrity; Dangerous, 

        Violent, or Hateful Content; CBRN 

        Information or Capabilities 

        GV-1.3-007 Devise a plan to halt development or deployment of a GAI
        system that poses 

        unacceptable negative risk. 

        CBRN Information and Capability; 

        Information Security; Information 

        Integrity 

        AI Actor Tasks: Governance and Oversight 
         
        GOVERN 1.4: The risk management process and its outcomes are established
        through transparent policies, procedures, and other
      - >-
        match the statistical properties of real-world data without disclosing
        personally 

        identifiable information or contributing to homogenization. 

        Data Privacy; Intellectual Property; 

        Information Integrity; 

        Confabulation; Harmful Bias and 

        Homogenization 

        AI Actor Tasks: AI Deployment, AI Impact Assessment, Governance and
        Oversight, Operation and Monitoring 
         
        MANAGE 2.3: Procedures are followed to respond to and recover from a
        previously unknown risk when it is identified. 

        Action ID
      - >-


        Suggested Action: Steps an organization or AI actor can take to manage
        GAI risks.  

         

        GAI Risks: Tags linking suggested actions with relevant GAI risks.  

         

        AI Actor Tasks: Pertinent AI Actor Tasks for each subcategory. Not every
        AI Actor Task listed will 

        apply to every suggested action in the subcategory (i.e., some apply to
        AI development and 

        others apply to AI deployment).  

        The tables below begin with the AI RMF subcategory, shaded in blue,
        followed by suggested actions.
  - source_sentence: >-
      How can harmful bias and homogenization be addressed in the context of
      human-AI configuration?
    sentences:
      - >-
        on GAI, apply general fairness metrics (e.g., demographic parity,
        equalized odds, 

        equal opportunity, statistical hypothesis tests), to the pipeline or
        business 

        outcome where appropriate; Custom, context-specific metrics developed in 

        collaboration with domain experts and affected communities; Measurements
        of 

        the prevalence of denigration in generated content in deployment (e.g.,
        sub-

        sampling a fraction of traffic and manually annotating denigrating
        content). 

        Harmful Bias and Homogenization;
      - >-
        MP-5.1-001 Apply TEVV practices for content provenance (e.g., probing a
        system's synthetic 

        data generation capabilities for potential misuse or vulnerabilities. 

        Information Integrity; Information 

        Security 

        MP-5.1-002 

        Identify potential content provenance harms of GAI, such as
        misinformation or 

        disinformation, deepfakes, including NCII, or tampered content.
        Enumerate and 

        rank risks based on their likelihood and potential impact, and determine
        how well
      - >-
        MS-1.3-002 

        Engage in internal and external evaluations, GAI red-teaming, impact 

        assessments, or other structured human feedback exercises in
        consultation 

        with representative AI Actors with expertise and familiarity in the
        context of 

        use, and/or who are representative of the populations associated with
        the 

        context of use. 

        Human-AI Configuration; Harmful 

        Bias and Homogenization; CBRN 

        Information or Capabilities 

        MS-1.3-003
  - source_sentence: >-
      How can structured human feedback exercises, such as GAI red-teaming,
      contribute to GAI risk measurement and management?
    sentences:
      - >-
        rank risks based on their likelihood and potential impact, and determine
        how well 

        provenance solutions address specific risks and/or harms. 

        Information Integrity; Dangerous, 

        Violent, or Hateful Content; 

        Obscene, Degrading, and/or 

        Abusive Content 

        MP-5.1-003 

        Consider disclosing use of GAI to end users in relevant contexts, while
        considering 

        the objective of disclosure, the context of use, the likelihood and
        magnitude of the
      - >-
        15 

        GV-1.3-004 Obtain input from stakeholder communities to identify
        unacceptable use, in 

        accordance with activities in the AI RMF Map function. 

        CBRN Information or Capabilities; 

        Obscene, Degrading, and/or 

        Abusive Content; Harmful Bias 

        and Homogenization; Dangerous, 

        Violent, or Hateful Content 

        GV-1.3-005 

        Maintain an updated hierarchy of identified and expected GAI risks
        connected to 

        contexts of GAI model advancement and use, potentially including
        specialized risk
      - >-
        AI-generated content, for example by employing techniques like chaos 

        engineering and seeking stakeholder feedback. 

        Information Integrity 

        MS-1.1-008 

        Define use cases, contexts of use, capabilities, and negative impacts
        where 

        structured human feedback exercises, e.g., GAI red-teaming, would be
        most 

        beneficial for GAI risk measurement and management based on the context
        of 

        use. 

        Harmful Bias and 

        Homogenization; CBRN 

        Information or Capabilities 

        MS-1.1-009
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.85
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.96
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.98
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.85
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31999999999999995
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19599999999999995
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.85
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.96
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.98
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9342942871848772
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9124166666666668
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9124166666666668
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.85
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.96
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.98
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 1
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.85
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.31999999999999995
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.19599999999999995
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09999999999999998
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.85
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.96
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.98
            name: Dot Recall@5
          - type: dot_recall@10
            value: 1
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.9342942871848772
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.9124166666666668
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.9124166666666668
            name: Dot Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-m

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-m
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
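
Because the final Normalize() module L2-normalizes every embedding, dot-product and cosine similarity scores coincide for this model, which is why the dot_* and cosine_* metrics reported below agree to every decimal place. A quick sanity check (a minimal sketch, using the model ID from the usage section below):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")
emb = model.encode(["How can organizations manage GAI risks?"])
print(np.linalg.norm(emb, axis=1))  # ~[1.0]: embeddings come out unit-length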

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")
# Run inference
sentences = [
    'How can structured human feedback exercises, such as GAI red-teaming, contribute to GAI risk measurement and management?',
    'AI-generated content, for example by employing techniques like chaos \nengineering and seeking stakeholder feedback. \nInformation Integrity \nMS-1.1-008 \nDefine use cases, contexts of use, capabilities, and negative impacts where \nstructured human feedback exercises, e.g., GAI red-teaming, would be most \nbeneficial for GAI risk measurement and management based on the context of \nuse. \nHarmful Bias and \nHomogenization; CBRN \nInformation or Capabilities \nMS-1.1-009',
    '15 \nGV-1.3-004 Obtain input from stakeholder communities to identify unacceptable use, in \naccordance with activities in the AI RMF Map function. \nCBRN Information or Capabilities; \nObscene, Degrading, and/or \nAbusive Content; Harmful Bias \nand Homogenization; Dangerous, \nViolent, or Hateful Content \nGV-1.3-005 \nMaintain an updated hierarchy of identified and expected GAI risks connected to \ncontexts of GAI model advancement and use, potentially including specialized risk',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
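
Since the model was tuned on question-passage pairs, a common pattern is semantic search: embed a query against a set of candidate passages and rank by similarity. A minimal sketch (the query and passages below are illustrative placeholders, not part of the evaluation set):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")

query = "How can organizations manage GAI risks?"
passages = [
    "Suggested Action: Steps an organization or AI actor can take to manage GAI risks.",
    "This publication is available free of charge from: https://doi.org/10.6028/NIST.AI.600-1",
]

# Rank passages against the query by cosine similarity
query_emb = model.encode([query])
passage_embs = model.encode(passages)
scores = model.similarity(query_emb, passage_embs)  # shape [1, 2]
print(passages[scores.argmax().item()])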

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.85
cosine_accuracy@3 0.96
cosine_accuracy@5 0.98
cosine_accuracy@10 1.0
cosine_precision@1 0.85
cosine_precision@3 0.32
cosine_precision@5 0.196
cosine_precision@10 0.1
cosine_recall@1 0.85
cosine_recall@3 0.96
cosine_recall@5 0.98
cosine_recall@10 1.0
cosine_ndcg@10 0.9343
cosine_mrr@10 0.9124
cosine_map@100 0.9124
dot_accuracy@1 0.85
dot_accuracy@3 0.96
dot_accuracy@5 0.98
dot_accuracy@10 1.0
dot_precision@1 0.85
dot_precision@3 0.32
dot_precision@5 0.196
dot_precision@10 0.1
dot_recall@1 0.85
dot_recall@3 0.96
dot_recall@5 0.98
dot_recall@10 1.0
dot_ndcg@10 0.9343
dot_mrr@10 0.9124
dot_map@100 0.9124
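
These are the standard outputs of the library's InformationRetrievalEvaluator. The evaluation set behind the numbers above is not published with this card, but the same metric names can be reproduced on your own data along these lines (all IDs and texts below are placeholders):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Cheselle/finetuned-arctic-sentence")

# Placeholder data: queries and corpus map IDs to text; relevant_docs maps
# each query ID to the set of corpus IDs that count as correct answers.
queries = {"q1": "What are the suggested actions an organization can take to manage GAI risks?"}
corpus = {
    "d1": "Suggested Action: Steps an organization or AI actor can take to manage GAI risks.",
    "d2": "AI Actor Tasks: Pertinent AI Actor Tasks for each subcategory.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs)
print(evaluator(model))  # dict of accuracy@k, precision@k, recall@k, NDCG, MRR, MAP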

Training Details

Training Dataset

Unnamed Dataset

  • Size: 600 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 600 samples:
    sentence_0: string; min: 11 tokens, mean: 21.05 tokens, max: 34 tokens
    sentence_1: string; min: 14 tokens, mean: 91.74 tokens, max: 335 tokens
  • Samples:
    sentence_0: What is the title of the publication related to Artificial Intelligence Risk Management by NIST?
    sentence_1:
      NIST Trustworthy and Responsible AI
      NIST AI 600-1
      Artificial Intelligence Risk Management
      Framework: Generative Artificial
      Intelligence Profile
      This publication is available free of charge from:
      https://doi.org/10.6028/NIST.AI.600-1

    sentence_0: Where can the NIST AI 600-1 publication be accessed for free?
    sentence_1:
      NIST Trustworthy and Responsible AI
      NIST AI 600-1
      Artificial Intelligence Risk Management
      Framework: Generative Artificial
      Intelligence Profile
      This publication is available free of charge from:
      https://doi.org/10.6028/NIST.AI.600-1

    sentence_0: What is the title of the publication released by NIST in July 2024 regarding AI risk management?
    sentence_1:
      NIST Trustworthy and Responsible AI
      NIST AI 600-1
      Artificial Intelligence Risk Management
      Framework: Generative Artificial
      Intelligence Profile
      This publication is available free of charge from:
      https://doi.org/10.6028/NIST.AI.600-1
      July 2024
      U.S. Department of Commerce
      Gina M. Raimondo, Secretary
      National Institute of Standards and Technology
      Laurie E. Locascio, NIST Director and Under Secretary of Commerce for Standards and Technology
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
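
In code, this configuration corresponds to wrapping MultipleNegativesRankingLoss in MatryoshkaLoss, so the contrastive objective is applied at 768, 512, 256, 128, and 64 dimensions. A sketch, not the exact training script:

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")
inner = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner, matryoshka_dims=[768, 512, 256, 128, 64])

# One practical payoff: embeddings can be truncated at inference time with
# modest quality loss, e.g. by loading the finetuned model with truncate_dim=256.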
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin
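
Put together, a training setup matching these settings with SentenceTransformerTrainer would look roughly like the sketch below (the dataset is a one-row stand-in for the 600 pairs described above, and output_dir is hypothetical):

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")

# Stand-in for the 600 (sentence_0, sentence_1) training pairs
train_dataset = Dataset.from_dict({
    "sentence_0": ["What is the title of the publication related to AI risk management by NIST?"],
    "sentence_1": ["NIST Trustworthy and Responsible AI, NIST AI 600-1"],
})

loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-arctic-sentence",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    # the run behind this card also set eval_strategy="steps", which
    # additionally requires an eval_dataset or evaluator to be passed
)

trainer = SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss)
trainer.train()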

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step cosine_map@100
1.0 38 0.9033
1.3158 50 0.9067
2.0 76 0.9124

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}