tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:600
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
  - source_sentence: >-
      What is the date of the Gallup report regarding employer care for employee
      wellbeing?
    sentences:
      - sense of purpose Defining work wellbeing
      - >-
        What constitutes meaningful conversations between managers and
        employees? Gallup found they include recognition and discussion about
        collaboration, goals, and priorities, and the employee’s strengths.
        These conversations prevent employees from feeling disconnected from the
        organization because managers stay in touch with what each employee
        contributes and can then articulate how that work affects the larger
        organization. The conversations ensure that expectations can be adjusted
        as the business needs change and in what ways those changing
        expectations interact with coworker roles.
      - >-
        March 18, 2022 Gallup
        https://www.gallup.com/workplace/390776/percent-feel-employer-cares-wellbeing-plummets.aspx
        Gallup World Headquarters, 901 F Street, Washington, D.C., 20001, U.S.A
        +1 202.715.3030
  - source_sentence: What services does Evernorth Health Services provide?
    sentences:
      - >-
        Focusing on employee wellbeing and acknowledging the whole person. Since
        work and life are blended for many, consider the demands of life inside
        and out of the workplace. Consider career, social, financial, physical,
        and community wellbeing impacts and resources.


        Tailoring communication to reach their team where they are. Transparent
        and creative omnichannel communication to employees and customers is
        more likely to reach and resonate with a wide variety of people in many
        different work-life situations.
      - |-
        Investor Relations

        Careers

        Bottom FB - column 3

        COVID Resource Center

        Health and Wellness

        Member Resources

        Bottom FB - column 4

        The Cigna Group

        Cigna Healthcare

        Evernorth Health Services

        International
      - >-
        1. The evolution of the disease burden. While McKinsey & Company employs
        many medical experts and scientists, we are not a disease forecasting
        firm. We rely on disease-burden forecasts globally and for the United
        States provided by IHME, which maintains the most comprehensive database
        of the global disease burden and for the United States as whole.
        Forecasts of the global and US disease burden are inherently uncertain
        and health shocks such as the COVID-19 pandemic may affect forecasts.
  - source_sentence: >-
      How does the theme of "Wellbeing" relate to employees' perceptions of
      their work-life balance?
    sentences:
      - >-
        engagement as an extremely important priority—are effectively using
        metrics and shared some best practices for tying engagement to business
        performance. 

        Copyright © 2013 Harvard Business School Publishing. All rights
        reserved.The Impact of  

        Employee Engagement on Performance

        highlights

        71%

        of respondents rank  

        employee engagement as  

        very important to achieving  

        overall organizational success.

        72%

        of respondents rank recognition  

        given for high performers as  

        having a significant impact on  

        employee engagement.

        24% 

        of respondents say employees  

        in their organization are  

        highly engaged.
      - >-
        figure 10

        Senior managers were far more likely to be optimistic than their
        middle-management colleagues were in their perceptions of engagement
        levels. Since middle managers are tasked with handling more day-to-day
        employee issues, their assessment is likely the more accurate. This
        implies that in many firms senior man-agers may need to take off the
        rose-colored glasses and take a closer look at the barriers to
        engagement that may be present, and then find more effective ways of
        overcoming them.
      - >-
        Gallup analysts identified individuals in its database who have declined
        in clarity of expectations from 2020 to 2023. Among this group, across
        job types and work locations, the largest areas of decline fit into five
        themes:


        Feedback and Performance Focus


        Received meaningful feedback in the last week


        Performance managed to motivate outstanding performance


        Manager keeps me informed on what is going on


        Pride in quality of products/services


        Freedom to make decisions needed to do my job well


        Goals/Priorities


        Manager includes me in goal setting


        Feel prepared to do my job


        Wellbeing


        Organization cares about my wellbeing


        Able to maintain a healthy balance between work and personal life


        Team


        Feel like part of the team
  - source_sentence: >-
      What impact does having one meaningful conversation per week with each
      team member have on high-performance relationships according to Gallup?
    sentences:
      - >-
        Fewer than one in four U.S. employees feel strongly that their
        organization cares about their wellbeing -- the lowest percentage in
        nearly a decade.


        This finding has significant implications, as work and life have never
        been more blended and employee wellbeing matters more than ever-- to
        employees and the resiliency of organizations. The discovery is based on
        a random sample of 15,001 full and part-time U.S. employees who were
        surveyed in February 2022.
      - >-
        has developed an open-access dashboard for more than 80 measures at the
        county, state, and national levels. This data has highlighted, for
        example, the disproportionate impact of COVID-19 on communities of color
        as well as physical health and behavioral health vulnerability to
        COVID-19.
      - >-
        Gallup finds that a manager having one meaningful conversation per week
        with each team member develops high-performance relationships more than
        any other leadership activity. Gallup analytics have found managers can
        be quickly upskilled to have these ongoing strengths-based conversations
        that bring purpose and clear expectations to work, which is now
        deteriorating in U.S. organizations.
  - source_sentence: >-
      How does Alexis Krivkovich's perspective as a mother influence her
      optimism about the future of women in the workplace?
    sentences:
      - >-
        Author(s)


        Jim Harter, Ph.D., is Chief Scientist, Workplace for Gallup and
        bestselling author of Culture Shock, Wellbeing at Work, It's the
        Manager, 12: The Elements of Great Managing and Wellbeing: The Five
        Essential Elements. His research is also featured in the groundbreaking
        New York Times bestseller, First, Break All the Rules. Dr. Harter has
        led more than 1,000 studies of workplace effectiveness, including the
        largest ongoing meta-analysis of human potential and business-unit
        performance. His work has also appeared in many publications, including
        Harvard Business Review, The New York Times and The Wall Street Journal,
        and in many prominent academic journals.


        Sangeeta Agrawal contributed analysis to this article.


        Survey Methods
      - |-
        Learn more about the 
        Work Happiness Score at: 
        go.indeed.com/happiness
      - >-
        Lucia Rahilly: Sometimes, I feel that we’ve been talking about these
        issues since I was in college, and that can feel discouraging. What are
        you most optimistic about going into 2022, coming out of this Women in
        the Workplace report?


        Alexis Krivkovich: I’m most optimistic about the fact that we’re having
        an honest conversation, and now with a real fact base. We’re not talking
        about these things as perception but as real and measured experiences
        that companies can’t hide from—and they don’t want to.


        As a mother of three young daughters, it gives me real hope because I’ve
        been thinking about this question for 20 years. But in 20 years, when
        they’re fully in the workplace, maybe we’ll have a totally different
        paradigm.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
  - dot_accuracy@1
  - dot_accuracy@3
  - dot_accuracy@5
  - dot_accuracy@10
  - dot_precision@1
  - dot_precision@3
  - dot_precision@5
  - dot_precision@10
  - dot_recall@1
  - dot_recall@3
  - dot_recall@5
  - dot_recall@10
  - dot_ndcg@10
  - dot_mrr@10
  - dot_map@100
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.81
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.93
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.97
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.98
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.81
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.30999999999999994
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19399999999999995
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09799999999999998
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.81
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.93
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.97
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.98
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9036533710134148
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8780952380952383
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8798376623376624
            name: Cosine Map@100
          - type: dot_accuracy@1
            value: 0.81
            name: Dot Accuracy@1
          - type: dot_accuracy@3
            value: 0.93
            name: Dot Accuracy@3
          - type: dot_accuracy@5
            value: 0.97
            name: Dot Accuracy@5
          - type: dot_accuracy@10
            value: 0.98
            name: Dot Accuracy@10
          - type: dot_precision@1
            value: 0.81
            name: Dot Precision@1
          - type: dot_precision@3
            value: 0.30999999999999994
            name: Dot Precision@3
          - type: dot_precision@5
            value: 0.19399999999999995
            name: Dot Precision@5
          - type: dot_precision@10
            value: 0.09799999999999998
            name: Dot Precision@10
          - type: dot_recall@1
            value: 0.81
            name: Dot Recall@1
          - type: dot_recall@3
            value: 0.93
            name: Dot Recall@3
          - type: dot_recall@5
            value: 0.97
            name: Dot Recall@5
          - type: dot_recall@10
            value: 0.98
            name: Dot Recall@10
          - type: dot_ndcg@10
            value: 0.9036533710134148
            name: Dot Ndcg@10
          - type: dot_mrr@10
            value: 0.8780952380952383
            name: Dot Mrr@10
          - type: dot_map@100
            value: 0.8798376623376624
            name: Dot Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
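
The final Normalize() module scales every embedding to unit length, which would explain why the cosine_* and dot_* metrics reported in this card are identical. The sketch below illustrates the idea in plain Python; the helper names and the toy vectors are hypothetical and are not part of the sentence-transformers API.

```python
import math

def dot(a, b):
    # Plain dot product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    # Scale a vector to unit Euclidean length (the role Normalize() plays).
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    # Cosine similarity computed from scratch, for comparison.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy 3-dimensional "embeddings" (the real model outputs 1024 dimensions).
raw_a = [3.0, 4.0, 0.0]
raw_b = [1.0, 2.0, 2.0]

unit_a = l2_normalize(raw_a)
unit_b = l2_normalize(raw_b)

# After normalization the dot product equals the cosine similarity.
print(dot(unit_a, unit_b), cosine(raw_a, raw_b))
```

Because the two scores coincide on unit vectors, either similarity function should rank passages in the same order for this model.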

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("CoExperiences/snowflake-l-marketing-tuned")
# Run inference
sentences = [
    "How does Alexis Krivkovich's perspective as a mother influence her optimism about the future of women in the workplace?",
    'Lucia Rahilly: Sometimes, I feel that we’ve been talking about these issues since I was in college, and that can feel discouraging. What are you most optimistic about going into 2022, coming out of this Women in the Workplace report?\n\nAlexis Krivkovich: I’m most optimistic about the fact that we’re having an honest conversation, and now with a real fact base. We’re not talking about these things as perception but as real and measured experiences that companies can’t hide from—and they don’t want to.\n\nAs a mother of three young daughters, it gives me real hope because I’ve been thinking about this question for 20 years. But in 20 years, when they’re fully in the workplace, maybe we’ll have a totally different paradigm.',
    'Learn more about the \nWork Happiness Score at: \ngo.indeed.com/happiness',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.81
cosine_accuracy@3 0.93
cosine_accuracy@5 0.97
cosine_accuracy@10 0.98
cosine_precision@1 0.81
cosine_precision@3 0.31
cosine_precision@5 0.194
cosine_precision@10 0.098
cosine_recall@1 0.81
cosine_recall@3 0.93
cosine_recall@5 0.97
cosine_recall@10 0.98
cosine_ndcg@10 0.9037
cosine_mrr@10 0.8781
cosine_map@100 0.8798
dot_accuracy@1 0.81
dot_accuracy@3 0.93
dot_accuracy@5 0.97
dot_accuracy@10 0.98
dot_precision@1 0.81
dot_precision@3 0.31
dot_precision@5 0.194
dot_precision@10 0.098
dot_recall@1 0.81
dot_recall@3 0.93
dot_recall@5 0.97
dot_recall@10 0.98
dot_ndcg@10 0.9037
dot_mrr@10 0.8781
dot_map@100 0.8798
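
The precision values above follow from the accuracy values if each query has exactly one relevant passage: precision@k is then accuracy@k divided by k (0.93 / 3 = 0.31, 0.97 / 5 = 0.194, 0.98 / 10 = 0.098), and recall@k equals accuracy@k. The sketch below works through that arithmetic. The single-relevant-passage setup is an assumption inferred from the numbers, and the list of ranks is hypothetical data chosen only to reproduce the reported hit rates, so the MRR it yields is not the card's value.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy/precision/recall/MRR at cutoff k.

    ranks: 1-based rank of the single relevant passage for each query,
           or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return {
        "accuracy": hits / n,         # share of queries with a hit in the top k
        "precision": hits / (k * n),  # at most one of the k results is relevant
        "recall": hits / n,           # one relevant passage per query
        "mrr": sum(1.0 / r for r in ranks if r is not None and r <= k) / n,
    }

# Hypothetical rank distribution over 100 queries (an assumption for illustration).
ranks = [1] * 81 + [2] * 12 + [4] * 4 + [7] * 1 + [None] * 2

for k in (1, 3, 5, 10):
    m = retrieval_metrics(ranks, k)
    print(k, round(m["accuracy"], 3), round(m["precision"], 3), round(m["recall"], 3))
```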

Training Details

Training Dataset

Unnamed Dataset

  • Size: 600 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 600 samples:
    column      type    min       mean           max
    sentence_0  string  9 tokens  20.08 tokens   39 tokens
    sentence_1  string  5 tokens  110.85 tokens  187 tokens
  • Samples:
    sentence_0 sentence_1
    sentence_0: What significant change occurred in employees' perceptions of their employer's care for their wellbeing during the pandemic?
    sentence_1: Workplace

    Percent Who Feel Employer Cares About Their Wellbeing Plummets

    Share on LinkedIn

    Share on Twitter

    Share on Facebook

    Share via Email

    Print

    Share on LinkedIn

    Share on Twitter

    Share on Facebook

    Share via Email

    Print

    Workplace

    March 18, 2022

    Percent Who Feel Employer Cares About Their Wellbeing Plummets

    by Jim Harter

    Story Highlights

    Employees' perceptions of their organization caring about their wellbeing drops

    During the onset of the pandemic, employees felt employers had more care and concern

    Employees who feel their employer cares about their wellbeing are 69% less likely to actively search for a job
    sentence_0: How does feeling cared for by an employer impact employees' job search behavior?
    sentence_1: Workplace

    Percent Who Feel Employer Cares About Their Wellbeing Plummets

    Share on LinkedIn

    Share on Twitter

    Share on Facebook

    Share via Email

    Print

    Share on LinkedIn

    Share on Twitter

    Share on Facebook

    Share via Email

    Print

    Workplace

    March 18, 2022

    Percent Who Feel Employer Cares About Their Wellbeing Plummets

    by Jim Harter

    Story Highlights

    Employees' perceptions of their organization caring about their wellbeing drops

    During the onset of the pandemic, employees felt employers had more care and concern

    Employees who feel their employer cares about their wellbeing are 69% less likely to actively search for a job
    sentence_0: What percentage of U.S. employees feel strongly that their organization cares about their wellbeing?
    sentence_1: Fewer than one in four U.S. employees feel strongly that their organization cares about their wellbeing -- the lowest percentage in nearly a decade.

    This finding has significant implications, as work and life have never been more blended and employee wellbeing matters more than ever-- to employees and the resiliency of organizations. The discovery is based on a random sample of 15,001 full and part-time U.S. employees who were surveyed in February 2022.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
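
To make the parameters above concrete, here is a minimal pure-Python sketch of how a Matryoshka-style objective can combine them: the inner loss is computed on embeddings truncated to each listed dimension, and the results are summed with the listed weights. This is an illustration under assumptions, not the library's implementation. The function names are hypothetical, the inner loss is written as in-batch-negatives cross-entropy over scaled cosine similarities, and scale=20.0 is an assumed value that this card does not state.

```python
import math

def cos_sim(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    # In-batch negatives: positives[i] is the target for anchors[i];
    # every other positive in the batch acts as a negative.
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]  # cross-entropy with target index i
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    # Weighted sum of the inner loss over truncated embedding prefixes.
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_a = [a[:dim] for a in anchors]
        truncated_p = [p[:dim] for p in positives]
        total += weight * mnr_loss(truncated_a, truncated_p)
    return total

# Toy 4-dimensional batch; the card's setup uses dims 1024 down to 64.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
aligned = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
swapped = [aligned[1], aligned[0]]

print(matryoshka_loss(anchors, aligned, dims=[4, 2], weights=[1, 1]))
print(matryoshka_loss(anchors, swapped, dims=[4, 2], weights=[1, 1]))
```

Training against every prefix is what should let shorter embeddings (for example the first 256 dimensions) remain usable at inference time. In sentence-transformers, a shorter embedding is typically requested with the truncate_dim argument of SentenceTransformer.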
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 20
  • per_device_eval_batch_size: 20
  • num_train_epochs: 5
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 20
  • per_device_eval_batch_size: 20
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step cosine_map@100
1.0 30 0.8782
1.6667 50 0.8878
2.0 60 0.8854
3.0 90 0.8853
3.3333 100 0.8845
4.0 120 0.8793
5.0 150 0.8798
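
The Epoch and Step columns are consistent with the hyperparameters listed earlier: 600 samples at a batch size of 20 give 30 optimizer steps per epoch, so 5 epochs end at step 150, and step 50 falls at epoch 1.6667. The sketch below reproduces that arithmetic. It assumes a single training device, and the helper name is hypothetical. gradient_accumulation_steps: 1 and dataloader_drop_last: False are taken from the hyperparameter list.

```python
import math

def steps_per_epoch(num_samples, batch_size, drop_last=False):
    # Number of optimizer steps in one pass over the data
    # (single device, no gradient accumulation).
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

per_epoch = steps_per_epoch(num_samples=600, batch_size=20)
total_steps = per_epoch * 5  # num_train_epochs: 5

print(per_epoch, total_steps)
# Fractional epoch at the intermediate logging steps.
for step in (50, 100):
    print(step, round(step / per_epoch, 4))
```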

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.2.0
  • Transformers: 4.45.2
  • PyTorch: 2.5.0+cu124
  • Accelerate: 0.34.2
  • Datasets: 3.0.1
  • Tokenizers: 0.20.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}