--- base_model: BAAI/bge-m3 datasets: [] language: - ca library_name: sentence-transformers license: apache-2.0 metrics: - cosine_accuracy@1 - cosine_accuracy@3 - cosine_accuracy@5 - cosine_accuracy@10 - cosine_precision@1 - cosine_precision@3 - cosine_precision@5 - cosine_precision@10 - cosine_recall@1 - cosine_recall@3 - cosine_recall@5 - cosine_recall@10 - cosine_ndcg@10 - cosine_mrr@10 - cosine_map@100 pipeline_tag: sentence-similarity tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:3755 - loss:MatryoshkaLoss - loss:MultipleNegativesRankingLoss widget: - source_sentence: En el cas que la persona beneficiària mantingui les condicions d’elegibilitat es podrà concedir la pròrroga de la prestació sempre que la persona interessada ho sol·liciti i ho permetin les dotacions pressupostàries de cada exercici. sentences: - Quin és el benefici de l'ajut a la consolidació d'empreses? - Quin és el requisit per a la persona beneficiària? - Quin és el benefici del Registre municipal d'entitats per a l'Ajuntament? - source_sentence: Aquest tràmit permet la presentació de les sol·licituds per a l’atorgament de llicències d’aprofitament especial sense transformació del domini públic marítim terrestre consistent en la instal·lació i explotació d'escola per oferir activitats nàutiques, amb zona d’avarada, durant la temporada. sentences: - Quin és el propòsit de la llicència d'aprofitament especial sense transformació del domini públic marítim terrestre? - Quin és el termini per a presentar les sol·licituds de subvencions per a projectes i activitats a entitats de l'àmbit de drets civils? - Quin és el lloc on es realitzen les activitats amb aquest permís? - source_sentence: en cas de compliment dels requisits establerts (persones residents, titulars de plaça d'aparcament, autotaxis, establiments hotelers) sentences: - Quin és el paper de l'administració en la justificació del projecte/activitat subvencionada? 
- Quin és el benefici de ser un autotaxi? - Quin és el benefici per als establiments de la instal·lació de terrasses o vetlladors? - source_sentence: La convocatòria és el document que estableix les condicions i els requisits per a poder sol·licitar les subvencions pel suport educatiu a les escoles públiques de Sitges. sentences: - Quin és el paper de la convocatòria en les subvencions pel suport educatiu a les escoles públiques de Sitges? - Quin és el benefici de la consulta prèvia de classificació d'activitat per a l'Ajuntament de Sitges? - Quin és el tipus d'ocupació de la via pública que es pot realitzar amb aquest permís? - source_sentence: Cal revisar la informació i els terminis de la convocatòria específica de cada procés que trobareu a la Seu electrònica de l'Ajuntament de Sitges. sentences: - Quin és el document que es necessita per acreditar l'any de construcció i l'adequació a la legalitat urbanística d'un immoble? - Quin és el paper de l'Ajuntament en la gestió de les activitats per temporades? - On es pot trobar la informació sobre els terminis de presentació d'al·legacions en un procés de selecció de personal de l'Ajuntament de Sitges? 
model-index: - name: BGE SITGES CAT results: - task: type: information-retrieval name: Information Retrieval dataset: name: dim 1024 type: dim_1024 metrics: - type: cosine_accuracy@1 value: 0.13875598086124402 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.22248803827751196 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.30861244019138756 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.5 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.13875598086124402 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.07416267942583732 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.06172248803827752 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.049999999999999996 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.13875598086124402 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.22248803827751196 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.30861244019138756 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.5 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.28246378665685234 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.21777644869750143 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.24297774164515282 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 768 type: dim_768 metrics: - type: cosine_accuracy@1 value: 0.13157894736842105 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.22248803827751196 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.3157894736842105 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.4904306220095694 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.13157894736842105 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.07416267942583732 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.06315789473684211 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.04904306220095694 name: Cosine Precision@10 - 
type: cosine_recall@1 value: 0.13157894736842105 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.22248803827751196 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.3157894736842105 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.4904306220095694 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.27585932698577753 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.21171489329384077 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.23780085464747025 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 512 type: dim_512 metrics: - type: cosine_accuracy@1 value: 0.13875598086124402 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.21770334928229665 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.3062200956937799 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.48564593301435405 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.13875598086124402 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.07256778309409888 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.06124401913875598 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.0485645933014354 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.13875598086124402 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.21770334928229665 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.3062200956937799 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.48564593301435405 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.276564299219231 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.21426198070934924 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.24076362333582052 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 256 type: dim_256 metrics: - type: cosine_accuracy@1 value: 0.12440191387559808 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.21770334928229665 name: Cosine 
Accuracy@3 - type: cosine_accuracy@5 value: 0.3133971291866029 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.4688995215311005 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.12440191387559808 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.07256778309409888 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.06267942583732058 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.04688995215311005 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.12440191387559808 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.21770334928229665 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.3133971291866029 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.4688995215311005 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.2671493494247117 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.20640996430470124 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.23431223249888664 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 128 type: dim_128 metrics: - type: cosine_accuracy@1 value: 0.12200956937799043 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.21291866028708134 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.3014354066985646 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.49282296650717705 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.12200956937799043 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.07097288676236044 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.06028708133971292 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.049282296650717705 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.12200956937799043 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.21291866028708134 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.3014354066985646 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.49282296650717705 name: 
Cosine Recall@10 - type: cosine_ndcg@10 value: 0.27152939051256636 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.20549764562922473 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.23082152106975815 name: Cosine Map@100 - task: type: information-retrieval name: Information Retrieval dataset: name: dim 64 type: dim_64 metrics: - type: cosine_accuracy@1 value: 0.11961722488038277 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.19856459330143542 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.2822966507177033 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.4688995215311005 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.11961722488038277 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.06618819776714513 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.056459330143540674 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.046889952153110044 name: Cosine Precision@10 - type: cosine_recall@1 value: 0.11961722488038277 name: Cosine Recall@1 - type: cosine_recall@3 value: 0.19856459330143542 name: Cosine Recall@3 - type: cosine_recall@5 value: 0.2822966507177033 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.4688995215311005 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.2582882544405147 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.19569188121819714 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.22122525098210105 name: Cosine Map@100 --- # BGE SITGES CAT This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. 
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** ca
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/SITGES-BAAI2")
# Run inference
sentences = [
    "Cal revisar la informació i els terminis de la convocatòria específica de cada procés que trobareu a la Seu electrònica de l'Ajuntament de Sitges.",
    "On es pot trobar la informació sobre els terminis de presentació d'al·legacions en un procés de selecció de personal de l'Ajuntament de Sitges?",
    "Quin és el document que es necessita per acreditar l'any de construcció i l'adequació a la legalitat urbanística d'un immoble?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_1024`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.1388    |
| cosine_accuracy@3   | 0.2225    |
| cosine_accuracy@5   | 0.3086    |
| cosine_accuracy@10  | 0.5       |
| cosine_precision@1  | 0.1388    |
| cosine_precision@3  | 0.0742    |
| cosine_precision@5  | 0.0617    |
| cosine_precision@10 | 0.05      |
| cosine_recall@1     | 0.1388    |
| cosine_recall@3     | 0.2225    |
| cosine_recall@5     | 0.3086    |
| cosine_recall@10    | 0.5       |
| cosine_ndcg@10      | 0.2825    |
| cosine_mrr@10       | 0.2178    |
| **cosine_map@100**  | **0.243** |

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.1316     |
| cosine_accuracy@3   | 0.2225     |
| cosine_accuracy@5   | 0.3158     |
| cosine_accuracy@10  | 0.4904     |
| cosine_precision@1  | 0.1316     |
| cosine_precision@3  | 0.0742     |
| cosine_precision@5  | 0.0632     |
| cosine_precision@10 | 0.049      |
| cosine_recall@1     | 0.1316     |
| cosine_recall@3     | 0.2225     |
| cosine_recall@5     | 0.3158     |
| cosine_recall@10    | 0.4904     |
| cosine_ndcg@10      | 0.2759     |
| cosine_mrr@10       | 0.2117     |
| **cosine_map@100**  | **0.2378** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.1388     |
| cosine_accuracy@3   | 0.2177     |
| cosine_accuracy@5   | 0.3062     |
| cosine_accuracy@10  | 0.4856     |
| cosine_precision@1  | 0.1388     |
| cosine_precision@3  | 0.0726     |
| cosine_precision@5  | 0.0612     |
| cosine_precision@10 | 0.0486     |
| cosine_recall@1     | 0.1388     |
| cosine_recall@3     | 0.2177     |
| cosine_recall@5     | 0.3062     |
| cosine_recall@10    | 0.4856     |
| cosine_ndcg@10      | 0.2766     |
| cosine_mrr@10       | 0.2143     |
| **cosine_map@100**  | **0.2408** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.1244     |
| cosine_accuracy@3   | 0.2177     |
| cosine_accuracy@5   | 0.3134     |
| cosine_accuracy@10  | 0.4689     |
| cosine_precision@1  | 0.1244     |
| cosine_precision@3  | 0.0726     |
| cosine_precision@5  | 0.0627     |
| cosine_precision@10 | 0.0469     |
| cosine_recall@1     | 0.1244     |
| cosine_recall@3     | 0.2177     |
| cosine_recall@5     | 0.3134     |
| cosine_recall@10    | 0.4689     |
| cosine_ndcg@10      | 0.2671     |
| cosine_mrr@10       | 0.2064     |
| **cosine_map@100**  | **0.2343** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.122      |
| cosine_accuracy@3   | 0.2129     |
| cosine_accuracy@5   | 0.3014     |
| cosine_accuracy@10  | 0.4928     |
| cosine_precision@1  | 0.122      |
| cosine_precision@3  | 0.071      |
| cosine_precision@5  | 0.0603     |
| cosine_precision@10 | 0.0493     |
| cosine_recall@1     | 0.122      |
| cosine_recall@3     | 0.2129     |
| cosine_recall@5     | 0.3014     |
| cosine_recall@10    | 0.4928     |
| cosine_ndcg@10      | 0.2715     |
| cosine_mrr@10       | 0.2055     |
| **cosine_map@100**  | **0.2308** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.1196     |
| cosine_accuracy@3   | 0.1986     |
| cosine_accuracy@5   | 0.2823     |
| cosine_accuracy@10  | 0.4689     |
| cosine_precision@1  | 0.1196     |
| cosine_precision@3  | 0.0662     |
| cosine_precision@5  | 0.0565     |
| cosine_precision@10 | 0.0469     |
| cosine_recall@1     | 0.1196     |
| cosine_recall@3     | 0.1986     |
| cosine_recall@5     | 0.2823     |
| cosine_recall@10    | 0.4689     |
| cosine_ndcg@10      | 0.2583     |
| cosine_mrr@10       | 0.1957     |
| **cosine_map@100**  | **0.2212** |

## Training Details

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 6
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 6
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
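
The batch-size settings listed above interact: with gradient accumulation, one optimizer step consumes several mini-batches. The following is a rough sketch of that arithmetic, using only the Python standard library. The dataset size comes from the `dataset_size:3755` tag in this card; the single-device setup is an assumption, since the card does not state the hardware. Under that assumption the numbers line up with the Epoch and Step columns of the Training Logs table.

```python
import math

# Values taken from this card: the `dataset_size:3755` tag and the
# hyperparameters listed above.
DATASET_SIZE = 3755
PER_DEVICE_TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 16
NUM_TRAIN_EPOCHS = 6

# Assumption (not stated in the card): training ran on a single device.
NUM_DEVICES = 1

# Number of training samples that contribute to one optimizer update.
samples_per_update = (
    PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES
)

# Mini-batches per pass over the data (the last batch may be partial).
batches_per_epoch = math.ceil(
    DATASET_SIZE / (PER_DEVICE_TRAIN_BATCH_SIZE * NUM_DEVICES)
)

# Fractional number of optimizer updates that fit in one epoch.
updates_per_epoch = batches_per_epoch / GRADIENT_ACCUMULATION_STEPS

# Whole updates per epoch, summed over all epochs.
total_updates = NUM_TRAIN_EPOCHS * (batches_per_epoch // GRADIENT_ACCUMULATION_STEPS)


def epoch_at_step(step: int) -> float:
    """Return the fractional epoch reached after `step` optimizer updates."""
    return round(step / updates_per_epoch, 4)


print("samples per optimizer update:", samples_per_update)  # 256
print("mini-batches per epoch:", batches_per_epoch)         # 235
print("total optimizer updates:", total_updates)            # 84
for step in (5, 14, 84):
    print(f"step {step} -> epoch {epoch_at_step(step)}")
```

The effective batch of 256 matters for `MultipleNegativesRankingLoss` only indirectly: in-batch negatives are drawn per mini-batch, so each anchor is most likely still contrasted against the other 15 items in its mini-batch rather than against all 256.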
### Training Logs

| Epoch      | Step   | Training Loss | Validation Loss | dim_1024_cosine_map@100 | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:----------:|:------:|:-------------:|:---------------:|:-----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 0.3404     | 5      | 3.3256        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 0.6809     | 10     | 2.2115        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 0.9532     | 14     | -             | 1.2963          | 0.2260                  | 0.2148                 | 0.2144                 | 0.2258                 | 0.2069                | 0.2252                 |
| 1.0213     | 15     | 1.7921        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 1.3617     | 20     | 1.2295        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 1.7021     | 25     | 0.9048        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 1.9745     | 29     | -             | 0.8667          | 0.2311                  | 0.2267                 | 0.2292                 | 0.2279                 | 0.2121                | 0.2278                 |
| 2.0426     | 30     | 0.7256        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 2.3830     | 35     | 0.5252        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 2.7234     | 40     | 0.4648        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 2.9957     | 44     | -             | 0.6920          | 0.2311                  | 0.2243                 | 0.2332                 | 0.2319                 | 0.2211                | 0.2354                 |
| 3.0638     | 45     | 0.3518        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 3.4043     | 50     | 0.321         | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 3.7447     | 55     | 0.2923        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 3.9489     | 58     | -             | 0.6514          | 0.2343                  | 0.2210                 | 0.2293                 | 0.2338                 | 0.2242                | 0.2331                 |
| 4.0851     | 60     | 0.2522        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 4.4255     | 65     | 0.2445        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 4.7660     | 70     | 0.2358        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 4.9702     | 73     | -             | 0.6481          | 0.2348                  | 0.2239                 | 0.2252                 | 0.2332                 | 0.2167                | 0.2298                 |
| 5.1064     | 75     | 0.2301        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| 5.4468     | 80     | 0.2262        | -               | -                       | -                      | -                      | -                      | -                     | -                      |
| **5.7191** | **84** | **-**         | **0.646**       | **0.243**               | **0.2308**             | **0.2343**             | **0.2408**             | **0.2212**            | **0.2378**             |

* The bold row denotes the saved checkpoint.
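
The per-dimension `cosine_map@100` columns in the table correspond to the evaluation dimensions 1024, 768, 512, 256, 128 and 64. With embeddings trained under `MatryoshkaLoss`, a smaller vector is usually obtained by keeping the leading components and re-normalizing before computing cosine similarity. The sketch below illustrates that step in plain Python. The helper names and the toy 8-dimensional vectors are hypothetical and chosen only for illustration; they are not output of this model.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding length")
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only (a real embedding from this
# model would have 1024 components).
query = [0.9, 0.1, -0.3, 0.2, 0.05, -0.1, 0.4, 0.0]
passage = [0.8, 0.2, -0.2, 0.1, -0.3, 0.2, 0.1, 0.5]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(f"dim={dim}: cosine={cosine_similarity(q, p):.4f}")
```

In practice, Sentence Transformers also accepts a `truncate_dim` argument when constructing `SentenceTransformer`, which is likely the more convenient way to get shortened embeddings from this model.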
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.3
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```