SentenceTransformer based on sentence-transformers/distiluse-base-multilingual-cased-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/distiluse-base-multilingual-cased-v2 on the bps-publication-title-pairs dataset. It maps sentences and paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
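
To make the pipeline above concrete, here is a minimal pure-Python sketch of what the Pooling and Dense stages compute: a mean over the non-padding token vectors, followed by a linear projection with a Tanh activation. The toy sizes (4 inputs, 2 outputs), the weights, and the helper names `mean_pool` and `dense_tanh` are assumptions for illustration only; the real model projects 768 features to 512 with learned weights.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def dense_tanh(vector, weight, bias):
    """Compute tanh(W x + b); weight holds one row per output feature."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weight, bias)
    ]

# Hypothetical toy data: three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)

# Hypothetical 2x4 weight matrix standing in for the learned 512x768 one.
weight = [[0.1, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.5]]
bias = [0.0, -1.0]
sentence_embedding = dense_tanh(pooled, weight, bias)

print(pooled)
print(len(sentence_embedding))
```

Because of the Tanh activation, every component of the final embedding lies between -1 and 1.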

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/f-sts")
# Run inference
sentences = [
    'Laporan keuangan pemerintah provinsi periode 2003-2006',
    'Statistik Keuangan Provinsi 2003-2006',
    'Statistik Perdagangan Luar Negeri Indonesia Ekspor Menurut Kode ISIC 2013-2014',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 512)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
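
For readers who want to see what the similarity matrix contains, the following is a small sketch that computes pairwise cosine similarity by hand and picks the closest match for one query. It assumes cosine is the similarity function in use (the metrics reported in this card are cosine-based). The 2-dimensional vectors and the function name `cosine_similarity_matrix` are hypothetical stand-ins for real 512-dimensional embeddings.

```python
import math

def cosine_similarity_matrix(vectors):
    """Return the matrix of cosine similarities between all pairs of vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv) for v, nv in zip(vectors, norms)]
        for u, nu in zip(vectors, norms)
    ]

# Hypothetical toy embeddings, not produced by the model.
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
sims = cosine_similarity_matrix(toy_embeddings)

# Semantic search for item 0: take the most similar item other than itself.
query_index = 0
best_match = max(
    (j for j in range(len(sims)) if j != query_index),
    key=lambda j: sims[query_index][j],
)
print(best_match)
```

The same ranking step applies to the real matrix: for each query row, sort the other columns by score and keep the top results.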

Evaluation

Metrics

Semantic Similarity

Metric | allstat-semantic-dev | allstat-semantic-test
pearson_cosine | 0.9659 | 0.9645
spearman_cosine | 0.8745 | 0.8646
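
The two metrics measure different things: Pearson looks at the linear relationship between predicted cosine scores and gold scores, while Spearman compares only their rank order. The sketch below computes both from scratch. The gold scores are the six sample scores shown in this card; the predicted values are invented for illustration and are not model outputs.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

gold = [0.85, 0.15, 0.91, 0.1, 0.2, 0.9]          # sample scores from this card
predicted = [0.90, 0.15, 0.85, 0.20, 0.10, 0.92]  # hypothetical cosine scores

print(round(pearson(gold, predicted), 4))
print(round(spearman(gold, predicted), 4))
```

In this toy case the predictions separate high-scoring from low-scoring pairs well, so Pearson is high, but the order within each group is shuffled, which lowers Spearman. That is one possible way a gap like the one in the table can arise; it is not a diagnosis of this model.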

Training Details

Training Dataset

bps-publication-title-pairs

  • Dataset: bps-publication-title-pairs at 4987e97
  • Size: 42,138 training samples
  • Columns: query, doc_title, and score
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 5 tokens, mean 12.33 tokens, max 71 tokens
    • doc_title (string): min 5 tokens, mean 15.04 tokens, max 77 tokens
    • score (float): min 0.0, mean 0.53, max 1.0
  • Samples:
    query | doc_title | score
    Hasil riset mobilitas Jabodetabek tahun 2023 | Statistik Komuter Jabodetabek Hasil Survei Komuter Jabodetabek 2023 | 0.85
    Indeks harga konsumen di Indonesia tahun 2017 (82 kota) | Harga Konsumen Beberapa Barang dan Jasa Kelompok Sandang di 82 Kota di Indonesia 2017 | 0.15
    Laporan sektor bangunan Indonesia Q4 2009 | Indikator Konstruksi Triwulan IV Tahun 2009 | 0.91
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
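
As a rough illustration of this objective: CosineSimilarityLoss with MSELoss penalizes the squared difference between the cosine similarity of the two embeddings and the gold score. The sketch below reproduces that arithmetic in pure Python. The 2-dimensional vectors, the scores, and the function names are hypothetical; real training operates on batches of 512-dimensional embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cosine_similarity_loss(pairs, scores):
    """Mean squared error between cos(u, v) and the gold score of each pair."""
    errors = [(cosine(u, v) - s) ** 2 for (u, v), s in zip(pairs, scores)]
    return sum(errors) / len(errors)

# Hypothetical (query embedding, doc_title embedding) pairs with gold scores.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction, cosine = 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, cosine = 0.0
]
scores = [0.9, 0.1]

loss = cosine_similarity_loss(pairs, scores)
print(loss)  # ((1.0 - 0.9)**2 + (0.0 - 0.1)**2) / 2, approximately 0.01
```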
    

Evaluation Dataset

bps-publication-title-pairs

  • Dataset: bps-publication-title-pairs at 4987e97
  • Size: 2,634 evaluation samples
  • Columns: query, doc_title, and score
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 6 tokens, mean 12.31 tokens, max 30 tokens
    • doc_title (string): min 5 tokens, mean 15.19 tokens, max 66 tokens
    • score (float): min 0.0, mean 0.55, max 1.0
  • Samples:
    query | doc_title | score
    Statistik tebu Indonesia tahun 2018 | Direktori Perusahaan Perkebunan Karet Indonesia 2018 | 0.1
    Data industri makanan dan minuman 2017 | Statistik Upah Buruh Tani di Perdesaan 2018 | 0.2
    Biaya hidup di Gorontalo tahun 2018 | Survei Biaya Hidup (SBH) 2018 Gorontalo | 0.9
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • fp16: True
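
These settings determine the length of the run and the shape of the learning-rate curve. The sketch below works through the arithmetic, assuming a single device (gradient_accumulation_steps is 1 and dataloader_drop_last is False in the full list), the listed learning_rate of 5e-05, and the linear scheduler. The rounding of the warmup step count and the helper name `linear_lr` are assumptions for illustration.

```python
import math

train_samples = 42_138   # training set size from this card
batch_size = 16          # per_device_train_batch_size
epochs = 5               # num_train_epochs
base_lr = 5e-05          # learning_rate
warmup_ratio = 0.1       # warmup_ratio

# The last, partial batch still counts as a step because drop_last is False.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = round(total_steps * warmup_ratio)  # rounding is an assumption

def linear_lr(step):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(steps_per_epoch, total_steps, warmup_steps)
print(linear_lr(0), linear_lr(warmup_steps), linear_lr(total_steps))
```

Under these assumptions the run has 2,634 steps per epoch and 13,170 steps in total, which agrees with the final step recorded in the training logs, and the learning rate peaks at step 1,317.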

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss allstat-semantic-dev_spearman_cosine allstat-semantic-test_spearman_cosine
0.0380 100 0.0435 0.0320 0.7989 -
0.0759 200 0.0287 0.0246 0.8127 -
0.1139 300 0.0261 0.0222 0.8132 -
0.1519 400 0.0229 0.0216 0.8096 -
0.1898 500 0.0228 0.0213 0.8090 -
0.2278 600 0.0242 0.0210 0.8096 -
0.2658 700 0.0214 0.0199 0.8143 -
0.3037 800 0.0204 0.0197 0.8136 -
0.3417 900 0.0218 0.0202 0.8097 -
0.3797 1000 0.0228 0.0206 0.8077 -
0.4176 1100 0.0226 0.0192 0.8109 -
0.4556 1200 0.021 0.0202 0.8059 -
0.4935 1300 0.0221 0.0204 0.8053 -
0.5315 1400 0.0218 0.0203 0.8070 -
0.5695 1500 0.0229 0.0213 0.8071 -
0.6074 1600 0.0248 0.0202 0.8125 -
0.6454 1700 0.0207 0.0189 0.8116 -
0.6834 1800 0.0206 0.0195 0.8106 -
0.7213 1900 0.0202 0.0200 0.8117 -
0.7593 2000 0.0198 0.0193 0.8163 -
0.7973 2100 0.0187 0.0176 0.8204 -
0.8352 2200 0.0188 0.0177 0.8192 -
0.8732 2300 0.0192 0.0191 0.8167 -
0.9112 2400 0.0173 0.0176 0.8188 -
0.9491 2500 0.0186 0.0183 0.8212 -
0.9871 2600 0.0174 0.0182 0.8243 -
1.0251 2700 0.0148 0.0158 0.8255 -
1.0630 2800 0.0149 0.0162 0.8216 -
1.1010 2900 0.0137 0.0161 0.8273 -
1.1390 3000 0.0148 0.0166 0.8233 -
1.1769 3100 0.0138 0.0155 0.8251 -
1.2149 3200 0.0122 0.0154 0.8320 -
1.2528 3300 0.0149 0.0158 0.8293 -
1.2908 3400 0.0134 0.0150 0.8314 -
1.3288 3500 0.0141 0.0148 0.8292 -
1.3667 3600 0.0138 0.0140 0.8337 -
1.4047 3700 0.0128 0.0158 0.8256 -
1.4427 3800 0.0135 0.0154 0.8284 -
1.4806 3900 0.0142 0.0151 0.8376 -
1.5186 4000 0.0148 0.0145 0.8308 -
1.5566 4100 0.013 0.0146 0.8373 -
1.5945 4200 0.0137 0.0144 0.8296 -
1.6325 4300 0.0126 0.0146 0.8273 -
1.6705 4400 0.0138 0.0138 0.8358 -
1.7084 4500 0.0141 0.0144 0.8371 -
1.7464 4600 0.0127 0.0142 0.8339 -
1.7844 4700 0.0124 0.0144 0.8356 -
1.8223 4800 0.0126 0.0142 0.8311 -
1.8603 4900 0.0145 0.0137 0.8371 -
1.8983 5000 0.0125 0.0139 0.8336 -
1.9362 5100 0.0137 0.0140 0.8394 -
1.9742 5200 0.0127 0.0135 0.8374 -
2.0121 5300 0.0111 0.0135 0.8384 -
2.0501 5400 0.0086 0.0127 0.8404 -
2.0881 5500 0.0089 0.0120 0.8453 -
2.1260 5600 0.0091 0.0119 0.8463 -
2.1640 5700 0.0094 0.0125 0.8432 -
2.2020 5800 0.009 0.0126 0.8440 -
2.2399 5900 0.0093 0.0120 0.8469 -
2.2779 6000 0.0091 0.0124 0.8484 -
2.3159 6100 0.0101 0.0119 0.8472 -
2.3538 6200 0.0091 0.0125 0.8419 -
2.3918 6300 0.0105 0.0125 0.8409 -
2.4298 6400 0.0096 0.0125 0.8446 -
2.4677 6500 0.0099 0.0120 0.8431 -
2.5057 6600 0.0098 0.0124 0.8428 -
2.5437 6700 0.0085 0.0120 0.8444 -
2.5816 6800 0.0096 0.0120 0.8487 -
2.6196 6900 0.0094 0.0127 0.8479 -
2.6576 7000 0.0082 0.0116 0.8504 -
2.6955 7100 0.0098 0.0115 0.8509 -
2.7335 7200 0.0088 0.0114 0.8551 -
2.7715 7300 0.0081 0.0112 0.8525 -
2.8094 7400 0.0099 0.0114 0.8497 -
2.8474 7500 0.0085 0.0116 0.8527 -
2.8853 7600 0.0098 0.0115 0.8502 -
2.9233 7700 0.0093 0.0118 0.8482 -
2.9613 7800 0.0093 0.0117 0.8512 -
2.9992 7900 0.0087 0.0117 0.8517 -
3.0372 8000 0.0064 0.0106 0.8559 -
3.0752 8100 0.0059 0.0107 0.8578 -
3.1131 8200 0.0062 0.0106 0.8556 -
3.1511 8300 0.0071 0.0107 0.8526 -
3.1891 8400 0.0059 0.0106 0.8563 -
3.2270 8500 0.0065 0.0105 0.8595 -
3.2650 8600 0.0068 0.0105 0.8595 -
3.3030 8700 0.0068 0.0105 0.8588 -
3.3409 8800 0.0061 0.0103 0.8592 -
3.3789 8900 0.0067 0.0103 0.8599 -
3.4169 9000 0.0061 0.0102 0.8597 -
3.4548 9100 0.0058 0.0106 0.8604 -
3.4928 9200 0.0068 0.0103 0.8599 -
3.5308 9300 0.0058 0.0099 0.8636 -
3.5687 9400 0.0061 0.0100 0.8625 -
3.6067 9500 0.0064 0.0105 0.8590 -
3.6446 9600 0.006 0.0101 0.8590 -
3.6826 9700 0.0064 0.0106 0.8590 -
3.7206 9800 0.0059 0.0105 0.8600 -
3.7585 9900 0.0066 0.0102 0.8635 -
3.7965 10000 0.0065 0.0101 0.8617 -
3.8345 10100 0.006 0.0104 0.8628 -
3.8724 10200 0.0063 0.0103 0.8629 -
3.9104 10300 0.0064 0.0098 0.8659 -
3.9484 10400 0.0063 0.0098 0.8669 -
3.9863 10500 0.0062 0.0100 0.8647 -
4.0243 10600 0.0052 0.0095 0.8666 -
4.0623 10700 0.0045 0.0095 0.8665 -
4.1002 10800 0.004 0.0097 0.8662 -
4.1382 10900 0.0042 0.0095 0.8680 -
4.1762 11000 0.0045 0.0096 0.8676 -
4.2141 11100 0.0044 0.0096 0.8673 -
4.2521 11200 0.0043 0.0097 0.8684 -
4.2901 11300 0.0046 0.0094 0.8705 -
4.3280 11400 0.0039 0.0095 0.8699 -
4.3660 11500 0.0046 0.0094 0.8707 -
4.4039 11600 0.0041 0.0094 0.8708 -
4.4419 11700 0.0039 0.0093 0.8709 -
4.4799 11800 0.0046 0.0092 0.8720 -
4.5178 11900 0.0043 0.0093 0.8715 -
4.5558 12000 0.004 0.0093 0.8726 -
4.5938 12100 0.0043 0.0092 0.8729 -
4.6317 12200 0.0042 0.0092 0.8734 -
4.6697 12300 0.004 0.0091 0.8735 -
4.7077 12400 0.004 0.0092 0.8733 -
4.7456 12500 0.0039 0.0090 0.8743 -
4.7836 12600 0.0044 0.0090 0.8749 -
4.8216 12700 0.0041 0.0089 0.8752 -
4.8595 12800 0.004 0.0090 0.8746 -
4.8975 12900 0.004 0.0090 0.8744 -
4.9355 13000 0.0041 0.0090 0.8746 -
4.9734 13100 0.0039 0.0090 0.8745 -
5.0 13170 - - - 0.8646

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}