SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 568M parameters (F32, Safetensors)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
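
This stack encodes text with the XLM-RoBERTa backbone, takes the [CLS] token embedding as the sentence vector (CLS pooling), and L2-normalizes it, which is why cosine similarity and dot product give the same ranking. The snippet below is a minimal sketch of how an equivalent module stack could be assembled by hand; loading the checkpoint from the Hub builds it automatically, so this is illustrative only.

from sentence_transformers import SentenceTransformer, models

# Illustrative reconstruction of the module stack shown above (not needed for normal use).
transformer = models.Transformer("BAAI/bge-m3", max_seq_length=1024)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 1024
    pooling_mode="cls",                          # sentence vector = [CLS] token embedding
)
normalize = models.Normalize()                   # L2-normalize the 1024-dim output

model = SentenceTransformer(modules=[transformer, pooling, normalize])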

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("seongil-dn/bge-m3-kor-retrieval-bs128-checkpoint-471")
# Run inference
sentences = [
    '전남지역의 석유와 화학제품은 왜 수출이 늘어나는 경향을 보였어',
    '(2) 전남지역\n2013년중 전남지역 수출은 전년대비 1.2% 감소로 전환하였다. 품목별로는 석유(+9.3% → +3.8%) 및 화학제품(+1.2% → +7.1%)이 중국 등 해외수요확대로 증가세를 지속하였으나 철강금속(+1.8% → -8.6%)은 글로벌 공급과잉 및 중국의 저가 철강수출 확대로, 선박(+7.6% → -49.2%)은 수주물량이 급격히 줄어들면서 감소로 전환하였다. 전남지역 수입은 원유, 화학제품, 철강금속 등의 수입이 줄면서 전년대비 7.4% 감소로 전환하였다.',
    '수출 증가세 지속\n1/4분기 중 수출은 전년동기대비 증가흐름을 지속하였다. 품목별로 보면 석유제품, 석유화학, 철강, 선박, 반도체, 자동차 등 대다수 품목에서 증가하였다. 석유제품은 글로벌 경기회복에 따른 에너지 수요 증가와 국제유가 급등으로 수출단가가 높은 상승세를 지속하면서 증가하였다. 석유화학도 중국, 아세안을 중심으로 합성수지, 고무 등의 수출이 큰 폭 증가한 데다 고유가로 인한 수출가격도 동반 상승하면서 증가세를 이어갔다. 철강은 건설, 조선 등 글로벌 전방산업의 수요 증대, 원자재가격 상승 및 중국 감산 등에 따른 수출단가 상승 등에 힘입어 증가세를 이어갔다. 선박은 1/4분기 중 인도물량이 확대됨에 따라 증가하였다. 반도체는 자동차 등 전방산업의 견조한 수요가 이어지는 가운데 전년동기대비로 높은 단가가 지속되면서 증가하였다. 자동차는 차량용 반도체 수급차질이 지속되었음에도 불구하고 글로벌 경기회복 흐름에 따라 수요가 늘어나면서 전년동기대비 소폭 증가하였다. 모니터링 결과 향후 수출은 증가세가 지속될 것으로 전망되었다. 석유화학 및 석유정제는 수출단가 상승과 전방산업의 수요확대 기조가 이어지면서 증가할 전망이다. 철강은 주요국 경기회복과 중국, 인도 등의 인프라 투자 확대 등으로 양호한 흐름을 이어갈 전망이다. 반도체는 글로벌 스마트폰 수요 회복, 디지털 전환 기조 등으로 견조한 증가세를 지속할 것으로 보인다. 자동차는 차량용 반도체 공급차질이 점차 완화되고 미국, 신흥시장을 중심으로 수요회복이 본격화됨에 따라 소폭 증가할 전망이다. 선박은 친환경 선박수요 지속, 글로별 교역 신장 등에도 불구하고 2021년 2/4분기 집중되었던 인도물량의 기저효과로 인해 감소할 것으로 보인다.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
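
Because the embeddings are unit-normalized and the similarity function is cosine similarity, the same model can be used for retrieval by ranking candidate passages against a query. Below is a minimal sketch that reuses the model and sentences from the example above, treating the first entry as the query and the remaining two as candidate passages.

# Rank the two passages against the Korean query by cosine similarity.
query_embedding = model.encode(sentences[0])       # the query
passage_embeddings = model.encode(sentences[1:])   # the candidate passages
scores = model.similarity(query_embedding, passage_embeddings)
print(scores)
# shape [1, 2]; the passage with the higher score is the better match for the query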

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • gradient_accumulation_steps: 4
  • learning_rate: 3e-05
  • num_train_epochs: 5
  • warmup_ratio: 0.05
  • fp16: True
  • batch_sampler: no_duplicates
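
Under the Sentence Transformers 3.x training API, these non-default values would typically be passed through SentenceTransformerTrainingArguments. The sketch below shows one plausible way to express them; the output_dir is a hypothetical placeholder and everything not listed above is left at its default.

from sentence_transformers.training_args import (
    SentenceTransformerTrainingArguments,
    BatchSamplers,
)

# Hypothetical output_dir; the remaining values mirror the non-default
# hyperparameters listed above. A per-device batch of 16 with 4 gradient
# accumulation steps gives an effective batch of 64 per device.
args = SentenceTransformerTrainingArguments(
    output_dir="bge-m3-kor-retrieval",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    learning_rate=3e-5,
    num_train_epochs=5,
    warmup_ratio=0.05,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)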

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0011 1 0.8341
0.0021 2 1.0166
0.0032 3 0.9211
0.0042 4 0.9691
0.0053 5 1.125
0.0064 6 1.0482
0.0074 7 1.0723
0.0085 8 0.8762
0.0095 9 0.8418
0.0106 10 0.8903
0.0117 11 0.836
0.0127 12 0.793
0.0138 13 0.7567
0.0149 14 0.7655
0.0159 15 0.6211
0.0170 16 0.5386
0.0180 17 0.8626
0.0191 18 0.7625
0.0202 19 0.8242
0.0212 20 0.7407
0.0223 21 0.7566
0.0233 22 0.732
0.0244 23 0.9182
0.0255 24 0.9062
0.0265 25 0.8974
0.0276 26 0.8435
0.0286 27 0.9853
0.0297 28 0.873
0.0308 29 0.8315
0.0318 30 0.8199
0.0329 31 0.8008
0.0339 32 0.8274
0.0350 33 0.8253
0.0361 34 0.7876
0.0371 35 0.7426
0.0382 36 0.673
0.0392 37 0.6449
0.0403 38 0.6409
0.0414 39 0.6433
0.0424 40 0.5225
0.0435 41 0.6376
0.0446 42 0.481
0.0456 43 0.4797
0.0467 44 0.3574
0.0477 45 0.4295
0.0488 46 0.4309
0.0499 47 0.2894
0.0509 48 0.2807
0.0520 49 0.2999
0.0530 50 0.4545
0.0541 51 0.4439
0.0552 52 0.3863
0.0562 53 0.4087
0.0573 54 0.3755
0.0583 55 0.1911
0.0594 56 0.1981
0.0605 57 0.1573
0.0615 58 0.1871
0.0626 59 0.1685
0.0636 60 0.1449
0.0647 61 0.1454
0.0658 62 0.4044
0.0668 63 0.5494
0.0679 64 0.4302
0.0689 65 0.4958
0.0700 66 0.4588
0.0711 67 0.5131
0.0721 68 0.5228
0.0732 69 0.6142
0.0743 70 0.5715
0.0753 71 0.6371
0.0764 72 0.5576
0.0774 73 0.6738
0.0785 74 0.5976
0.0796 75 0.5893
0.0806 76 0.536
0.0817 77 0.3585
0.0827 78 0.3952
0.0838 79 0.4002
0.0849 80 0.4096
0.0859 81 0.5608
0.0870 82 0.576
0.0880 83 0.5504
0.0891 84 0.4473
0.0902 85 0.5513
0.0912 86 0.4456
0.0923 87 0.3616
0.0933 88 0.3095
0.0944 89 0.36
0.0955 90 0.3519
0.0965 91 0.371
0.0976 92 0.3983
0.0986 93 0.4247
0.0997 94 0.4506
0.1008 95 0.3018
0.1018 96 0.3297
0.1029 97 0.4068
0.1040 98 0.4193
0.1050 99 0.3848
0.1061 100 0.4427
0.1071 101 0.4027
0.1082 102 0.3423
0.1093 103 0.3328
0.1103 104 0.2959
0.1114 105 0.2651
0.1124 106 0.2867
0.1135 107 0.2483
0.1146 108 0.2358
0.1156 109 0.238
0.1167 110 0.3582
0.1177 111 0.3198
0.1188 112 0.3416
0.1199 113 0.2493
0.1209 114 0.399
0.1220 115 0.2705
0.1230 116 0.4283
0.1241 117 0.2453
0.1252 118 0.4051
0.1262 119 0.4689
0.1273 120 0.541
0.1283 121 0.5689
0.1294 122 0.6007
0.1305 123 0.5881
0.1315 124 0.5972
0.1326 125 0.5676
0.1337 126 0.5887
0.1347 127 0.6295
0.1358 128 0.4957
0.1368 129 0.2438
0.1379 130 0.4221
0.1390 131 0.3979
0.1400 132 0.3613
0.1411 133 0.3293
0.1421 134 0.5636
0.1432 135 0.4417
0.1443 136 0.0825
0.1453 137 0.0893
0.1464 138 0.0862
0.1474 139 0.0641
0.1485 140 0.0664
0.1496 141 0.0525
0.1506 142 0.0508
0.1517 143 0.0476
0.1527 144 0.0508
0.1538 145 0.0622
0.1549 146 0.0643
0.1559 147 0.0513
0.1570 148 0.0415
0.1580 149 0.0516
0.1591 150 0.0333
0.1602 151 0.0445
0.1612 152 0.0331
0.1623 153 0.0374
0.1634 154 0.0399
0.1644 155 0.0352
0.1655 156 0.0391
0.1665 157 0.0349
0.1676 158 0.0294
0.1687 159 0.0388
0.1697 160 0.0468
0.1708 161 0.0447
0.1718 162 0.0426
0.1729 163 0.0397
0.1740 164 0.0354
0.1750 165 0.0389
0.1761 166 0.0354
0.1771 167 0.0416
0.1782 168 0.0487
0.1793 169 0.0372
0.1803 170 0.036
0.1814 171 0.034
0.1824 172 0.034
0.1835 173 0.0391
0.1846 174 0.0307
0.1856 175 0.0382
0.1867 176 0.0363
0.1877 177 0.0437
0.1888 178 0.0365
0.1899 179 0.0398
0.1909 180 0.0519
0.1920 181 0.0568
0.1931 182 0.0515
0.1941 183 0.0465
0.1952 184 0.0459
0.1962 185 0.0498
0.1973 186 0.0503
0.1984 187 0.0523
0.1994 188 0.0503
0.2005 189 0.0446
0.2015 190 0.0466
0.2026 191 0.0404
0.2037 192 0.037
0.2047 193 0.0387
0.2058 194 0.0396
0.2068 195 0.0447
0.2079 196 0.0355
0.2090 197 0.0307
0.2100 198 0.0438
0.2111 199 0.0516
0.2121 200 0.0471
0.2132 201 0.0407
0.2143 202 0.032
0.2153 203 0.0389
0.2164 204 0.0424
0.2174 205 0.0449
0.2185 206 0.0395
0.2196 207 0.0386
0.2206 208 0.0342
0.2217 209 0.0334
0.2228 210 0.0368
0.2238 211 0.0436
0.2249 212 0.0432
0.2259 213 0.0461
0.2270 214 0.0454
0.2281 215 0.0468
0.2291 216 0.0396
0.2302 217 0.037
0.2312 218 0.03
0.2323 219 0.0253
0.2334 220 0.0309
0.2344 221 0.0323
0.2355 222 0.0228
0.2365 223 0.0375
0.2376 224 0.0366
0.2387 225 0.0292
0.2397 226 0.0293
0.2408 227 0.0314
0.2418 228 0.0255
0.2429 229 0.0349
0.2440 230 0.0391
0.2450 231 0.0209
0.2461 232 0.0277
0.2471 233 0.0276
0.2482 234 0.0352
0.2493 235 0.0328
0.2503 236 0.026
0.2514 237 0.0277
0.2525 238 0.0305
0.2535 239 0.0346
0.2546 240 0.0293
0.2556 241 0.0339
0.2567 242 0.0401
0.2578 243 0.0257
0.2588 244 0.0363
0.2599 245 0.0308
0.2609 246 0.0542
0.2620 247 0.2199
0.2631 248 0.1689
0.2641 249 0.1434
0.2652 250 0.128
0.2662 251 0.1084
0.2673 252 0.1225
0.2684 253 0.1174
0.2694 254 0.1201
0.2705 255 0.1024
0.2715 256 0.0911
0.2726 257 0.1071
0.2737 258 0.0985
0.2747 259 0.0921
0.2758 260 0.1616
0.2768 261 0.121
0.2779 262 0.1064
0.2790 263 0.1105
0.2800 264 0.0845
0.2811 265 0.136
0.2822 266 0.0865
0.2832 267 0.1084
0.2843 268 0.075
0.2853 269 0.0718
0.2864 270 0.0887
0.2875 271 0.0767
0.2885 272 0.0706
0.2896 273 0.0617
0.2906 274 0.0584
0.2917 275 0.0584
0.2928 276 0.0612
0.2938 277 0.0727
0.2949 278 0.0731
0.2959 279 0.0489
0.2970 280 0.057
0.2981 281 0.0662
0.2991 282 0.0676
0.3002 283 0.0482
0.3012 284 0.0551
0.3023 285 0.0659
0.3034 286 0.0498
0.3044 287 0.0549
0.3055 288 0.0393
0.3065 289 0.0462
0.3076 290 0.0494
0.3087 291 0.0428
0.3097 292 0.0431
0.3108 293 0.0507
0.3119 294 0.0421
0.3129 295 0.0504
0.3140 296 0.0379
0.3150 297 0.0456
0.3161 298 0.0418
0.3172 299 0.0433
0.3182 300 0.0469
0.3193 301 0.0473
0.3203 302 0.0588
0.3214 303 0.0428
0.3225 304 0.0494
0.3235 305 0.053
0.3246 306 0.0574
0.3256 307 0.057
0.3267 308 0.0518
0.3278 309 0.0414
0.3288 310 0.0456
0.3299 311 0.0464
0.3309 312 0.0891
0.3320 313 0.09
0.3331 314 0.1011
0.3341 315 0.0668
0.3352 316 0.0682
0.3363 317 0.0716
0.3373 318 0.0536
0.3384 319 0.0512
0.3394 320 0.0583
0.3405 321 0.0539
0.3416 322 0.0607
0.3426 323 0.0487
0.3437 324 0.0616
0.3447 325 0.0485
0.3458 326 0.0645
0.3469 327 0.0472
0.3479 328 0.8462
0.3490 329 0.3472
0.3500 330 0.108
0.3511 331 0.0827
0.3522 332 0.0727
0.3532 333 0.082
0.3543 334 0.0766
0.3553 335 0.0804
0.3564 336 0.0743
0.3575 337 0.0689
0.3585 338 0.0695
0.3596 339 0.0708
0.3606 340 0.0739
0.3617 341 0.0794
0.3628 342 0.0748
0.3638 343 0.0969
0.3649 344 0.0814
0.3660 345 0.0974
0.3670 346 0.0784
0.3681 347 0.078
0.3691 348 0.0945
0.3702 349 0.1016
0.3713 350 0.0993
0.3723 351 0.1161
0.3734 352 0.1071
0.3744 353 0.0981
0.3755 354 0.0946
0.3766 355 0.0917
0.3776 356 0.1076
0.3787 357 0.0756
0.3797 358 0.0733
0.3808 359 0.0528
0.3819 360 0.0551
0.3829 361 0.0684
0.3840 362 0.073
0.3850 363 0.0718
0.3861 364 0.0672
0.3872 365 0.0878
0.3882 366 0.057
0.3893 367 0.0498
0.3903 368 0.0583
0.3914 369 0.0691
0.3925 370 0.0671
0.3935 371 0.0595
0.3946 372 0.0494
0.3957 373 0.0676
0.3967 374 0.0581
0.3978 375 0.044
0.3988 376 0.0656
0.3999 377 0.0484
0.4010 378 0.0597
0.4020 379 0.0717
0.4031 380 0.0447
0.4041 381 0.0825
0.4052 382 0.1297
0.4063 383 0.1127
0.4073 384 0.0791
0.4084 385 0.0842
0.4094 386 0.0777
0.4105 387 0.0787
0.4116 388 0.0926
0.4126 389 0.1106
0.4137 390 0.106
0.4147 391 0.0833
0.4158 392 0.0777
0.4169 393 0.0867
0.4179 394 0.0681
0.4190 395 0.0774
0.4200 396 0.0806
0.4211 397 0.0932
0.4222 398 0.0765
0.4232 399 0.086
0.4243 400 0.0663
0.4254 401 0.0833
0.4264 402 0.0978
0.4275 403 0.0607
0.4285 404 0.0703
0.4296 405 0.0954
0.4307 406 0.0763
0.4317 407 0.069
0.4328 408 0.085
0.4338 409 0.0788
0.4349 410 0.0662
0.4360 411 0.0661
0.4370 412 0.0687
0.4381 413 0.0692
0.4391 414 0.083
0.4402 415 0.0551
0.4413 416 0.0616
0.4423 417 0.0894
0.4434 418 0.058
0.4444 419 0.0489
0.4455 420 0.059
0.4466 421 0.0732
0.4476 422 0.0568
0.4487 423 0.0635
0.4497 424 0.0526
0.4508 425 0.0521
0.4519 426 0.0909
0.4529 427 0.0599
0.4540 428 0.0622
0.4551 429 0.056
0.4561 430 0.0738
0.4572 431 0.0832
0.4582 432 0.098
0.4593 433 0.115
0.4604 434 0.1045
0.4614 435 0.1022
0.4625 436 0.1069
0.4635 437 0.1331
0.4646 438 0.103
0.4657 439 0.1248
0.4667 440 0.0882
0.4678 441 0.0672
0.4688 442 0.9033
0.4699 443 0.9551
0.4710 444 0.6253
0.4720 445 0.3963
0.4731 446 0.2817
0.4741 447 0.241
0.4752 448 0.2031
0.4763 449 0.1835
0.4773 450 0.1686
0.4784 451 0.2295
0.4794 452 0.1949
0.4805 453 0.1619
0.4816 454 0.1488
0.4826 455 0.1723
0.4837 456 0.1788
0.4848 457 0.1428
0.4858 458 0.1839
0.4869 459 0.1752
0.4879 460 0.1232
0.4890 461 0.1525
0.4901 462 0.1265
0.4911 463 0.1217
0.4922 464 0.154
0.4932 465 0.1511
0.4943 466 0.1181
0.4954 467 0.1268
0.4964 468 0.1041
0.4975 469 0.1058
0.4985 470 0.1373
0.4996 471 0.1385

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.3.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1
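
To approximate this environment, the listed versions can be pinned at install time; the commands below are a sketch and assume a CUDA 12.1 setup for the PyTorch build.

pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu121
pip install sentence-transformers==3.2.1 transformers==4.44.2 accelerate==1.1.1 datasets==2.21.0 tokenizers==0.19.1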

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}