---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:53499287
- loss:RZTKMatryoshka2dLoss
base_model: intfloat/multilingual-e5-base
widget:
- source_sentence: 'query: koton женская одежда'
sentences:
- 'passage: Жіночі блузи Koton Габарити С Стандарт (до 300x200x250 мм) Кількість
вантажних місць 1 Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина
Розмір 34 Стиль Повсякденний (casual) Колір Бежевий Матеріал Поліестер Доставка
Доставка в магазини ROZETKA'
- 'passage: Меблеві ручки ДС'
- 'passage: Жіночі штани Koton Габарити С Стандарт (до 300x200x250 мм) Кількість
вантажних місць 1 Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина
Розмір 36 Стиль Повсякденний (casual) Матеріал Бавовна Наявність товара по містах
Київ і область Доставка Доставка в магазини ROZETKA'
- source_sentence: 'query: koton женская одежда'
sentences:
- 'passage: Жіночі блузи Koton Габарити С Стандарт (до 300x200x250 мм) Кількість
вантажних місць 1 Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина
Розмір 36 Стиль Повсякденний (casual) Колір Зелений Матеріал Бавовна Матеріал
Віскоза Матеріал Поліестер Принт Смужка Доставка Доставка в магазини ROZETKA'
- 'passage: Мебельные ручки MVM Гарантия 12 месяцев Страна регистрации бренда Украина
Количество предметов, шт 1 Страна-производитель товара Китай Тип Ручки Вид Мебельные
ручки Тип ручки Кнопка'
- 'passage: Блузка жіноча Koton 7KAK63013EW 34 Marine (8681456231631)'
- source_sentence: 'query: redmi note 5 чехол'
sentences:
- 'passage: Женские блузы Koton Габариты_old C Стандарт (до 300x200x250 мм) Количество
грузовых мест 1 Страна регистрации бренда Турция Страна-производитель товара Турция
Размер 38 Цвет Белый Цвет Черный Материал Эластан Материал Полиэстер Материал
Вискоза Наличие товара по городам Киев и область Доставка Доставка в магазины
ROZETKA'
- 'passage: Блузка женская Koton 8YAK68470PW-000 40 White (8681953271741)'
- 'passage: Чехлы для мобильных телефонов Nillkin Материал Пластик Цвет White Форм-фактор
Панель Совместимый бренд Xiaomi'
- source_sentence: 'query: ручки для мебели'
sentences:
- 'passage: Ручки для меблів DR 52/96 ЛАТУНЬ AB'
- 'passage: Жіночі блузи Koton Габарити С Стандарт (до 300x200x250 мм) Кількість
вантажних місць 1 Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина
Розмір 34 Стиль Повсякденний (casual) Колір Чорний Матеріал Поліестер Доставка
Доставка в магазини ROZETKA'
- 'passage: Меблеві ручки MVM Гарантія 12 місяців Країна реєстрації бренда Україна
Країна-виробник товару Китай Тип Ручки Різновид Меблеві ручки Тип ручки Скоба'
- source_sentence: 'query: koton женская одежда'
sentences:
- 'passage: Женские блузы Koton Габариты_old C Стандарт (до 300x200x250 мм) Количество
грузовых мест 1 Страна регистрации бренда Турция Страна-производитель товара Турция
Размер M Стиль Повседневный (casual) Цвет Бежевый Материал Полиэстер Материал
Эластан Доставка Доставка в магазины ROZETKA'
- 'passage: Головные устройства Podofo Гарантия 12 месяцев официальной гарантии
от производителя'
- 'passage: Жіночі штани Koton Габарити С Стандарт (до 300x200x250 мм) Кількість
вантажних місць 1 Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина
Розмір M Стиль Повсякденний (casual) Колір Зелений Моделі Кюлоти Доставка Доставка
в магазини ROZETKA'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- dot_accuracy_10
- dot_precision_10
- dot_recall_10
- dot_ndcg_10
- dot_mrr_10
- dot_map_60
- dot_accuracy_1
- dot_accuracy_3
- dot_accuracy_5
- dot_precision_1
- dot_precision_3
- dot_precision_5
- dot_recall_1
- dot_recall_3
- dot_recall_5
- dot_map_100
- dot_ndcg_1
- dot_mrr_1
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-base
results:
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'validation matryoshka dim 768 '
type: validation--matryoshka_dim-768--
metrics:
- type: dot_accuracy_10
value: 0.471532222776257
name: Dot Accuracy 10
- type: dot_precision_10
value: 0.08159134620276996
name: Dot Precision 10
- type: dot_recall_10
value: 0.34224243157429496
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.23821978834610905
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.23429693389396777
name: Dot Mrr 10
- type: dot_map_60
value: 0.2036515780475901
name: Dot Map 60
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: bm full
type: bm-full
metrics:
- type: dot_accuracy_1
value: 0.6861626248216833
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7964812173086068
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8525915359010937
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9063242986210176
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6861626248216833
name: Dot Precision 1
- type: dot_precision_3
value: 0.6771279125059438
name: Dot Precision 3
- type: dot_precision_5
value: 0.6605801236329054
name: Dot Precision 5
- type: dot_precision_10
value: 0.6134094151212554
name: Dot Precision 10
- type: dot_recall_1
value: 0.04668966384307371
name: Dot Recall 1
- type: dot_recall_3
value: 0.13507182853716604
name: Dot Recall 3
- type: dot_recall_5
value: 0.21306175070934336
name: Dot Recall 5
- type: dot_recall_10
value: 0.36601551837301194
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.65600709607968
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7509249824513727
name: Dot Mrr 10
- type: dot_map_100
value: 0.6072416533315765
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core uk title
type: core-uk-title
metrics:
- type: dot_accuracy_1
value: 0.7887139107611548
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.926509186351706
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9671916010498688
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9908136482939632
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7887139107611548
name: Dot Precision 1
- type: dot_precision_3
value: 0.7195975503062116
name: Dot Precision 3
- type: dot_precision_5
value: 0.6291338582677166
name: Dot Precision 5
- type: dot_precision_10
value: 0.3898950131233596
name: Dot Precision 10
- type: dot_recall_1
value: 0.24053972420114153
name: Dot Recall 1
- type: dot_recall_3
value: 0.5723758488522268
name: Dot Recall 3
- type: dot_recall_5
value: 0.7821485855934673
name: Dot Recall 5
- type: dot_recall_10
value: 0.9340447027454901
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8576447362007393
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8626879973336663
name: Dot Mrr 10
- type: dot_map_100
value: 0.8122655107656657
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core ru title
type: core-ru-title
metrics:
- type: dot_accuracy_1
value: 0.800524934383202
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9225721784776902
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9645669291338582
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9908136482939632
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.800524934383202
name: Dot Precision 1
- type: dot_precision_3
value: 0.7239720034995627
name: Dot Precision 3
- type: dot_precision_5
value: 0.6338582677165354
name: Dot Precision 5
- type: dot_precision_10
value: 0.39041994750656167
name: Dot Precision 10
- type: dot_recall_1
value: 0.24474649002208057
name: Dot Recall 1
- type: dot_recall_3
value: 0.5728575594717327
name: Dot Recall 3
- type: dot_recall_5
value: 0.7877208057326167
name: Dot Recall 5
- type: dot_recall_10
value: 0.9334697746115069
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8609903761426677
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8679654626505019
name: Dot Mrr 10
- type: dot_map_100
value: 0.8173926974530602
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core uk options
type: core-uk-options
metrics:
- type: dot_accuracy_1
value: 0.6666666666666666
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8451443569553806
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9028871391076115
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9606299212598425
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6666666666666666
name: Dot Precision 1
- type: dot_precision_3
value: 0.6251093613298337
name: Dot Precision 3
- type: dot_precision_5
value: 0.5585301837270341
name: Dot Precision 5
- type: dot_precision_10
value: 0.36811023622047245
name: Dot Precision 10
- type: dot_recall_1
value: 0.1954453609965421
name: Dot Recall 1
- type: dot_recall_3
value: 0.48256624171978496
name: Dot Recall 3
- type: dot_recall_5
value: 0.6772981502312211
name: Dot Recall 5
- type: dot_recall_10
value: 0.8667755072282631
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7687933316395861
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7696902470524517
name: Dot Mrr 10
- type: dot_map_100
value: 0.7160782009013741
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core ru options
type: core-ru-options
metrics:
- type: dot_accuracy_1
value: 0.6758530183727034
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8503937007874016
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9094488188976378
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9593175853018373
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6758530183727034
name: Dot Precision 1
- type: dot_precision_3
value: 0.6251093613298337
name: Dot Precision 3
- type: dot_precision_5
value: 0.5593175853018373
name: Dot Precision 5
- type: dot_precision_10
value: 0.3656167979002625
name: Dot Precision 10
- type: dot_recall_1
value: 0.1985512227638212
name: Dot Recall 1
- type: dot_recall_3
value: 0.485936132983377
name: Dot Recall 3
- type: dot_recall_5
value: 0.6818100862392201
name: Dot Recall 5
- type: dot_recall_10
value: 0.8625692621755614
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7675707330888574
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7727190351206101
name: Dot Mrr 10
- type: dot_map_100
value: 0.7162203481785496
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options uk title
type: options-uk-title
metrics:
- type: dot_accuracy_1
value: 0.8228155339805825
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9441747572815534
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9733009708737864
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9927184466019418
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8228155339805825
name: Dot Precision 1
- type: dot_precision_3
value: 0.7483818770226538
name: Dot Precision 3
- type: dot_precision_5
value: 0.5893203883495145
name: Dot Precision 5
- type: dot_precision_10
value: 0.3378640776699029
name: Dot Precision 10
- type: dot_recall_1
value: 0.2576687471104947
name: Dot Recall 1
- type: dot_recall_3
value: 0.6688973647711511
name: Dot Recall 3
- type: dot_recall_5
value: 0.8499479889042997
name: Dot Recall 5
- type: dot_recall_10
value: 0.9645428802588997
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8845008693025476
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.885860687316998
name: Dot Mrr 10
- type: dot_map_100
value: 0.8301493504859574
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options ru title
type: options-ru-title
metrics:
- type: dot_accuracy_1
value: 0.8203883495145631
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9368932038834952
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9757281553398058
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9927184466019418
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8203883495145631
name: Dot Precision 1
- type: dot_precision_3
value: 0.75
name: Dot Precision 3
- type: dot_precision_5
value: 0.5898058252427184
name: Dot Precision 5
- type: dot_precision_10
value: 0.3366504854368932
name: Dot Precision 10
- type: dot_recall_1
value: 0.25544382801664356
name: Dot Recall 1
- type: dot_recall_3
value: 0.6682096625057791
name: Dot Recall 3
- type: dot_recall_5
value: 0.8495839112343966
name: Dot Recall 5
- type: dot_recall_10
value: 0.9612864077669901
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8831806165585419
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8836492525812916
name: Dot Mrr 10
- type: dot_map_100
value: 0.8310617357057324
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options uk options
type: options-uk-options
metrics:
- type: dot_accuracy_1
value: 0.6893203883495146
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8640776699029126
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9150485436893204
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9587378640776699
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6893203883495146
name: Dot Precision 1
- type: dot_precision_3
value: 0.6318770226537216
name: Dot Precision 3
- type: dot_precision_5
value: 0.5131067961165048
name: Dot Precision 5
- type: dot_precision_10
value: 0.30970873786407765
name: Dot Precision 10
- type: dot_recall_1
value: 0.21082408691631993
name: Dot Recall 1
- type: dot_recall_3
value: 0.5591308368007397
name: Dot Recall 3
- type: dot_recall_5
value: 0.7383032824780397
name: Dot Recall 5
- type: dot_recall_10
value: 0.8790250809061488
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7782100511048052
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.782757358606873
name: Dot Mrr 10
- type: dot_map_100
value: 0.7187646574270111
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options ru options
type: options-ru-options
metrics:
- type: dot_accuracy_1
value: 0.691747572815534
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8713592233009708
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9174757281553398
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9587378640776699
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.691747572815534
name: Dot Precision 1
- type: dot_precision_3
value: 0.634304207119741
name: Dot Precision 3
- type: dot_precision_5
value: 0.5121359223300971
name: Dot Precision 5
- type: dot_precision_10
value: 0.30898058252427185
name: Dot Precision 10
- type: dot_recall_1
value: 0.21211858529819694
name: Dot Recall 1
- type: dot_recall_3
value: 0.5634997688395746
name: Dot Recall 3
- type: dot_recall_5
value: 0.7373526352288489
name: Dot Recall 5
- type: dot_recall_10
value: 0.8778114886731391
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.778811344668198
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7868854985359838
name: Dot Mrr 10
- type: dot_map_100
value: 0.7187745303836717
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms uk title
type: rusisms-uk-title
metrics:
- type: dot_accuracy_1
value: 0.8307692307692308
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9076923076923077
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9230769230769231
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9384615384615385
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8307692307692308
name: Dot Precision 1
- type: dot_precision_3
value: 0.7871794871794873
name: Dot Precision 3
- type: dot_precision_5
value: 0.7353846153846153
name: Dot Precision 5
- type: dot_precision_10
value: 0.64
name: Dot Precision 10
- type: dot_recall_1
value: 0.15370877634922128
name: Dot Recall 1
- type: dot_recall_3
value: 0.3588767193522337
name: Dot Recall 3
- type: dot_recall_5
value: 0.4859039533050944
name: Dot Recall 5
- type: dot_recall_10
value: 0.7001575879745761
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8430087861918706
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8716117216117216
name: Dot Mrr 10
- type: dot_map_100
value: 0.833319403026344
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms ru title
type: rusisms-ru-title
metrics:
- type: dot_accuracy_1
value: 0.8538461538461538
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8923076923076924
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9153846153846154
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9384615384615385
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8538461538461538
name: Dot Precision 1
- type: dot_precision_3
value: 0.7871794871794873
name: Dot Precision 3
- type: dot_precision_5
value: 0.7415384615384616
name: Dot Precision 5
- type: dot_precision_10
value: 0.6361538461538462
name: Dot Precision 10
- type: dot_recall_1
value: 0.1672829799234249
name: Dot Recall 1
- type: dot_recall_3
value: 0.35084530632082067
name: Dot Recall 3
- type: dot_recall_5
value: 0.48702574817688926
name: Dot Recall 5
- type: dot_recall_10
value: 0.6996719624889505
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8448495673199355
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8784188034188034
name: Dot Mrr 10
- type: dot_map_100
value: 0.8405111806620816
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms uk options
type: rusisms-uk-options
metrics:
- type: dot_accuracy_1
value: 0.6846153846153846
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7846153846153846
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8307692307692308
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9076923076923077
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6846153846153846
name: Dot Precision 1
- type: dot_precision_3
value: 0.6743589743589743
name: Dot Precision 3
- type: dot_precision_5
value: 0.6415384615384616
name: Dot Precision 5
- type: dot_precision_10
value: 0.5746153846153846
name: Dot Precision 10
- type: dot_recall_1
value: 0.12884582239571002
name: Dot Recall 1
- type: dot_recall_3
value: 0.31245321810288107
name: Dot Recall 3
- type: dot_recall_5
value: 0.42559222255218715
name: Dot Recall 5
- type: dot_recall_10
value: 0.6453739612985345
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7480936271212298
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7509493284493283
name: Dot Mrr 10
- type: dot_map_100
value: 0.7425708839632604
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms ru options
type: rusisms-ru-options
metrics:
- type: dot_accuracy_1
value: 0.7307692307692307
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.823076923076923
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8846153846153846
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7307692307692307
name: Dot Precision 1
- type: dot_precision_3
value: 0.676923076923077
name: Dot Precision 3
- type: dot_precision_5
value: 0.663076923076923
name: Dot Precision 5
- type: dot_precision_10
value: 0.5753846153846154
name: Dot Precision 10
- type: dot_recall_1
value: 0.13837573692562458
name: Dot Recall 1
- type: dot_recall_3
value: 0.3179817075013395
name: Dot Recall 3
- type: dot_recall_5
value: 0.4479109659003423
name: Dot Recall 5
- type: dot_recall_10
value: 0.6397850203249781
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7591370002283114
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7871886446886445
name: Dot Mrr 10
- type: dot_map_100
value: 0.7544740852251903
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected uk title
type: rusisms_corrected-uk-title
metrics:
- type: dot_accuracy_1
value: 0.9230769230769231
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9769230769230769
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9846153846153847
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9923076923076923
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.9230769230769231
name: Dot Precision 1
- type: dot_precision_3
value: 0.8358974358974359
name: Dot Precision 3
- type: dot_precision_5
value: 0.7969230769230768
name: Dot Precision 5
- type: dot_precision_10
value: 0.6753846153846155
name: Dot Precision 10
- type: dot_recall_1
value: 0.19286190582843776
name: Dot Recall 1
- type: dot_recall_3
value: 0.39436099956477483
name: Dot Recall 3
- type: dot_recall_5
value: 0.5413524032787159
name: Dot Recall 5
- type: dot_recall_10
value: 0.7549475685254261
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.9158457183274707
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.9502564102564103
name: Dot Mrr 10
- type: dot_map_100
value: 0.9056120868131005
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected ru title
type: rusisms_corrected-ru-title
metrics:
- type: dot_accuracy_1
value: 0.9153846153846154
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9615384615384616
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9615384615384616
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9923076923076923
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.9153846153846154
name: Dot Precision 1
- type: dot_precision_3
value: 0.8282051282051283
name: Dot Precision 3
- type: dot_precision_5
value: 0.7815384615384615
name: Dot Precision 5
- type: dot_precision_10
value: 0.6738461538461538
name: Dot Precision 10
- type: dot_recall_1
value: 0.1927395282060601
name: Dot Recall 1
- type: dot_recall_3
value: 0.38262803481710417
name: Dot Recall 3
- type: dot_recall_5
value: 0.5193256800019926
name: Dot Recall 5
- type: dot_recall_10
value: 0.751256408464297
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.9093765276897922
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.9432234432234431
name: Dot Mrr 10
- type: dot_map_100
value: 0.8989008077917762
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected uk options
type: rusisms_corrected-uk-options
metrics:
- type: dot_accuracy_1
value: 0.8
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8846153846153846
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9230769230769231
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9923076923076923
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8
name: Dot Precision 1
- type: dot_precision_3
value: 0.7461538461538462
name: Dot Precision 3
- type: dot_precision_5
value: 0.7107692307692308
name: Dot Precision 5
- type: dot_precision_10
value: 0.6469230769230769
name: Dot Precision 10
- type: dot_recall_1
value: 0.1613825127584874
name: Dot Recall 1
- type: dot_recall_3
value: 0.33880570690421896
name: Dot Recall 3
- type: dot_recall_5
value: 0.4734387401444646
name: Dot Recall 5
- type: dot_recall_10
value: 0.7305622917794682
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8466323426002027
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8580372405372405
name: Dot Mrr 10
- type: dot_map_100
value: 0.8358041815803707
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected ru options
type: rusisms_corrected-ru-options
metrics:
- type: dot_accuracy_1
value: 0.8153846153846154
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9384615384615385
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9692307692307692
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9923076923076923
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8153846153846154
name: Dot Precision 1
- type: dot_precision_3
value: 0.7769230769230769
name: Dot Precision 3
- type: dot_precision_5
value: 0.7430769230769231
name: Dot Precision 5
- type: dot_precision_10
value: 0.6469230769230769
name: Dot Precision 10
- type: dot_recall_1
value: 0.1617457606217352
name: Dot Recall 1
- type: dot_recall_3
value: 0.3647951242636054
name: Dot Recall 3
- type: dot_recall_5
value: 0.5108016550073794
name: Dot Recall 5
- type: dot_recall_10
value: 0.7309236117460514
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8600680223019087
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8773076923076921
name: Dot Mrr 10
- type: dot_map_100
value: 0.8504375427082684
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos uk title
type: core_typos-uk-title
metrics:
- type: dot_accuracy_1
value: 0.7020997375328084
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8530183727034121
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9094488188976378
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9566929133858267
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7020997375328084
name: Dot Precision 1
- type: dot_precision_3
value: 0.6373578302712161
name: Dot Precision 3
- type: dot_precision_5
value: 0.5666666666666667
name: Dot Precision 5
- type: dot_precision_10
value: 0.35971128608923886
name: Dot Precision 10
- type: dot_recall_1
value: 0.20417447819022624
name: Dot Recall 1
- type: dot_recall_3
value: 0.5012982752155981
name: Dot Recall 3
- type: dot_recall_5
value: 0.6996099445902597
name: Dot Recall 5
- type: dot_recall_10
value: 0.857386055909678
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7737050537251563
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7894914177394494
name: Dot Mrr 10
- type: dot_map_100
value: 0.7246307559393186
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos ru title
type: core_typos-ru-title
metrics:
- type: dot_accuracy_1
value: 0.7178477690288714
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8543307086614174
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9094488188976378
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9593175853018373
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7178477690288714
name: Dot Precision 1
- type: dot_precision_3
value: 0.6430446194225722
name: Dot Precision 3
- type: dot_precision_5
value: 0.568503937007874
name: Dot Precision 5
- type: dot_precision_10
value: 0.3603674540682415
name: Dot Precision 10
- type: dot_recall_1
value: 0.2075495771361913
name: Dot Recall 1
- type: dot_recall_3
value: 0.5054097404491106
name: Dot Recall 3
- type: dot_recall_5
value: 0.7007410532016832
name: Dot Recall 5
- type: dot_recall_10
value: 0.8613824313627464
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7780047051865264
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7987939007624044
name: Dot Mrr 10
- type: dot_map_100
value: 0.727822322472804
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos uk options
type: core_typos-uk-options
metrics:
- type: dot_accuracy_1
value: 0.562992125984252
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7506561679790026
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8110236220472441
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.8832020997375328
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.562992125984252
name: Dot Precision 1
- type: dot_precision_3
value: 0.5323709536307961
name: Dot Precision 3
- type: dot_precision_5
value: 0.4745406824146982
name: Dot Precision 5
- type: dot_precision_10
value: 0.32020997375328086
name: Dot Precision 10
- type: dot_recall_1
value: 0.16236980794067407
name: Dot Recall 1
- type: dot_recall_3
value: 0.4062455734699829
name: Dot Recall 3
- type: dot_recall_5
value: 0.571470753655793
name: Dot Recall 5
- type: dot_recall_10
value: 0.7552811106944965
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.6583622139042691
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.6701943507061612
name: Dot Mrr 10
- type: dot_map_100
value: 0.6088867550722008
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos ru options
type: core_typos-ru-options
metrics:
- type: dot_accuracy_1
value: 0.5656167979002624
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7506561679790026
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8136482939632546
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.8910761154855643
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.5656167979002624
name: Dot Precision 1
- type: dot_precision_3
value: 0.5284339457567804
name: Dot Precision 3
- type: dot_precision_5
value: 0.4732283464566929
name: Dot Precision 5
- type: dot_precision_10
value: 0.32086614173228345
name: Dot Precision 10
- type: dot_recall_1
value: 0.16283849935424738
name: Dot Recall 1
- type: dot_recall_3
value: 0.4039828354788985
name: Dot Recall 3
- type: dot_recall_5
value: 0.5723029412990043
name: Dot Recall 5
- type: dot_recall_10
value: 0.7582671957671957
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.6596609305267019
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.6717733200016661
name: Dot Mrr 10
- type: dot_map_100
value: 0.6085857438998208
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 768 '
type: bm-full--matryoshka_dim-768--
metrics:
- type: dot_accuracy_1
value: 0.6861626248216833
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6861626248216833
name: Dot Precision 1
- type: dot_recall_1
value: 0.04668966384307371
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6861626248216833
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6861626248216833
name: Dot Mrr 1
- type: dot_map_100
value: 0.6072416533315765
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 512 '
type: bm-full--matryoshka_dim-512--
metrics:
- type: dot_accuracy_1
value: 0.6785544460294817
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6785544460294817
name: Dot Precision 1
- type: dot_recall_1
value: 0.04616228207848423
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6785544460294817
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6785544460294817
name: Dot Mrr 1
- type: dot_map_100
value: 0.6022887817250344
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 256 '
type: bm-full--matryoshka_dim-256--
metrics:
- type: dot_accuracy_1
value: 0.6704707560627675
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6704707560627675
name: Dot Precision 1
- type: dot_recall_1
value: 0.045373533996273384
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6704707560627675
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6704707560627675
name: Dot Mrr 1
- type: dot_map_100
value: 0.5908672963850514
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 128 '
type: bm-full--matryoshka_dim-128--
metrics:
- type: dot_accuracy_1
value: 0.6562054208273894
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6562054208273894
name: Dot Precision 1
- type: dot_recall_1
value: 0.04344229579260482
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6562054208273894
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6562054208273894
name: Dot Mrr 1
- type: dot_map_100
value: 0.5607118898662458
name: Dot Map 100
---
# SentenceTransformer based on intfloat/multilingual-e5-base
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) on the rozetka_positive_pairs dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
    - rozetka_positive_pairs
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
RZTKSentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("yklymchuk-rztk/multilingual-e5-base-matryoshka2d-mnr-10")
# Run inference
sentences = [
'query: koton женская одежда',
'passage: Жіночі штани Koton Габарити С Стандарт (до 300x200x250 мм) Кількість вантажних місць 1 Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина Розмір M Стиль Повсякденний (casual) Колір Зелений Моделі Кюлоти Доставка Доставка в магазини ROZETKA',
'passage: Женские блузы Koton Габариты_old C Стандарт (до 300x200x250 мм) Количество грузовых мест 1 Страна регистрации бренда Турция Страна-производитель товара Турция Размер M Стиль Повседневный (casual) Цвет Бежевый Материал Полиэстер Материал Эластан Доставка Доставка в магазины ROZETKA',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
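Because the model was trained with Matryoshka dimensions 768, 512, 256 and 128 (see the loss parameters under Training Details), embeddings can plausibly be shortened to one of those sizes before indexing. The sketch below is a minimal illustration, not code from the training repository: `truncate_embeddings` is a hypothetical helper, and random vectors stand in for the output of `model.encode`. It keeps the first `dim` components and re-normalizes each row so that dot products stay on the same scale.

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row and L2-normalize the result."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Stand-in for `model.encode(sentences)`: 3 unit-length vectors of size 768.
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 768)).astype(np.float32)
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = truncate_embeddings(full, 256)
scores = small @ small.T  # dot-product similarity on the truncated vectors
print(small.shape, scores.shape)
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument on the `SentenceTransformer` constructor. Whether it behaves as expected with this model's custom class has not been verified here, and truncated outputs would still need re-normalizing as shown above.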
## Evaluation
### Metrics
#### RZTKInformation Retrieval
* Dataset: `validation--matryoshka_dim-768--`
* Evaluated with `sentence_transformers_training.evaluation.information_retrieval_evaluator.RZTKInformationRetrievalEvaluator`
| Metric | Value |
|:-----------------|:-----------|
| dot_accuracy_10 | 0.4715 |
| dot_precision_10 | 0.0816 |
| dot_recall_10 | 0.3422 |
| **dot_ndcg_10** | **0.2382** |
| dot_mrr_10 | 0.2343 |
| dot_map_60 | 0.2037 |
#### RZTKInformation Retrieval
* Datasets: `bm-full`, `core-uk-title`, `core-ru-title`, `core-uk-options`, `core-ru-options`, `options-uk-title`, `options-ru-title`, `options-uk-options`, `options-ru-options`, `rusisms-uk-title`, `rusisms-ru-title`, `rusisms-uk-options`, `rusisms-ru-options`, `rusisms_corrected-uk-title`, `rusisms_corrected-ru-title`, `rusisms_corrected-uk-options`, `rusisms_corrected-ru-options`, `core_typos-uk-title`, `core_typos-ru-title`, `core_typos-uk-options` and `core_typos-ru-options`
* Evaluated with `sentence_transformers_training.evaluation.information_retrieval_evaluator.RZTKInformationRetrievalEvaluator`
| Metric | bm-full | core-uk-title | core-ru-title | core-uk-options | core-ru-options | options-uk-title | options-ru-title | options-uk-options | options-ru-options | rusisms-uk-title | rusisms-ru-title | rusisms-uk-options | rusisms-ru-options | rusisms_corrected-uk-title | rusisms_corrected-ru-title | rusisms_corrected-uk-options | rusisms_corrected-ru-options | core_typos-uk-title | core_typos-ru-title | core_typos-uk-options | core_typos-ru-options |
|:-----------------|:----------|:--------------|:--------------|:----------------|:----------------|:-----------------|:-----------------|:-------------------|:-------------------|:-----------------|:-----------------|:-------------------|:-------------------|:---------------------------|:---------------------------|:-----------------------------|:-----------------------------|:--------------------|:--------------------|:----------------------|:----------------------|
| dot_accuracy_1 | 0.6862 | 0.7887 | 0.8005 | 0.6667 | 0.6759 | 0.8228 | 0.8204 | 0.6893 | 0.6917 | 0.8308 | 0.8538 | 0.6846 | 0.7308 | 0.9231 | 0.9154 | 0.8 | 0.8154 | 0.7021 | 0.7178 | 0.563 | 0.5656 |
| dot_accuracy_3 | 0.7965 | 0.9265 | 0.9226 | 0.8451 | 0.8504 | 0.9442 | 0.9369 | 0.8641 | 0.8714 | 0.9077 | 0.8923 | 0.7846 | 0.8231 | 0.9769 | 0.9615 | 0.8846 | 0.9385 | 0.853 | 0.8543 | 0.7507 | 0.7507 |
| dot_accuracy_5 | 0.8526 | 0.9672 | 0.9646 | 0.9029 | 0.9094 | 0.9733 | 0.9757 | 0.915 | 0.9175 | 0.9231 | 0.9154 | 0.8308 | 0.8846 | 0.9846 | 0.9615 | 0.9231 | 0.9692 | 0.9094 | 0.9094 | 0.811 | 0.8136 |
| dot_accuracy_10 | 0.9063 | 0.9908 | 0.9908 | 0.9606 | 0.9593 | 0.9927 | 0.9927 | 0.9587 | 0.9587 | 0.9385 | 0.9385 | 0.9077 | 0.9 | 0.9923 | 0.9923 | 0.9923 | 0.9923 | 0.9567 | 0.9593 | 0.8832 | 0.8911 |
| dot_precision_1 | 0.6862 | 0.7887 | 0.8005 | 0.6667 | 0.6759 | 0.8228 | 0.8204 | 0.6893 | 0.6917 | 0.8308 | 0.8538 | 0.6846 | 0.7308 | 0.9231 | 0.9154 | 0.8 | 0.8154 | 0.7021 | 0.7178 | 0.563 | 0.5656 |
| dot_precision_3 | 0.6771 | 0.7196 | 0.724 | 0.6251 | 0.6251 | 0.7484 | 0.75 | 0.6319 | 0.6343 | 0.7872 | 0.7872 | 0.6744 | 0.6769 | 0.8359 | 0.8282 | 0.7462 | 0.7769 | 0.6374 | 0.643 | 0.5324 | 0.5284 |
| dot_precision_5 | 0.6606 | 0.6291 | 0.6339 | 0.5585 | 0.5593 | 0.5893 | 0.5898 | 0.5131 | 0.5121 | 0.7354 | 0.7415 | 0.6415 | 0.6631 | 0.7969 | 0.7815 | 0.7108 | 0.7431 | 0.5667 | 0.5685 | 0.4745 | 0.4732 |
| dot_precision_10 | 0.6134 | 0.3899 | 0.3904 | 0.3681 | 0.3656 | 0.3379 | 0.3367 | 0.3097 | 0.309 | 0.64 | 0.6362 | 0.5746 | 0.5754 | 0.6754 | 0.6738 | 0.6469 | 0.6469 | 0.3597 | 0.3604 | 0.3202 | 0.3209 |
| dot_recall_1 | 0.0467 | 0.2405 | 0.2447 | 0.1954 | 0.1986 | 0.2577 | 0.2554 | 0.2108 | 0.2121 | 0.1537 | 0.1673 | 0.1288 | 0.1384 | 0.1929 | 0.1927 | 0.1614 | 0.1617 | 0.2042 | 0.2075 | 0.1624 | 0.1628 |
| dot_recall_3 | 0.1351 | 0.5724 | 0.5729 | 0.4826 | 0.4859 | 0.6689 | 0.6682 | 0.5591 | 0.5635 | 0.3589 | 0.3508 | 0.3125 | 0.318 | 0.3944 | 0.3826 | 0.3388 | 0.3648 | 0.5013 | 0.5054 | 0.4062 | 0.404 |
| dot_recall_5 | 0.2131 | 0.7821 | 0.7877 | 0.6773 | 0.6818 | 0.8499 | 0.8496 | 0.7383 | 0.7374 | 0.4859 | 0.487 | 0.4256 | 0.4479 | 0.5414 | 0.5193 | 0.4734 | 0.5108 | 0.6996 | 0.7007 | 0.5715 | 0.5723 |
| dot_recall_10 | 0.366 | 0.934 | 0.9335 | 0.8668 | 0.8626 | 0.9645 | 0.9613 | 0.879 | 0.8778 | 0.7002 | 0.6997 | 0.6454 | 0.6398 | 0.7549 | 0.7513 | 0.7306 | 0.7309 | 0.8574 | 0.8614 | 0.7553 | 0.7583 |
| **dot_ndcg_10** | **0.656** | **0.8576** | **0.861** | **0.7688** | **0.7676** | **0.8845** | **0.8832** | **0.7782** | **0.7788** | **0.843** | **0.8448** | **0.7481** | **0.7591** | **0.9158** | **0.9094** | **0.8466** | **0.8601** | **0.7737** | **0.778** | **0.6584** | **0.6597** |
| dot_mrr_10 | 0.7509 | 0.8627 | 0.868 | 0.7697 | 0.7727 | 0.8859 | 0.8836 | 0.7828 | 0.7869 | 0.8716 | 0.8784 | 0.7509 | 0.7872 | 0.9503 | 0.9432 | 0.858 | 0.8773 | 0.7895 | 0.7988 | 0.6702 | 0.6718 |
| dot_map_100 | 0.6072 | 0.8123 | 0.8174 | 0.7161 | 0.7162 | 0.8301 | 0.8311 | 0.7188 | 0.7188 | 0.8333 | 0.8405 | 0.7426 | 0.7545 | 0.9056 | 0.8989 | 0.8358 | 0.8504 | 0.7246 | 0.7278 | 0.6089 | 0.6086 |
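The metric names above follow a `<metric>_<k>` pattern and are computed on dot-product rankings. As a rough guide to how such values are usually derived for a single query, here is a minimal sketch with binary relevance. It is not the `RZTKInformationRetrievalEvaluator` implementation; the function name and the toy document ids are made up for illustration.

```python
import math

def metrics_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance retrieval metrics for one query, cut off at rank k.

    Assumes `relevant_ids` is a non-empty set.
    """
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    accuracy = 1.0 if any(hits) else 0.0      # at least one relevant doc in the top k
    precision = sum(hits) / k                 # share of the top k that is relevant
    recall = sum(hits) / len(relevant_ids)    # share of relevant docs retrieved

    # Reciprocal rank of the first relevant document (0 if none in the top k).
    mrr = next((1.0 / (rank + 1) for rank, hit in enumerate(hits) if hit), 0.0)

    # DCG with a log2 discount, normalized by the best achievable DCG at k.
    dcg = sum(hit / math.log2(rank + 2) for rank, hit in enumerate(hits))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mrr": mrr, "ndcg": ndcg}

# Toy query: 20 relevant products, and the top-ranked result is one of them.
relevant = {f"p{i}" for i in range(20)}
ranking = ["p3", "x1", "x2"]
result = metrics_at_k(ranking, relevant, k=1)
print(result)
```

When a query has many relevant documents, recall at 1 cannot exceed 1 divided by the number of relevant documents, even with a perfect top hit. That would be consistent with `bm-full` reporting a `dot_recall_1` of 0.0467 next to a `dot_accuracy_1` of 0.6862.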
#### RZTKInformation Retrieval
* Datasets: `bm-full--matryoshka_dim-768--`, `bm-full--matryoshka_dim-512--`, `bm-full--matryoshka_dim-256--` and `bm-full--matryoshka_dim-128--`
* Evaluated with `sentence_transformers_training.evaluation.information_retrieval_evaluator.RZTKInformationRetrievalEvaluator`
| Metric | bm-full--matryoshka_dim-768-- | bm-full--matryoshka_dim-512-- | bm-full--matryoshka_dim-256-- | bm-full--matryoshka_dim-128-- |
|:----------------|:------------------------------|:------------------------------|:------------------------------|:------------------------------|
| dot_accuracy_1 | 0.6862 | 0.6786 | 0.6705 | 0.6562 |
| dot_precision_1 | 0.6862 | 0.6786 | 0.6705 | 0.6562 |
| dot_recall_1 | 0.0467 | 0.0462 | 0.0454 | 0.0434 |
| **dot_ndcg_1** | **0.6862** | **0.6786** | **0.6705** | **0.6562** |
| dot_mrr_1 | 0.6862 | 0.6786 | 0.6705 | 0.6562 |
| dot_map_100 | 0.6072 | 0.6023 | 0.5909 | 0.5607 |
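To make the size and quality trade-off in this table easier to read, the following short calculation expresses each truncated dimension's `dot_ndcg_1` as a fraction of the 768-dimensional score. The numbers are copied from the table; nothing is re-evaluated.

```python
# dot_ndcg_1 on bm-full, copied from the table above.
ndcg_at_1 = {768: 0.6862, 512: 0.6786, 256: 0.6705, 128: 0.6562}

baseline = ndcg_at_1[768]
# Fraction of the full-size score kept at each truncated dimension.
retained_by_dim = {dim: score / baseline for dim, score in ndcg_at_1.items()}

for dim, retained in retained_by_dim.items():
    size_ratio = dim / 768  # fraction of storage needed per vector
    print(f"dim={dim:>3}  ndcg@1={ndcg_at_1[dim]:.4f}  "
          f"retained={retained:.1%}  size={size_ratio:.0%}")
```

On this benchmark the 128-dimensional prefix keeps roughly 95.6% of the full-size NDCG@1 while using one sixth of the storage per vector.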
## Training Details
### Training Dataset
#### rozetka_positive_pairs
* Dataset: rozetka_positive_pairs
* Size: 53,499,287 training samples
* Columns: `query` and `text`
* Approximate statistics based on the first 1000 samples:
| | query | text |
|:-----|:-------|:-------|
| type | string | string |
* Samples:
| query | text |
|:------|:-----|
| `query: campingaz fold n cool classic 10l dark blue` | `passage: Термосумка Campingaz Fold'n Cool Classic 10L Dark Blue (4823082704729)` |
| `query: campingaz fold n cool classic 10l dark blue` | `passage: Термопродукція Campingaz Гарантія 14 днів Вид Термосумки Колір Синій з білим Режим роботи Охолодження Країна реєстрації бренда Франція Країна-виробник товару Китай Тип гарантійного талона Гарантія по чеку Можливість доставки Почтомати Доставка Premium Немає` |
| `query: campingaz fold n cool classic 10l dark blue` | `passage: Термосумка Campingaz Fold'n Cool Classic 10L Dark Blue (4823082704729)` |
* Loss: `sentence_transformers_training.model.matryoshka2d_loss.RZTKMatryoshka2dLoss` with these parameters:
```json
{
"loss": "RZTKMultipleNegativesRankingLoss",
"n_layers_per_step": 1,
"last_layer_weight": 1.0,
"prior_layers_weight": 1.0,
"kl_div_weight": 1.0,
"kl_temperature": 0.3,
"matryoshka_dims": [
768,
512,
256,
128
],
"matryoshka_weights": [
1,
1,
1,
1
],
"n_dims_per_step": 1
}
```
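`RZTKMatryoshka2dLoss` and `RZTKMultipleNegativesRankingLoss` are custom classes whose code is not part of this card. To give some intuition for what `matryoshka_dims` and `matryoshka_weights` control, the following is a simplified NumPy sketch of the general Matryoshka idea applied to an in-batch-negatives ranking loss. It is an illustration under stated assumptions, not the actual training loss: it ignores the layer-wise ("2D") part and the KL-divergence term, evaluates every dimension instead of sampling `n_dims_per_step` of them, and uses a hypothetical `scale` value.

```python
import numpy as np

def in_batch_ranking_loss(queries, passages, scale=20.0):
    """Cross-entropy where passage i is the positive for query i; the rest are negatives."""
    scores = scale * (queries @ passages.T)               # (batch, batch) similarity logits
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def matryoshka_loss(queries, passages, dims=(768, 512, 256, 128), weights=(1, 1, 1, 1)):
    """Weighted sum of the ranking loss over truncated, re-normalized embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        q = l2_normalize(queries[:, :dim])
        p = l2_normalize(passages[:, :dim])
        total += weight * in_batch_ranking_loss(q, p)
    return total

# Toy batch: passages are noisy copies of their queries, so positives score highest.
rng = np.random.default_rng(0)
queries = rng.normal(size=(8, 768))
passages = queries + 0.1 * rng.normal(size=(8, 768))
loss = matryoshka_loss(queries, passages)
print(f"loss = {loss:.4f}")
```

Because every prefix length contributes its own loss term, the leading components of the embedding are pushed to be useful on their own, which is what allows truncated vectors to be evaluated separately.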
### Evaluation Dataset
#### rozetka_positive_pairs
* Dataset: rozetka_positive_pairs
* Size: 1,369,397 evaluation samples
* Columns: `query` and `text`
* Approximate statistics based on the first 1000 samples:
| | query | text |
|:-----|:-------|:-------|
| type | string | string |
* Samples:
| query | text |
|:------|:-----|
| `query: ab553446bu` | `passage: Акумулятори для мобільних телефонів` |
| `query: ab553446bu` | `passage: Аккумулятор AB553446BU для Samsung i320 1000 mAh (03649-25)` |
| `query: ab553446bu` | `passage: Аккумуляторы для мобильных телефонов` |
* Loss: `sentence_transformers_training.model.matryoshka2d_loss.RZTKMatryoshka2dLoss` with these parameters:
```json
{
"loss": "RZTKMultipleNegativesRankingLoss",
"n_layers_per_step": 1,
"last_layer_weight": 1.0,
"prior_layers_weight": 1.0,
"kl_div_weight": 1.0,
"kl_temperature": 0.3,
"matryoshka_dims": [
768,
512,
256,
128
],
"matryoshka_weights": [
1,
1,
1,
1
],
"n_dims_per_step": 1
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 88
- `per_device_eval_batch_size`: 88
- `learning_rate`: 2e-05
- `num_train_epochs`: 1.0
- `warmup_ratio`: 0.1
- `bf16`: True
- `bf16_full_eval`: True
- `tf32`: True
- `dataloader_num_workers`: 4
- `load_best_model_at_end`: True
- `optim`: adafactor
- `push_to_hub`: True
- `hub_model_id`: yklymchuk-rztk/multilingual-e5-base-matryoshka2d-mnr-10
- `hub_private_repo`: True
- `prompts`: {'query': 'query: ', 'text': 'passage: '}
- `batch_sampler`: no_duplicates
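The `prompts` entry maps each dataset column to a text prefix: `query: ` for the `query` column and `passage: ` for the `text` column. Assuming the same convention should be followed at inference time, as the inputs in the usage example suggest, raw strings can be prefixed before encoding. The helper below is hypothetical and does not call the model.

```python
# Prefixes copied from the `prompts` hyperparameter.
PROMPTS = {"query": "query: ", "text": "passage: "}

def add_prompt(texts, column):
    """Prepend the training-time prompt for `column` unless it is already present."""
    prefix = PROMPTS[column]
    return [t if t.startswith(prefix) else prefix + t for t in texts]

queries = add_prompt(["redmi note 5 чехол"], "query")
passages = add_prompt(["passage: Меблеві ручки ДС", "Блузка жіноча Koton"], "text")
print(queries)
print(passages)
```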
#### All Hyperparameters