Update README.md
Some word and spelling changes
This model was fine-tuned via 2 phases:

### Phase 1:

In phase `1`, we curated a dataset, [silma-ai/silma-arabic-triplets-dataset-v1.0](https://huggingface.co/datasets/silma-ai/silma-arabic-triplets-dataset-v1.0), which contains more than `2.25M` records of (anchor, positive, and negative) Arabic/English samples. Only the first `600` samples were taken to be the `eval` dataset, while the rest were used for fine-tuning.

Phase `1` produces a fine-tuned `Matryoshka` model based on [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) with the following hyperparameters:

- …
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

**[training script](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_sts.py)**
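A `Matryoshka` model is trained so that leading slices of each embedding are themselves usable embeddings: you can keep only the first `dim` components and re-normalize, trading a little accuracy for smaller, faster vectors. A minimal sketch of that consumption pattern (the vector below is a stand-in, not real model output; dimensions are illustrative):

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize to unit length,
    the way Matryoshka embeddings are typically consumed downstream."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Stand-in for a 768-d sentence embedding from the fine-tuned model.
full = [((-1) ** i) * (1.0 + i % 7) for i in range(768)]
small = truncate_embedding(full, 256)  # cheaper 256-d version

print(len(small))                           # 256
print(round(sum(x * x for x in small), 6))  # 1.0 (unit length again)
```

The same trick works at several nested sizes (e.g. 768, 512, 256, …), which is exactly what the Matryoshka training objective optimizes for.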
### Phase 2:

In phase `2`, we curated a dataset, [silma-ai/silma-arabic-english-sts-dataset-v1.0](https://huggingface.co/datasets/silma-ai/silma-arabic-english-sts-dataset-v1.0), which contains more than `30k` records of (sentence1, sentence2, and similarity-score) Arabic/English samples. Only the first `100` samples were taken to be the `eval` dataset, while the rest were used for fine-tuning.

Phase `2` produces a fine-tuned `STS` model based on the model from phase `1`, with the following hyperparameters:

- `eval_strategy`: steps
- `per_device_train_batch_size`: 250
- …
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

**[training script](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/sts/training_stsbenchmark_continue_training.py)**
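The (sentence1, sentence2, similarity-score) format is the standard regression setup for STS fine-tuning: the cosine similarity between the two sentence embeddings is pushed toward the labelled score. A minimal sketch of the comparison itself, with toy vectors standing in for the model's embeddings:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for the embeddings of sentence1 and sentence2.
emb1 = [0.2, 0.7, 0.1]
emb2 = [0.25, 0.6, 0.05]

predicted = cosine_similarity(emb1, emb2)
print(round(predicted, 3))  # 0.992 — near-paraphrases score close to 1.0
```

During phase-`2` training, this predicted value is regressed toward each record's labelled similarity score, so at inference time the model's cosine similarities track human similarity judgements.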
</details>