Update README.md
Some word and spelling changes
This model was fine-tuned via 2 phases:

### Phase 1:

In phase `1`, we curated a dataset, [silma-ai/silma-arabic-triplets-dataset-v1.0](https://huggingface.co/datasets/silma-ai/silma-arabic-triplets-dataset-v1.0), which contains more than `2.25M` records of (anchor, positive, and negative) Arabic/English samples. Only the first `600` samples were taken to be the `eval` dataset, while the rest were used for fine-tuning.

Phase `1` produces a fine-tuned `Matryoshka` model based on [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) with the following hyperparameters:

- …
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

**[training script](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/matryoshka/matryoshka_sts.py)**
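A `Matryoshka` model is trained so that leading slices of each embedding are themselves usable embeddings: you can keep only the first `dim` components and re-normalize, trading a little accuracy for smaller, faster vectors. A minimal sketch of that consumption pattern (the vector below is a stand-in, not real model output; dimensions are illustrative):

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize to unit length,
    the way Matryoshka embeddings are typically consumed downstream."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Stand-in for a 768-d sentence embedding from the fine-tuned model.
full = [((-1) ** i) * (1.0 + i % 7) for i in range(768)]
small = truncate_embedding(full, 256)  # cheaper 256-d version

print(len(small))                           # 256
print(round(sum(x * x for x in small), 6))  # 1.0 (unit length again)
```

The same trick works at several nested sizes (e.g. 768, 512, 256, …), which is exactly what the Matryoshka training objective optimizes for.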
### Phase 2:

In phase `2`, we curated a dataset, [silma-ai/silma-arabic-english-sts-dataset-v1.0](https://huggingface.co/datasets/silma-ai/silma-arabic-english-sts-dataset-v1.0), which contains more than `30k` records of (sentence1, sentence2, and similarity-score) Arabic/English samples. Only the first `100` samples were taken to be the `eval` dataset, while the rest were used for fine-tuning.

Phase `2` produces a fine-tuned `STS` model based on the model from phase `1`, with the following hyperparameters:

- `eval_strategy`: steps
- `per_device_train_batch_size`: 250
- …
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

**[training script](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/sts/training_stsbenchmark_continue_training.py)**
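The (sentence1, sentence2, similarity-score) format is the standard regression setup for STS fine-tuning: the cosine similarity between the two sentence embeddings is pushed toward the labelled score. A minimal sketch of the comparison itself, with toy vectors standing in for the model's embeddings:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for the embeddings of sentence1 and sentence2.
emb1 = [0.2, 0.7, 0.1]
emb2 = [0.25, 0.6, 0.05]

predicted = cosine_similarity(emb1, emb2)
print(round(predicted, 3))  # 0.992 — near-paraphrases score close to 1.0
```

During phase-`2` training, this predicted value is regressed toward each record's labelled similarity score, so at inference time the model's cosine similarities track human similarity judgements.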
</details>