mjdousti committed
Commit da88011
1 Parent(s): beb0d76

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -29,7 +29,7 @@ co2_eq_emissions:
  # <span style="font-variant:small-caps;">PersianMind</span>
 
  <span style="font-variant:small-caps;">PersianMind</span> is a cross-lingual Persian-English large language model.
- The model achieves state-of-the-art results on Persian subset of the [Belebele](https://github.com/facebookresearch/belebele) benchmark
+ The model achieves state-of-the-art results on Persian subset of the [<span style="font-variant:small-caps;">Belebele</span>](https://github.com/facebookresearch/belebele) benchmark
  and the [ParsiNLU multiple-choice QA](https://github.com/persiannlp/parsinlu) task.
  It also attains performance comparable to GPT-3.5-turbo in a Persian reading comprehension task.
 
@@ -111,15 +111,15 @@ model = LlamaForCausalLM.from_pretrained(
 
  ### Evaluating Quantized Models
 
- | Model | Belebele (Persian) | Fa→En Translation | En→Fa Translation | Model Size | Tokens/sec |
- | :----------------------------------------------------------------- | :----------------: | :---------------: | :---------------: | :--------: | :--------: |
- | <span style="font-variant:small-caps;">PersianMind</span> (`bf16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
- | <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
- | <span style="font-variant:small-caps;">PersianMind</span> (`INT4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |
+ | Model | <span style="font-variant:small-caps;">Belebele</span> (Persian) | Fa→En Translation | En→Fa Translation | Model Size | Tokens/sec |
+ | :----------------------------------------------------------------- | :--------------------------------------------------------------: | :---------------: | :---------------: | :--------: | :--------: |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`bf16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`INT4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |
 
  We evaluated quantized models in various tasks against the original model.
  Specifically, we evaluated all models using the reading comprehension multiple-choice
- question-answering benchmark of [Belebele](https://github.com/facebookresearch/belebele) (Persian subset) and reported the accuracy of each model.
+ question-answering benchmark of [<span style="font-variant:small-caps;">Belebele</span>](https://github.com/facebookresearch/belebele) (Persian subset) and reported the accuracy of each model.
  Additionally, we evaluated our models for Persian-to-English and English-to-Persian translation tasks.
  For this, we utilized the Persian-English subset of the [Flores-200](https://github.com/facebookresearch/flores/tree/main/flores200) dataset and
  reported our results using the <span style="font-variant:small-caps;">Comet</span> metric.
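The "Evaluating Quantized Models" table added in this commit compares the `bf16` checkpoint with `INT8` and `INT4` quantized variants. As a rough sketch of how such variants can be produced at load time with `transformers` and `bitsandbytes` (the README's own loading call is `LlamaForCausalLM.from_pretrained`, but the checkpoint path and the exact arguments below are placeholders, not taken from this commit):

```python
import torch
from transformers import AutoTokenizer, BitsAndBytesConfig, LlamaForCausalLM

MODEL_PATH = "path/to/PersianMind"  # placeholder: substitute the actual checkpoint

# bf16 baseline (roughly the 13.7G weight footprint reported above).
model_bf16 = LlamaForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto"
)

# 8-bit quantization via bitsandbytes (roughly 7.2G above).
model_int8 = LlamaForCausalLM.from_pretrained(
    MODEL_PATH,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# 4-bit quantization (roughly 3.9G above).
# In practice you would load only one of these variants at a time.
model_int4 = LlamaForCausalLM.from_pretrained(
    MODEL_PATH,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
```

Quantization trades a small amount of accuracy for memory, and for the `INT8` variant also throughput, which is consistent with the Tokens/sec column in the table above.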
 
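For the translation rows, the text states that Persian-to-English and English-to-Persian outputs on the Flores-200 subset were scored with <span style="font-variant:small-caps;">Comet</span>. Below is a minimal sketch of corpus-level scoring with the `unbabel-comet` package; the checkpoint name `Unbabel/wmt22-comet-da` and the sample data are assumptions for illustration, since the commit does not name the exact <span style="font-variant:small-caps;">Comet</span> model used.

```python
from comet import download_model, load_from_checkpoint  # pip install unbabel-comet

# Download and load a COMET checkpoint (assumed choice; the README does not name one).
ckpt_path = download_model("Unbabel/wmt22-comet-da")
comet_model = load_from_checkpoint(ckpt_path)

# Each item pairs a source sentence, the system translation, and a reference
# translation, e.g. drawn from the Persian-English subset of Flores-200.
samples = [
    {
        "src": "This is a sample English source sentence.",
        "mt": "hypothetical model translation into Persian",
        "ref": "hypothetical reference Persian translation",
    },
]

scores = comet_model.predict(samples, batch_size=8, gpus=1)
print(scores.system_score)  # corpus-level COMET score, as reported in the table
```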