Pedram Rostami committed

Commit 06ff501
Parent(s): e004aaf

Update README.md

README.md CHANGED
@@ -92,7 +92,7 @@ model = LlamaForCausalLM.from_pretrained(
 )
 ```
 
-Alternatively, you can quantize the model in 4-bit (`
+Alternatively, you can quantize the model in 4-bit (`NormalFloat4`) with the following code.
 
 ```python
 from transformers import BitsAndBytesConfig
@@ -115,7 +115,7 @@ model = LlamaForCausalLM.from_pretrained(
 | :----------------------------------------------------------------: | :--------------------------------------------------------------: | :------------------------------------------------------------------------: | :------------------------------------------------------------------------: | :--------: | :--------: |
 | <span style="font-variant:small-caps;">PersianMind</span> (`BF16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
 | <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
-| <span style="font-variant:small-caps;">PersianMind</span> (`
+| <span style="font-variant:small-caps;">PersianMind</span> (`NormalFloat4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |
 
 We evaluated quantized models in various tasks against the original model.
 Specifically, we evaluated all models using the reading comprehension multiple-choice