haoyang-amd committed
Commit 4e3d382 · verified · 1 Parent(s): 243be4a

Update README.md

Files changed (1)
  1. README.md +8 -11
README.md CHANGED
@@ -1,12 +1,9 @@
  ---
  base_model:
- - microsoft/Phi-3-mini-4k-instruct
- license: other
- license_name: llama2
- license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
+ - microsoft/Phi-3-mini-128k-instruct
+ license: mit
  ---
-
- # Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
+ # Phi-3-mini-128k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
  - ## Introduction
  This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from the Pile dataset.
  - ## Quantization Strategy
@@ -17,7 +14,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
  1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
  2. Run the quantization script in the example folder using the following command line:
  ```sh
- export MODEL_DIR=microsoft/Phi-3-mini-4k-instruct    # or a local model checkpoint folder
+ export MODEL_DIR=microsoft/Phi-3-mini-128k-instruct  # or a local model checkpoint folder
  # single GPU
  python3 quantize_quark.py --model_dir $MODEL_DIR \
  --data_type bfloat16 \
@@ -26,7 +23,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
  --quant_algo awq \
  --dataset pileval_for_awq_benchmark \
  --seq_len 512 \
- --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
+ --output_dir Phi-3-mini-128k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
  --model_export quark_safetensors
  # cpu
  python3 quantize_quark.py --model_dir $MODEL_DIR \
@@ -36,7 +33,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
  --quant_algo awq \
  --dataset pileval_for_awq_benchmark \
  --seq_len 512 \
- --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
+ --output_dir Phi-3-mini-128k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
  --model_export quark_safetensors \
  --device cpu
  ```
@@ -58,9 +55,9 @@ The quantization evaluation results are conducted in pseudo-quantization mode, w
  <tr>
  <td>Perplexity-wikitext2
  </td>
- <td>6.0164
+ <td>6.2359
  </td>
- <td>6.5575
+ <td>6.8193
  </td>
  </tr>
  </table>
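
As background for the "Weight-INT4-Per-Group" scheme named in the card above, the sketch below shows what per-group INT4 weight pseudo-quantization (quantize, then dequantize back to bfloat16) looks like; this is the mode the evaluation table refers to. It is illustrative only: the group size of 128, the symmetric integer range, and the helper name are assumptions, and AWQ's activation-aware scale search that Quark runs beforehand is omitted.

```python
# Illustrative sketch only: group size, symmetric range, and function name are
# assumptions, not values taken from this commit. AWQ's activation-aware
# per-channel scaling, which Quark applies before rounding, is not shown.
import torch

def pseudo_quantize_int4_per_group(weight: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Fake-quantize a 2-D weight to INT4, one scale per `group_size` inputs."""
    out_features, in_features = weight.shape
    assert in_features % group_size == 0, "in_features must divide evenly into groups"

    # Each contiguous run of `group_size` input weights shares one scale.
    w = weight.reshape(out_features, in_features // group_size, group_size)

    # Symmetric per-group scale: map the largest magnitude in the group to 7.
    scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0

    # Round to integers in [-8, 7], then dequantize back to bfloat16.
    q = torch.clamp(torch.round(w / scale), -8, 7)
    return (q * scale).reshape(out_features, in_features).to(torch.bfloat16)

if __name__ == "__main__":
    w = torch.randn(16, 256)
    w_dq = pseudo_quantize_int4_per_group(w, group_size=128)
    print("max abs quantization error:", (w - w_dq.float()).abs().max().item())
```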
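
The 6.2359 / 6.8193 figures in the table diff are wikitext2 perplexities. A minimal sketch of how such a score is commonly measured with Hugging Face transformers follows; the 2048-token window, non-overlapping chunking, and evaluation of the bfloat16 base checkpoint (rather than Quark's pseudo-quantized model) are assumptions, not the card's exact settings.

```python
# Rough wikitext2 perplexity measurement; window size and chunking are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-128k-instruct"  # base model; swap in a quantized checkpoint as needed
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).to(device).eval()

# Concatenate the wikitext2 test split into one long token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

window = 2048  # assumed evaluation context length
nlls = []
with torch.no_grad():
    for start in range(0, input_ids.size(1), window):
        chunk = input_ids[:, start:start + window].to(device)
        if chunk.size(1) < 2:  # need at least one next-token target
            continue
        # With labels == input_ids, the model returns the mean next-token NLL of the chunk.
        loss = model(chunk, labels=chunk).loss
        nlls.append(loss.float() * chunk.size(1))

print("wikitext2 perplexity:", torch.exp(torch.stack(nlls).sum() / input_ids.size(1)).item())
```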