haoyang-amd committed
Commit 4e3d382 · verified · 1 Parent(s): 243be4a

Update README.md

Files changed (1)
  1. README.md +8 -11
README.md CHANGED
@@ -1,12 +1,9 @@
  ---
  base_model:
- - microsoft/Phi-3-mini-4k-instruct
- license: other
- license_name: llama2
- license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
+ - microsoft/Phi-3-mini-128k-instruct
+ license: mit
  ---
-
- # Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
+ # Phi-3-mini-128k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
  - ## Introduction
  This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from the Pile dataset.
  - ## Quantization Strategy
@@ -17,7 +14,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
  1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
  2. Run the quantization script in the example folder using the following command line:
  ```sh
- export MODEL_DIR=microsoft/Phi-3-mini-4k-instruct    # or a local model checkpoint folder
+ export MODEL_DIR=microsoft/Phi-3-mini-128k-instruct  # or a local model checkpoint folder
  # single GPU
  python3 quantize_quark.py --model_dir $MODEL_DIR \
  --data_type bfloat16 \
@@ -26,7 +23,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
  --quant_algo awq \
  --dataset pileval_for_awq_benchmark \
  --seq_len 512 \
- --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
+ --output_dir Phi-3-mini-128k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
  --model_export quark_safetensors
  # cpu
  python3 quantize_quark.py --model_dir $MODEL_DIR \
@@ -36,7 +33,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
  --quant_algo awq \
  --dataset pileval_for_awq_benchmark \
  --seq_len 512 \
- --output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
+ --output_dir Phi-3-mini-128k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
  --model_export quark_safetensors \
  --device cpu
  ```
@@ -58,9 +55,9 @@ The quantization evaluation results are conducted in pseudo-quantization mode, w
  <tr>
  <td>Perplexity-wikitext2
  </td>
- <td>6.0164
+ <td>6.2359
  </td>
- <td>6.5575
+ <td>6.8193
  </td>
  </tr>
  </table>
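
As background for the "Weight-INT4-Per-Group" scheme named in the card above, the sketch below shows what per-group INT4 weight pseudo-quantization (quantize, then dequantize back to bfloat16) looks like; this is the mode the evaluation table refers to. It is illustrative only: the group size of 128, the symmetric integer range, and the helper name are assumptions, and AWQ's activation-aware scale search that Quark runs beforehand is omitted.

```python
# Illustrative sketch only: group size, symmetric range, and function name are
# assumptions, not values taken from this commit. AWQ's activation-aware
# per-channel scaling, which Quark applies before rounding, is not shown.
import torch

def pseudo_quantize_int4_per_group(weight: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Fake-quantize a 2-D weight to INT4, one scale per `group_size` inputs."""
    out_features, in_features = weight.shape
    assert in_features % group_size == 0, "in_features must divide evenly into groups"

    # Each contiguous run of `group_size` input weights shares one scale.
    w = weight.reshape(out_features, in_features // group_size, group_size)

    # Symmetric per-group scale: map the largest magnitude in the group to 7.
    scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0

    # Round to integers in [-8, 7], then dequantize back to bfloat16.
    q = torch.clamp(torch.round(w / scale), -8, 7)
    return (q * scale).reshape(out_features, in_features).to(torch.bfloat16)

if __name__ == "__main__":
    w = torch.randn(16, 256)
    w_dq = pseudo_quantize_int4_per_group(w, group_size=128)
    print("max abs quantization error:", (w - w_dq.float()).abs().max().item())
```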
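
The 6.2359 / 6.8193 figures in the table diff are wikitext2 perplexities. A minimal sketch of how such a score is commonly measured with Hugging Face transformers follows; the 2048-token window, non-overlapping chunking, and evaluation of the bfloat16 base checkpoint (rather than Quark's pseudo-quantized model) are assumptions, not the card's exact settings.

```python
# Rough wikitext2 perplexity measurement; window size and chunking are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-128k-instruct"  # base model; swap in a quantized checkpoint as needed
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).to(device).eval()

# Concatenate the wikitext2 test split into one long token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

window = 2048  # assumed evaluation context length
nlls = []
with torch.no_grad():
    for start in range(0, input_ids.size(1), window):
        chunk = input_ids[:, start:start + window].to(device)
        if chunk.size(1) < 2:  # need at least one next-token target
            continue
        # With labels == input_ids, the model returns the mean next-token NLL of the chunk.
        loss = model(chunk, labels=chunk).loss
        nlls.append(loss.float() * chunk.size(1))

print("wikitext2 perplexity:", torch.exp(torch.stack(nlls).sum() / input_ids.size(1)).item())
```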