Update README.md
README.md CHANGED
@@ -1,12 +1,9 @@
 ---
 base_model:
-- microsoft/Phi-3-mini-4k-instruct
-license:
-license_name: llama2
-license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2/LICENSE
+- microsoft/Phi-3-mini-128k-instruct
+license: mit
 ---
-
-# Phi-3-mini-4k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
+# Phi-3-mini-128k-instruct-Weight-INT4-Per-Group-AWQ-Bfloat16
 ## Introduction
 This model was created by applying [Quark](https://quark.docs.amd.com/latest/index.html) with calibration samples from the Pile dataset.
 ## Quantization Strategy
@@ -17,7 +14,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2
 1. [Download and install Quark](https://quark.docs.amd.com/latest/install.html)
 2. Run the quantization script in the example folder using the following command line:
 ```sh
-export MODEL_DIR=microsoft/Phi-3-mini-4k-instruct  # or a local model checkpoint folder
+export MODEL_DIR=microsoft/Phi-3-mini-128k-instruct  # or a local model checkpoint folder
 # single GPU
 python3 quantize_quark.py --model_dir $MODEL_DIR \
 --data_type bfloat16 \
@@ -26,7 +23,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2
 --quant_algo awq \
 --dataset pileval_for_awq_benchmark \
 --seq_len 512 \
---output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
+--output_dir Phi-3-mini-128k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
 --model_export quark_safetensors
 # cpu
 python3 quantize_quark.py --model_dir $MODEL_DIR \
@@ -36,7 +33,7 @@ license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama2
 --quant_algo awq \
 --dataset pileval_for_awq_benchmark \
 --seq_len 512 \
---output_dir Phi-3-mini-4k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
+--output_dir Phi-3-mini-128k-instruct-W_Int4-Per_Group-AWQ-BFloat16 \
 --model_export quark_safetensors \
 --device cpu
 ```
@@ -58,9 +55,9 @@ The quantization evaluation results are conducted in pseudo-quantization mode
 <tr>
 <td>Perplexity-wikitext2
 </td>
-<td>6.
+<td>6.2359
 </td>
-<td>6.
+<td>6.8193
 </td>
 </tr>
 </table>
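The updated card's title names the scheme precisely: weights quantized to INT4 with one scale per group, activations kept in bfloat16. As a rough picture of what the AWQ flow above produces, here is a minimal fake-quantization sketch in plain PyTorch; the group size of 128 and the symmetric rounding are illustrative assumptions, not values read from this model's exported Quark config.

```python
import torch

def pseudo_quantize_int4_per_group(w: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Fake-quantize a 2-D weight [out_features, in_features] to per-group INT4.

    Assumed scheme: symmetric, one scale per group of `group_size` weights along
    the input dimension; group_size=128 is a common default, not Quark's setting.
    """
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must divide into whole groups"
    groups = w.reshape(out_features, in_features // group_size, group_size)
    # One scale per group: map each group's max magnitude onto the INT4 range [-8, 7].
    scale = groups.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(groups / scale), min=-8, max=7)  # integer weights
    # "Pseudo-quantization": dequantize immediately and keep computing in floating point.
    return (q * scale).reshape(out_features, in_features)

w = torch.randn(32, 256)
print("mean abs rounding error:", (w - pseudo_quantize_int4_per_group(w)).abs().mean().item())
```

AWQ itself adds an activation-aware scaling step, chosen from calibration samples such as the Pile data used here, before this rounding so that salient weight channels lose less precision; the sketch shows only the quantize-dequantize round trip that "pseudo-quantization mode" refers to.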
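The two new table values (6.2359 and 6.8193, presumably the bfloat16 baseline and the INT4-AWQ model) are wikitext2 perplexities from Quark's evaluation flow, measured in pseudo-quantization mode per the hunk context. For an independent sanity check, a standard wikitext2 perplexity loop with Hugging Face transformers and datasets looks roughly like the sketch below; the model id, the 4096-token window, and the non-overlapping stride are assumptions for illustration, not Quark's exact harness settings.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: score the unquantized base model; the quantized column would use
# the pseudo-quantized weights instead.
model_id = "microsoft/Phi-3-mini-128k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# Concatenate the wikitext2 test split into one token stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids

window, nlls, n_targets = 4096, [], 0  # assumption: non-overlapping 4k windows
for begin in range(0, ids.size(1), window):
    chunk = ids[:, begin : begin + window].to(model.device)
    if chunk.size(1) < 2:  # need at least one shifted target token
        break
    with torch.no_grad():
        # labels == input_ids: the model shifts internally and returns the mean NLL.
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss * (chunk.size(1) - 1))  # undo the mean over target tokens
    n_targets += chunk.size(1) - 1

print(f"wikitext2 perplexity: {torch.exp(torch.stack(nlls).sum() / n_targets).item():.4f}")
```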