Update README.md
README.md (changed)
````diff
@@ -156,17 +156,7 @@ lm_eval --model hf --model_args pretrained=microsoft/Phi-4-mini-instruct --tasks
 
 ## int8 dynamic activation and int4 weight quantization (8da4w)
 ```
-import lm_eval
-from lm_eval import evaluator
-from lm_eval.utils import (
-    make_table,
-)
-
-lm_eval_model = lm_eval.models.huggingface.HFLM(pretrained=quantized_model, batch_size=64)
-results = evaluator.simple_evaluate(
-    lm_eval_model, tasks=["hellaswag"], device="cuda:0", batch_size="auto"
-)
-print(make_table(results))
+lm_eval --model hf --model_args pretrained=pytorch/Phi-4-mini-instruct-8da4w --tasks hellaswag --device cuda:0 --batch_size 64
 ```
 
 | Benchmark | | |
````
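The first hunk swaps the inline Python evaluation snippet for a one-line lm_eval CLI call against the published pytorch/Phi-4-mini-instruct-8da4w checkpoint. For readers who still want the programmatic route, here is a minimal sketch of the equivalent harness call, assuming torchao is installed so the quantized weights deserialize; the dtype and device choices are assumptions, not part of the diff:

```python
# Sketch of a programmatic equivalent of the lm_eval CLI call above.
# Assumes torchao is installed (needed to load the 8da4w weights);
# the dtype/device choices here are illustrative, not from the README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

import lm_eval
from lm_eval import evaluator
from lm_eval.utils import make_table

model_id = "pytorch/Phi-4-mini-instruct-8da4w"
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cuda:0", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Wrap the in-memory model for the harness and score HellaSwag.
lm = lm_eval.models.huggingface.HFLM(
    pretrained=quantized_model, tokenizer=tokenizer, batch_size=64
)
results = evaluator.simple_evaluate(lm, tasks=["hellaswag"])
print(make_table(results))
```

Both routes run the same HellaSwag task; the CLI form is simply less code to maintain in a README.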
````diff
@@ -199,7 +189,7 @@ Once ExecuTorch is [set-up](https://pytorch.org/executorch/main/getting-started.
 We first convert the quantized checkpoint to one ExecuTorch's LLM export script expects by renaming some of the checkpoint keys.
 The following script does this for you.
 ```
-python -m executorch.examples.models.phi_4_mini.convert_weights
+python -m executorch.examples.models.phi_4_mini.convert_weights pytorch_model.bin phi4-mini-8da4w-converted.bin
 ```
 
 Once the checkpoint is converted, we can export to ExecuTorch's PTE format with the XNNPACK delegate.
````
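The second hunk completes the convert_weights invocation with what appear to be the input checkpoint and the output path as positional arguments. The conversion itself is described as renaming checkpoint keys; purely as an illustration of that idea (the real mapping lives in executorch.examples.models.phi_4_mini.convert_weights, and the prefix stripped below is hypothetical):

```python
# Illustration only: the actual key mapping is defined by
# executorch.examples.models.phi_4_mini.convert_weights.
import torch

def rename_checkpoint_keys(src_path: str, dst_path: str) -> None:
    state_dict = torch.load(src_path, map_location="cpu")
    # Hypothetical rename: strip a leading "model." prefix so the keys
    # match what ExecuTorch's LLM export script expects.
    converted = {k.removeprefix("model."): v for k, v in state_dict.items()}
    torch.save(converted, dst_path)

rename_checkpoint_keys("pytorch_model.bin", "phi4-mini-8da4w-converted.bin")
```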