Update README.md
README.md
CHANGED
@@ -20,8 +20,8 @@ pipeline_tag: text-generation
 [Phi4-mini](https://huggingface.co/microsoft/Phi-4-mini-instruct) is quantized by the PyTorch team using [torchao](https://huggingface.co/docs/transformers/main/en/quantization/torchao) with 8-bit embeddings and 8-bit dynamic activations with 4-bit weight linears (8da4w).
 The model is suitable for mobile deployment with [ExecuTorch](https://github.com/pytorch/executorch).
 
-
-(The provided pte file is exported with the default max_seq_length/max_context_length of 128; if you wish to change this, re-export the model following the instructions in [Exporting to ExecuTorch](#exporting-to-executorch).)
+We provide the [quantized pte](https://huggingface.co/pytorch/Phi-4-mini-instruct-8da4w/blob/main/phi4-mini-8da4w.pte) for direct use in ExecuTorch.
+(The provided pte file is exported with the default max_seq_length/max_context_length of 128; if you wish to change this, re-export the quantized model following the instructions in [Exporting to ExecuTorch](#exporting-to-executorch).)
 
 # Running in a mobile app
 The [pte file](https://huggingface.co/pytorch/Phi-4-mini-instruct-8da4w/blob/main/phi4-mini-8da4w.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this in iOS.
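
Reviewer note: the README links to the pte file through a `/blob/` page URL, which opens the Hub file viewer rather than the raw file. The following is a minimal sketch, not part of this change, of fetching the file for local use. It assumes the Hub's usual convention that swapping `/blob/` for `/resolve/` yields a direct-download link; the helper names `blob_to_resolve_url` and `download_pte` are hypothetical and introduced only for illustration.

```python
import shutil
import urllib.request
from pathlib import Path

# Page link quoted in the README change.
PTE_BLOB_URL = (
    "https://huggingface.co/pytorch/Phi-4-mini-instruct-8da4w"
    "/blob/main/phi4-mini-8da4w.pte"
)


def blob_to_resolve_url(blob_url: str) -> str:
    """Turn a Hub '/blob/' page link into a '/resolve/' download link.

    Assumption: the Hub serves raw files under '/resolve/<revision>/<path>'.
    """
    if "/blob/" not in blob_url:
        raise ValueError(f"not a Hub blob URL: {blob_url}")
    # Replace only the first occurrence so the file path itself is untouched.
    return blob_url.replace("/blob/", "/resolve/", 1)


def download_pte(dest_dir: str = ".") -> Path:
    """Download the pte file into dest_dir and return its path (needs network)."""
    url = blob_to_resolve_url(PTE_BLOB_URL)
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)  # stream to disk, the file is large
    return dest
```

Calling `download_pte()` would save `phi4-mini-8da4w.pte` in the current directory. The downloaded file still carries the default max_seq_length/max_context_length of 128 mentioned in the README text.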