Commit ee0bcdd by davidshtian (1 parent: d880cf5)

Update README.md

README.md CHANGED
@@ -18,6 +18,22 @@ Please refer to the 🤗 `optimum-neuron` [documentation](https://huggingface.co
 
 Note: To compile the mistralai/Mistral-7B-Instruct-v0.2 on Inf2, you need to update the model config sliding_window (either file or model variable) from null to default 4096.
 
+## Usage with 🤗 `TGI`
+
+```shell
+export HF_TOKEN="hf_xxx"
+
+docker run -d -p 8080:80 \
+-v $(pwd)/data:/data \
+--device=/dev/neuron0 \
+-e HF_TOKEN=${HF_TOKEN} \
+public.ecr.aws/shtian/neuronx-tgi:latest \
+--model-id davidshtian/Mistral-7B-Instruct-v0.2-neuron-1x2048-2-cores-2.18 \
+--max-batch-size 1 \
+--max-input-length 16 \
+--max-total-tokens 32
+```
+
 ## Usage with 🤗 `optimum-neuron`
 
 ```python
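
Review note: the added TGI section starts the server but does not show a request against it. A minimal client sketch follows, assuming the container from the `docker run` command above is reachable at `localhost:8080` and serves TGI's standard `/generate` route. The helper names `build_generate_request` and `generate` are hypothetical and not part of this commit. Because the server is launched with `--max-input-length 16` and `--max-total-tokens 32`, the sketch defaults to 16 new tokens, which stays within the total budget even for a prompt that uses the full input length.

```python
import json
import urllib.request

# Assumption (not in the commit): the container is running locally and the
# host port 8080 from `-p 8080:80` maps to TGI's HTTP server.
TGI_URL = "http://localhost:8080/generate"

# Values taken from the flags in the README's docker run command.
MAX_INPUT_LENGTH = 16
MAX_TOTAL_TOKENS = 32


def build_generate_request(
    prompt,
    max_new_tokens=MAX_TOTAL_TOKENS - MAX_INPUT_LENGTH,
    url=TGI_URL,
):
    """Build (but do not send) a POST request for the /generate route."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a live server)."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["generated_text"]
```

With the container up, `print(generate("What is AWS Inferentia?"))` should print the completion. Prompts longer than 16 tokens are expected to be rejected by the server given the flags above.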