Update README.md
README.md

````diff
@@ -94,7 +94,8 @@ print(tokenizer.decode(response, skip_special_tokens=True))
 ## Inference Server Hosting Example
 ```bash
 pip install vllm
-vllm serve scb10x/llama3.1-typhoon2-70b-instruct
+vllm serve scb10x/llama3.1-typhoon2-70b-instruct --tensor-parallel-size 2
+# using at least 2 80GB gpu for hosting 70b model
 # see more information at https://docs.vllm.ai/
 ```
````
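Once `vllm serve` is up, it exposes an OpenAI-compatible HTTP API. A minimal sketch of a chat-completions request, assuming the server runs locally on vLLM's default port 8000 (the prompt text is illustrative):

```shell
# Query the hosted model through vLLM's OpenAI-compatible endpoint.
# Assumption: the serve command above is running locally on the default port 8000.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "scb10x/llama3.1-typhoon2-70b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```

The `model` field must match the checkpoint name passed to `vllm serve`.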