---
base_model: mistralai/Mistral-7B-v0.3
extra_gated_description: If you want to learn more about how we process your personal data, please read our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
---

# Model Card for Mistral-7B-Instruct-v0.3 for inf2.xlarge

The Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.3, compiled for AWS Neuron.

These are the shapes currently cached (don't ask why 8196, it was a typo):

```
python -m vllm.entrypoints.openai.api_server --model ./ --max-model-len 8196 --device neuron --tensor-parallel-size 2 --max-num-seqs 2
python -m vllm.entrypoints.openai.api_server --model ./ --max-model-len 8196 --device neuron --tensor-parallel-size 2 --max-num-seqs 4
python -m vllm.entrypoints.openai.api_server --model ./ --max-model-len 10240 --device neuron --tensor-parallel-size 2 --max-num-seqs 4
```
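Because Neuron compiles fixed shapes, a server launch only reuses the compilation cache when its `--max-model-len` and `--max-num-seqs` match one of the cached configurations above; anything else triggers a fresh (slow) compile. A minimal sketch of that lookup, assuming the three shapes listed here (the helper itself is illustrative, not part of vLLM):

```python
# Cached (max_model_len, max_num_seqs) pairs from the launch commands above.
CACHED_SHAPES = [(8196, 2), (8196, 4), (10240, 4)]

def pick_cached_shape(context_len: int, concurrent_seqs: int):
    """Return the smallest cached (max_model_len, max_num_seqs) pair that can
    serve the requested context length and concurrency, or None if no cached
    shape fits (meaning a new Neuron compilation would be needed)."""
    candidates = [
        (mml, mns)
        for mml, mns in CACHED_SHAPES
        if context_len <= mml and concurrent_seqs <= mns
    ]
    return min(candidates) if candidates else None

# e.g. an 8K-context request with 2 concurrent sequences fits the first shape:
print(pick_cached_shape(8000, 2))   # (8196, 2)
print(pick_cached_shape(9000, 4))   # (10240, 4)
print(pick_cached_shape(11000, 2))  # None -> would recompile
```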