Update README.md
README.md CHANGED
@@ -75,8 +75,8 @@ python3 convert.py ../LLaMA-2-7B-32K
 ./quantize ../LLaMA-2-7B-32K/ggml-model-f16.gguf \
   ../LLaMA-2-7B-32K/LLaMA-2-7B-32K-Q4_0.gguf Q4_0
 ```
-11. run any quantizations you need and stop the container
-    will remain available on your host computer
+11. run any quantizations you need and stop the container when finished (you may even delete it as the generated files
+    will remain available on your host computer)
 
 You are now free to move the quantization results to where you need them and run inferences with context
 lengths up to 32K (depending on the amount of memory you will have available - long contexts need an awful
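The quantize-then-infer flow the updated steps describe can be sketched as below. This is a dry run that only echoes the commands, since the `quantize` and `main` binaries exist only inside a llama.cpp build; the `run` wrapper, the 32768 context value, and the prompt are illustrative, while the paths follow the README above.

```shell
# Dry-run sketch (assumed llama.cpp build). `run` only echoes each command;
# change it to `run() { "$@"; }` inside the container to execute for real.
run() { echo "$@"; }

MODEL_DIR=../LLaMA-2-7B-32K

# Step from the diff above: quantize the f16 GGUF down to Q4_0.
run ./quantize "$MODEL_DIR/ggml-model-f16.gguf" \
    "$MODEL_DIR/LLaMA-2-7B-32K-Q4_0.gguf" Q4_0

# Then run an inference with a long context window (-c); a full 32K
# context needs a large amount of memory, so size it to your machine.
run ./main -m "$MODEL_DIR/LLaMA-2-7B-32K-Q4_0.gguf" -c 32768 -n 128 -p "Hello"
```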