Boris committed on
Commit 31399dc
1 parent: 6657b41

update Readme q4 to Q4

Files changed (1): README.md +4 -4
README.md CHANGED
@@ -189,7 +189,7 @@ The following clients/libraries will automatically download models for you, prov
 
 ### In `text-generation-webui`
 
-Under Download Model, you can enter the model repo: TheBloke/Llama-2-70B-chat-GGUF and below it, a specific filename to download, such as: llama-2-70b-chat.q4_K_M.gguf.
+Under Download Model, you can enter the model repo: TheBloke/Llama-2-70B-chat-GGUF and below it, a specific filename to download, such as: llama-2-70b-chat.Q4_K_M.gguf.
 
 Then click Download.
 
@@ -204,7 +204,7 @@ pip3 install huggingface-hub>=0.17.1
 Then you can download any individual model file to the current directory, at high speed, with a command like this:
 
 ```shell
-huggingface-cli download TheBloke/Llama-2-70B-chat-GGUF llama-2-70b-chat.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+huggingface-cli download TheBloke/Llama-2-70B-chat-GGUF llama-2-70b-chat.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 <details>
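
For reference, the same single-file download can be done from Python with the `huggingface_hub` API rather than the CLI. A minimal sketch, assuming `huggingface-hub>=0.17.1` is installed per the README's earlier step, with the filename using the uppercase `Q4` spelling this commit introduces:

```python
# Minimal sketch: Python equivalent of the huggingface-cli download above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-70B-chat-GGUF",
    filename="llama-2-70b-chat.Q4_K_M.gguf",  # uppercase Q4, as renamed
    local_dir=".",
    local_dir_use_symlinks=False,
)
print(model_path)  # local path to the downloaded .gguf file
```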
@@ -227,7 +227,7 @@ pip3 install hf_transfer
 And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:
 
 ```shell
-HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/Llama-2-70B-chat-GGUF llama-2-70b-chat.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/Llama-2-70B-chat-GGUF llama-2-70b-chat.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 Windows CLI users: Use `set HF_HUB_ENABLE_HF_TRANSFER=1` before running the download command.
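
The accelerated-transfer toggle works from Python too. A minimal sketch, assuming the environment variable has to be set before `huggingface_hub` is imported, since the library reads it at import time:

```python
# Minimal sketch: enable hf_transfer, then download as before.
import os

os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"  # must precede the import below

from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="TheBloke/Llama-2-70B-chat-GGUF",
    filename="llama-2-70b-chat.Q4_K_M.gguf",
    local_dir=".",
    local_dir_use_symlinks=False,
)
```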
@@ -240,7 +240,7 @@ Windows CLI users: Use `set HF_HUB_ENABLE_HF_TRANSFER=1` before running
 Make sure you are using `llama.cpp` from commit [d0cee0d36d5be95a0d9088b674dbb27354107221](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 32 -m llama-2-70b-chat.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n<</SYS>>\n{prompt}[/INST]"
+./main -ngl 32 -m llama-2-70b-chat.Q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n<</SYS>>\n{prompt}[/INST]"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
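
For reference, the renamed file loads the same way through the `llama-cpp-python` bindings (`pip3 install llama-cpp-python`). A minimal sketch mirroring the CLI flags above; the short prompt here is illustrative, not the README's full system prompt:

```python
# Minimal sketch: llama-cpp-python equivalent of the ./main invocation above.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-70b-chat.Q4_K_M.gguf",
    n_ctx=4096,       # -c 4096
    n_gpu_layers=32,  # -ngl 32; use 0 if you don't have GPU acceleration
)
out = llm(
    "[INST] <<SYS>>\nYou are a helpful assistant.\n<</SYS>>\nTell me about AI. [/INST]",
    max_tokens=256,
    temperature=0.7,     # --temp 0.7
    repeat_penalty=1.1,  # --repeat_penalty 1.1
)
print(out["choices"][0]["text"])
```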
 