newsletter committed
Commit 3121ede
1 Parent(s): 84054a1

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +29 -15
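
The commit message above says the README was pushed with `huggingface_hub`. A minimal sketch of how such an upload is typically done from Python (the local path and commit message are assumptions; the repo id is the one shown on this page):

```python
from huggingface_hub import HfApi

# Sketch only: push a local README.md to the model repo on the Hub.
# Uses the token stored by `huggingface-cli login`.
api = HfApi()
api.upload_file(
    path_or_fileobj="README.md",                       # local file (assumed path)
    path_in_repo="README.md",                          # destination inside the repo
    repo_id="newsletter/zephyr-7b-beta-Q6_K-GGUF",     # repo shown on this page
    repo_type="model",
    commit_message="Upload README.md with huggingface_hub",  # illustrative
)
```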
README.md CHANGED
@@ -1,15 +1,16 @@
  ---
  language:
  - en
  license: mit
  tags:
  - generated_from_trainer
  - llama-cpp
  - gguf-my-repo
- base_model: mistralai/Mistral-7B-v0.1
- datasets:
- - HuggingFaceH4/ultrachat_200k
- - HuggingFaceH4/ultrafeedback_binarized
  widget:
  - example_title: Pirate!
    messages:
@@ -24,7 +25,6 @@ widget:
  treat. Once he's gone, ye can clean up yer lawn and enjoy the peace and quiet
  once again. But beware, me hearty, for there may be more llamas where that one
  came from! Arr!
- pipeline_tag: text-generation
  model-index:
  - name: zephyr-7b-beta
    results:
@@ -173,29 +173,43 @@ model-index:
  # newsletter/zephyr-7b-beta-Q6_K-GGUF
  This model was converted to GGUF format from [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) for more details on the model.
- ## Use with llama.cpp

- Install llama.cpp through brew.

  ```bash
- brew install ggerganov/ggerganov/llama.cpp
  ```
  Invoke the llama.cpp server or the CLI.

- CLI:
-
  ```bash
- llama-cli --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --model zephyr-7b-beta.Q6_K.gguf -p "The meaning to life and the universe is"
  ```

- Server:
-
  ```bash
- llama-server --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --model zephyr-7b-beta.Q6_K.gguf -c 2048
  ```

  Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo.

  ```
- git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make && ./main -m zephyr-7b-beta.Q6_K.gguf -n 128
  ```
 
  ---
+ base_model: HuggingFaceH4/zephyr-7b-beta
+ datasets:
+ - HuggingFaceH4/ultrachat_200k
+ - HuggingFaceH4/ultrafeedback_binarized
  language:
  - en
  license: mit
+ pipeline_tag: text-generation
  tags:
  - generated_from_trainer
  - llama-cpp
  - gguf-my-repo
  widget:
  - example_title: Pirate!
    messages:
 
  treat. Once he's gone, ye can clean up yer lawn and enjoy the peace and quiet
  once again. But beware, me hearty, for there may be more llamas where that one
  came from! Arr!
  model-index:
  - name: zephyr-7b-beta
    results:
 
  # newsletter/zephyr-7b-beta-Q6_K-GGUF
  This model was converted to GGUF format from [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) for more details on the model.

+ ## Use with llama.cpp
+ Install llama.cpp through brew (works on Mac and Linux).

  ```bash
+ brew install llama.cpp
+
  ```
  Invoke the llama.cpp server or the CLI.

+ ### CLI:
  ```bash
+ llama-cli --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -p "The meaning to life and the universe is"
  ```

+ ### Server:
  ```bash
+ llama-server --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -c 2048
  ```

  Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo.

+ Step 1: Clone llama.cpp from GitHub.
+ ```
+ git clone https://github.com/ggerganov/llama.cpp
+ ```
+
+ Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
+ ```
+ cd llama.cpp && LLAMA_CURL=1 make
+ ```
+
+ Step 3: Run inference through the main binary.
+ ```
+ ./llama-cli --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -p "The meaning to life and the universe is"
+ ```
+ or
  ```
+ ./llama-server --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -c 2048
  ```
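
If the local llama.cpp build was compiled without CURL support, the `--hf-repo`/`--hf-file` flags used above cannot fetch the model over HTTP (hence the `LLAMA_CURL=1` build flag in Step 2). A minimal sketch of a workaround, assuming the `huggingface_hub` Python package is installed: download the GGUF once and pass the local path to `llama-cli` (the filename matches the commands above).

```python
import subprocess
from huggingface_hub import hf_hub_download

# Fetch the quantized model file from the Hub into the local cache.
model_path = hf_hub_download(
    repo_id="newsletter/zephyr-7b-beta-Q6_K-GGUF",
    filename="zephyr-7b-beta-q6_k.gguf",
)

# Run llama.cpp inference against the downloaded file.
subprocess.run(
    ["llama-cli", "-m", model_path, "-p", "The meaning to life and the universe is"],
    check=True,
)
```

The same local path can be passed to `llama-server -m` to serve the model instead of running the one-off CLI prompt.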