newsletter committed
Commit 3121ede (1 parent: 84054a1)
Upload README.md with huggingface_hub

README.md CHANGED
@@ -1,15 +1,16 @@
 ---
 language:
 - en
 license: mit
 tags:
 - generated_from_trainer
 - llama-cpp
 - gguf-my-repo
-base_model: mistralai/Mistral-7B-v0.1
-datasets:
-- HuggingFaceH4/ultrachat_200k
-- HuggingFaceH4/ultrafeedback_binarized
 widget:
 - example_title: Pirate!
   messages:
@@ -24,7 +25,6 @@ widget:
     treat. Once he's gone, ye can clean up yer lawn and enjoy the peace and quiet
     once again. But beware, me hearty, for there may be more llamas where that one
     came from! Arr!
-pipeline_tag: text-generation
 model-index:
 - name: zephyr-7b-beta
   results:
@@ -173,29 +173,43 @@ model-index:
 # newsletter/zephyr-7b-beta-Q6_K-GGUF
 This model was converted to GGUF format from [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) for more details on the model.
-## Use with llama.cpp

-

 ```bash
-brew install
 ```
 Invoke the llama.cpp server or the CLI.

-CLI:
-
 ```bash
-llama-cli --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --
 ```

-Server:
-
 ```bash
-llama-server --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --
 ```

 Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.

 ```
-
 ```

---
base_model: HuggingFaceH4/zephyr-7b-beta
datasets:
- HuggingFaceH4/ultrachat_200k
- HuggingFaceH4/ultrafeedback_binarized
language:
- en
license: mit
pipeline_tag: text-generation
tags:
- generated_from_trainer
- llama-cpp
- gguf-my-repo
widget:
- example_title: Pirate!
  messages:

    treat. Once he's gone, ye can clean up yer lawn and enjoy the peace and quiet
    once again. But beware, me hearty, for there may be more llamas where that one
    came from! Arr!

model-index:
- name: zephyr-7b-beta
  results:

# newsletter/zephyr-7b-beta-Q6_K-GGUF
This model was converted to GGUF format from [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) for more details on the model.
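The quantized weights ship as a single GGUF file in this repo, so they can also be fetched directly with the Hugging Face CLI. A minimal sketch, assuming `huggingface_hub` is installed and using the same filename referenced by the llama.cpp commands below:

```bash
# Sketch: download the Q6_K GGUF from this repo into the current directory.
# Assumes huggingface-cli is available (pip install -U huggingface_hub).
huggingface-cli download newsletter/zephyr-7b-beta-Q6_K-GGUF \
  zephyr-7b-beta-q6_k.gguf --local-dir .
```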

## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux).

```bash
brew install llama.cpp
```
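If the formula installs cleanly, the `llama-cli` and `llama-server` binaries should land on your PATH. A quick sanity check, assuming a recent Homebrew formula that ships both binaries:

```bash
# Sketch: confirm the Homebrew-installed binaries are on PATH and report a version.
which llama-cli llama-server
llama-cli --version
```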
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -p "The meaning to life and the universe is"
```
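The `--hf-repo`/`--hf-file` pair tells `llama-cli` to fetch the GGUF from the Hub on first use. A minimal local variant, assuming the file was already downloaded (for example with the `huggingface-cli` command above):

```bash
# Sketch: point llama-cli at a local GGUF instead of resolving the Hub repo.
# -m selects the model file, -p sets the prompt, -n caps the generated tokens.
llama-cli -m ./zephyr-7b-beta-q6_k.gguf \
  -p "The meaning to life and the universe is" \
  -n 128
```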

### Server:
```bash
llama-server --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -c 2048
```
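Once the server is running it exposes an HTTP API. A minimal sketch of a completion request, assuming the default `localhost:8080` address (use `--port` when starting the server to change it):

```bash
# Sketch: request a completion from the running llama-server (default port 8080).
# The /completion endpoint takes a JSON body with a prompt and n_predict.
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The meaning to life and the universe is", "n_predict": 64}'
```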

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
```
cd llama.cpp && LLAMA_CURL=1 make
```
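As an illustration of those hardware-specific flags, a CUDA-enabled build on a Linux machine with an NVIDIA GPU could look like the sketch below; adjust the flags for your own toolchain:

```bash
# Sketch: the Step 2 build with the CUDA backend enabled and a parallel make.
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make -j"$(nproc)"
```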

Step 3: Run inference through the main binary.
```
./llama-cli --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -p "The meaning to life and the universe is"
```
or
```
./llama-server --hf-repo newsletter/zephyr-7b-beta-Q6_K-GGUF --hf-file zephyr-7b-beta-q6_k.gguf -c 2048
```
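The `-p` examples above do plain text completion. zephyr-7b-beta is an instruction-tuned chat model, so chat-style use works best when the prompt follows its chat template. A sketch, assuming the `<|system|>` / `<|user|>` / `<|assistant|>` layout documented on the original model card linked above:

```bash
# Sketch: chat-formatted prompt; the special-token layout is assumed from the
# upstream zephyr-7b-beta card, so verify it there before relying on it.
# -e makes llama-cli expand the \n escapes inside the prompt string.
./llama-cli -m zephyr-7b-beta-q6_k.gguf -n 256 -e \
  -p "<|system|>\nYou are a friendly pirate chatbot.</s>\n<|user|>\nHow do I get a llama off my lawn?</s>\n<|assistant|>\n"
```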