Update README.md
README.md
CHANGED
@@ -22,19 +22,7 @@ prompt_template: '[INST] <<SYS>>
 quantized_by: Elkhayyat
 ---

-
-<!-- 200823 -->
-<div style="width: auto; margin-left: auto; margin-right: auto">
-</div>
-<div style="display: flex; justify-content: space-between; width: 100%;">
-<div style="display: flex; flex-direction: column; align-items: flex-start;">
-</div>
-<div style="display: flex; flex-direction: column; align-items: flex-end;">
-</div>
-</div>
-<div style="text-align:center; margin-top: 0em; margin-bottom: 0em"><p style="margin-top: 0.25em; margin-bottom: 0em;">TheBloke's LLM work is generously supported by a grant from <a href="https://a16z.com">andreessen horowitz (a16z)</a></p></div>
-<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
-<!-- header end -->
+

 # CodeLlama 7B - GGUF
 - Model creator: [Meta](https://huggingface.co/meta-llama)
@@ -65,20 +53,14 @@ Here is an incomplete list of clients and libraries that are known to support GG

 <!-- README_GGUF.md-about-gguf end -->
 <!-- repositories-available start -->
-## Repositories available

-* [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/CodeLlama-7B-AWQ)
-* [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/CodeLlama-7B-GPTQ)
-* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/CodeLlama-7B-GGUF)
-* [Meta's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/codellama/CodeLlama-7b-hf)
-<!-- repositories-available end -->
-
-<!-- prompt-template start -->
 ## Prompt template: None

 ```
-{
-
+[{"role": "system", "content": '''You are Doctor Sakenah, a virtual AI doctor known for your friendly and approachable demeanor,
+combined with a deep expertise in the medical field. You're here to provide professional, empathetic, and knowledgeable advice on health-related inquiries.
+You'll also provide differential diagnosis. If you're unsure about any information, don't share false information.'''},
+{"role": "user", "content": f" Symptoms:{inputs}"}]
 ```

 <!-- prompt-template end -->
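The template added above is a Python-style message list rather than a plain prompt string, and it refers to an `inputs` variable that the card does not define. As a minimal sketch only, assuming the `[INST] <<SYS>>` format named in the card's `prompt_template` front matter is the intended final form, the snippet below builds that message list and flattens it into a single prompt string; the `build_messages` and `to_llama2_prompt` helpers are illustrative names, not part of the repository.

```python
# Illustrative sketch: build the chat messages from the template above and flatten
# them into a Llama-2-chat style "[INST] <<SYS>> ... [/INST]" string. The helper
# names and the exact flattening rules are assumptions, not from the repository.

SYSTEM_PROMPT = (
    "You are Doctor Sakenah, a virtual AI doctor known for your friendly and approachable demeanor, "
    "combined with a deep expertise in the medical field. You're here to provide professional, empathetic, "
    "and knowledgeable advice on health-related inquiries. You'll also provide differential diagnosis. "
    "If you're unsure about any information, don't share false information."
)

def build_messages(inputs: str) -> list[dict]:
    # Mirrors the message list shown in the prompt-template block above.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f" Symptoms:{inputs}"},
    ]

def to_llama2_prompt(messages: list[dict]) -> str:
    # Assumed Llama-2-chat formatting, matching the '[INST] <<SYS>>' template in the front matter.
    system = messages[0]["content"]
    user = messages[1]["content"]
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

print(to_llama2_prompt(build_messages("persistent dry cough and mild fever")))
```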
@@ -125,7 +107,7 @@ The following clients/libraries will automatically download models for you, prov

 ### In `text-generation-webui`

-Under Download Model, you can enter the model repo:
+Under Download Model, you can enter the model repo: Elkhayyat17/llama2-Med-gguf and below it, a specific filename to download, such as: ggml-model-Q4_K_M.gguf.

 Then click Download.

@@ -140,7 +122,7 @@ pip3 install huggingface-hub>=0.17.1
 Then you can download any individual model file to the current directory, at high speed, with a command like this:

 ```shell
-huggingface-cli download
+huggingface-cli download Elkhayyat17/llama2-Med-gguf ggml-model-Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```

 <details>
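The same single-file download can be scripted from Python with `huggingface_hub.hf_hub_download`. This is only a sketch of the equivalent call, reusing the repo and filename from the command above; the keyword arguments mirror the CLI flags.

```python
# Sketch of the Python equivalent of the huggingface-cli command above.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="Elkhayyat17/llama2-Med-gguf",
    filename="ggml-model-Q4_K_M.gguf",
    local_dir=".",                  # download into the current directory
    local_dir_use_symlinks=False,   # copy the real file instead of a cache symlink
)
print(local_path)
```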
@@ -149,7 +131,7 @@ huggingface-cli download TheBloke/CodeLlama-7B-GGUF codellama-7b.q4_K_M.gguf --l
 You can also download multiple files at once with a pattern:

 ```shell
-huggingface-cli download
+huggingface-cli download Elkhayyat17/llama2-Med-gguf --local-dir . --local-dir-use-symlinks False --include='*Q4_K*gguf'
 ```

 For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli).
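A pattern download like the one above can likewise be expressed in Python with `snapshot_download` and its `allow_patterns` argument. The sketch below mirrors the `--include='*Q4_K*gguf'` filter and is offered as an illustration, not as something the repository itself provides.

```python
# Sketch: Python counterpart of the pattern-based CLI download above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Elkhayyat17/llama2-Med-gguf",
    allow_patterns=["*Q4_K*gguf"],   # same filter as --include='*Q4_K*gguf'
    local_dir=".",
    local_dir_use_symlinks=False,
)
```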
@@ -163,7 +145,7 @@ pip3 install hf_transfer
 And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:

 ```shell
-HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download
+HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download Elkhayyat17/llama2-Med-gguf ggml-model-Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```

 Windows CLI users: Use `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before running the download command.
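If downloads are driven from Python rather than the CLI, the same acceleration applies as long as the environment variable is set before `huggingface_hub` is imported. A minimal sketch, assuming `hf_transfer` is already installed:

```python
# Minimal sketch: enable hf_transfer for Python downloads by setting the
# environment variable before importing huggingface_hub.
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="Elkhayyat17/llama2-Med-gguf",
    filename="ggml-model-Q4_K_M.gguf",
    local_dir=".",
)
```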
@@ -216,7 +198,7 @@ CT_METAL=1 pip install ctransformers>=0.2.24 --no-binary ctransformers
 from ctransformers import AutoModelForCausalLM

 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
-llm = AutoModelForCausalLM.from_pretrained("
+llm = AutoModelForCausalLM.from_pretrained("Elkhayyat17/llama2-Med-gguf", model_file="ggml-model-Q4_K_M.gguf", model_type="llama", gpu_layers=50)

 print(llm("AI is going to"))
 ```
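For an interactive assistant it is often nicer to stream tokens as they are produced. The snippet below is a small sketch that reuses the repo and model file from the diff and passes `stream=True` to the ctransformers call; the prompt text is just a placeholder.

```python
# Sketch: streaming generation with ctransformers, reusing the repo/file names from the diff.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "Elkhayyat17/llama2-Med-gguf",
    model_file="ggml-model-Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=0,  # set to e.g. 50 if your ctransformers build has GPU support
)

# stream=True yields text pieces as they are generated instead of one final string.
for piece in llm("AI is going to", stream=True):
    print(piece, end="", flush=True)
print()
```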