Update README.md
README.md
CHANGED
@@ -22,19 +22,7 @@ prompt_template: '[INST] <<SYS>>
 quantized_by: Elkhayyat
 ---

-
-<!-- 200823 -->
-<div style="width: auto; margin-left: auto; margin-right: auto">
-</div>
-<div style="display: flex; justify-content: space-between; width: 100%;">
-<div style="display: flex; flex-direction: column; align-items: flex-start;">
-</div>
-<div style="display: flex; flex-direction: column; align-items: flex-end;">
-</div>
-</div>
-<div style="text-align:center; margin-top: 0em; margin-bottom: 0em"><p style="margin-top: 0.25em; margin-bottom: 0em;">TheBloke's LLM work is generously supported by a grant from <a href="https://a16z.com">andreessen horowitz (a16z)</a></p></div>
-<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
-<!-- header end -->
+

 # CodeLlama 7B - GGUF
 - Model creator: [Meta](https://huggingface.co/meta-llama)
@@ -65,20 +53,14 @@ Here is an incomplete list of clients and libraries that are known to support GG

 <!-- README_GGUF.md-about-gguf end -->
 <!-- repositories-available start -->
-## Repositories available

-* [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/CodeLlama-7B-AWQ)
-* [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/CodeLlama-7B-GPTQ)
-* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/CodeLlama-7B-GGUF)
-* [Meta's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/codellama/CodeLlama-7b-hf)
-<!-- repositories-available end -->
-
-<!-- prompt-template start -->
 ## Prompt template: None

 ```
-{
-
+[{"role": "system", "content": '''You are Doctor Sakenah, a virtual AI doctor known for your friendly and approachable demeanor,
+combined with a deep expertise in the medical field. You're here to provide professional, empathetic, and knowledgeable advice on health-related inquiries.
+You'll also provide differential diagnosis. If you're unsure about any information, don't share false information.'''},
+{"role": "user", "content": f" Symptoms:{inputs}"}]
 ```

 <!-- prompt-template end -->
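The template added above is a Python-style message list rather than a plain prompt string, and it refers to an `inputs` variable that the card does not define. As a minimal sketch only, assuming the `[INST] <<SYS>>` format named in the card's `prompt_template` front matter is the intended final form, the snippet below builds that message list and flattens it into a single prompt string; the `build_messages` and `to_llama2_prompt` helpers are illustrative names, not part of the repository.

```python
# Illustrative sketch: build the chat messages from the template above and flatten
# them into a Llama-2-chat style "[INST] <<SYS>> ... [/INST]" string. The helper
# names and the exact flattening rules are assumptions, not from the repository.

SYSTEM_PROMPT = (
    "You are Doctor Sakenah, a virtual AI doctor known for your friendly and approachable demeanor, "
    "combined with a deep expertise in the medical field. You're here to provide professional, empathetic, "
    "and knowledgeable advice on health-related inquiries. You'll also provide differential diagnosis. "
    "If you're unsure about any information, don't share false information."
)

def build_messages(inputs: str) -> list[dict]:
    # Mirrors the message list shown in the prompt-template block above.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f" Symptoms:{inputs}"},
    ]

def to_llama2_prompt(messages: list[dict]) -> str:
    # Assumed Llama-2-chat formatting, matching the '[INST] <<SYS>>' template in the front matter.
    system = messages[0]["content"]
    user = messages[1]["content"]
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

print(to_llama2_prompt(build_messages("persistent dry cough and mild fever")))
```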
@@ -125,7 +107,7 @@ The following clients/libraries will automatically download models for you, prov

 ### In `text-generation-webui`

-Under Download Model, you can enter the model repo:
+Under Download Model, you can enter the model repo: Elkhayyat17/llama2-Med-gguf and below it, a specific filename to download, such as: ggml-model-Q4_K_M.gguf.

 Then click Download.

@@ -140,7 +122,7 @@ pip3 install huggingface-hub>=0.17.1
 Then you can download any individual model file to the current directory, at high speed, with a command like this:

 ```shell
-huggingface-cli download
+huggingface-cli download Elkhayyat17/llama2-Med-gguf ggml-model-Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```

 <details>
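The same single-file download can be scripted from Python with `huggingface_hub.hf_hub_download`. This is only a sketch of the equivalent call, reusing the repo and filename from the command above; the keyword arguments mirror the CLI flags.

```python
# Sketch of the Python equivalent of the huggingface-cli command above.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="Elkhayyat17/llama2-Med-gguf",
    filename="ggml-model-Q4_K_M.gguf",
    local_dir=".",                  # download into the current directory
    local_dir_use_symlinks=False,   # copy the real file instead of a cache symlink
)
print(local_path)
```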
@@ -149,7 +131,7 @@ huggingface-cli download TheBloke/CodeLlama-7B-GGUF codellama-7b.q4_K_M.gguf --l
 You can also download multiple files at once with a pattern:

 ```shell
-huggingface-cli download
+huggingface-cli download Elkhayyat17/llama2-Med-gguf --local-dir . --local-dir-use-symlinks False --include='*Q4_K*gguf'
 ```

 For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli).
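A pattern download like the one above can likewise be expressed in Python with `snapshot_download` and its `allow_patterns` argument. The sketch below mirrors the `--include='*Q4_K*gguf'` filter and is offered as an illustration, not as something the repository itself provides.

```python
# Sketch: Python counterpart of the pattern-based CLI download above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Elkhayyat17/llama2-Med-gguf",
    allow_patterns=["*Q4_K*gguf"],   # same filter as --include='*Q4_K*gguf'
    local_dir=".",
    local_dir_use_symlinks=False,
)
```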
@@ -163,7 +145,7 @@ pip3 install hf_transfer
 And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:

 ```shell
-HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download
+HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download Elkhayyat17/llama2-Med-gguf ggml-model-Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```

 Windows CLI users: Use `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before running the download command.
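If downloads are driven from Python rather than the CLI, the same acceleration applies as long as the environment variable is set before `huggingface_hub` is imported. A minimal sketch, assuming `hf_transfer` is already installed:

```python
# Minimal sketch: enable hf_transfer for Python downloads by setting the
# environment variable before importing huggingface_hub.
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="Elkhayyat17/llama2-Med-gguf",
    filename="ggml-model-Q4_K_M.gguf",
    local_dir=".",
)
```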
@@ -216,7 +198,7 @@ CT_METAL=1 pip install ctransformers>=0.2.24 --no-binary ctransformers
 from ctransformers import AutoModelForCausalLM

 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
-llm = AutoModelForCausalLM.from_pretrained("
+llm = AutoModelForCausalLM.from_pretrained("Elkhayyat17/llama2-Med-gguf", model_file="ggml-model-Q4_K_M.gguf", model_type="llama", gpu_layers=50)

 print(llm("AI is going to"))
 ```
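For an interactive assistant it is often nicer to stream tokens as they are produced. The snippet below is a small sketch that reuses the repo and model file from the diff and passes `stream=True` to the ctransformers call; the prompt text is just a placeholder.

```python
# Sketch: streaming generation with ctransformers, reusing the repo/file names from the diff.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "Elkhayyat17/llama2-Med-gguf",
    model_file="ggml-model-Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=0,  # set to e.g. 50 if your ctransformers build has GPU support
)

# stream=True yields text pieces as they are generated instead of one final string.
for piece in llm("AI is going to", stream=True):
    print(piece, end="", flush=True)
print()
```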