ruslanmv committed (verified)
Commit d58b76d · 1 Parent(s): 0905571

Delete README.md

Files changed (1):
  README.md +0 -46
README.md DELETED

---
tags:
- gguf
- llama.cpp
- quantized
- ruslanmv/Medical-Llama3-v2
license: apache-2.0
---

# ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF

This model was converted to GGUF format from [`ruslanmv/Medical-Llama3-v2`](https://huggingface.co/ruslanmv/Medical-Llama3-v2) using llama.cpp via the [Convert Model to GGUF](https://huggingface.co/spaces/ruslanmv/convert_to_gguf) Space.
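
For reference, a minimal sketch of how a conversion like this can be reproduced locally with llama.cpp's own tooling; the paths and output filenames below are assumptions, and the conversion Space may use different defaults:

```bash
# Fetch llama.cpp and its conversion dependencies
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# Download the original HF checkpoint (directory name is illustrative)
huggingface-cli download ruslanmv/Medical-Llama3-v2 --local-dir Medical-Llama3-v2

# Convert the checkpoint to an FP16 GGUF file
python llama.cpp/convert_hf_to_gguf.py Medical-Llama3-v2 \
  --outfile medical-llama3-v2-f16.gguf --outtype f16

# Quantize to Q4_K_M (llama-quantize comes from building llama.cpp)
./llama.cpp/build/bin/llama-quantize \
  medical-llama3-v2-f16.gguf medical-llama3-v2-q4_k_m.gguf Q4_K_M
```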

**Key Features:**

* Quantized to Q4_K_M for reduced file size (GGUF format)
* Optimized for use with llama.cpp
* Compatible with llama-server for efficient serving

Refer to the [original model card](https://huggingface.co/ruslanmv/Medical-Llama3-v2) for more details on the base model.

## Usage with llama.cpp

**1. Install llama.cpp:**

```bash
brew install llama.cpp # For macOS/Linux
```
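
If Homebrew is not available, llama.cpp can also be built from source; the steps below follow the project's standard CMake build (see its README for platform specifics):

```bash
# Build llama.cpp from source (requires git and CMake)
git clone https://github.com/ggerganov/llama.cpp
cmake -S llama.cpp -B llama.cpp/build
cmake --build llama.cpp/build --config Release
# Binaries such as llama-cli and llama-server land in llama.cpp/build/bin
```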

**2. Run Inference:**

**CLI:**

```bash
llama-cli --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf -p "Your prompt here"
```
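
For example, a single question with a capped response length; the flags are standard llama-cli options, and the prompt is illustrative only:

```bash
llama-cli --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF \
  --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf \
  -c 2048 -n 256 \
  -p "What are common symptoms of iron-deficiency anemia?"
```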

**Server:**

```bash
llama-server --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf -c 2048
```
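
Once running, llama-server exposes llama.cpp's OpenAI-compatible HTTP API (port 8080 by default); a minimal request might look like the following, with the message and token limit as illustrative values:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What are common causes of persistent fatigue?"}
    ],
    "max_tokens": 256
  }'
```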

For more advanced usage, refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp).