dwikitheduck committed on
Commit
d88f81c
1 Parent(s): 9b74df7

Create/update model card (README.md)

Files changed (1)
  1. README.md +46 -0
README.md ADDED
---
tags:
- gguf
- llama.cpp
- quantized
- dwikitheduck/gen-try1
license: apache-2.0
---

# dwikitheduck/gen-try1-Q4_K_M-GGUF

This model was converted to GGUF format from [`dwikitheduck/gen-try1`](https://huggingface.co/dwikitheduck/gen-try1) using llama.cpp via
[Convert Model to GGUF](https://github.com/ruslanmv/convert-model-to-GGUF).

**Key Features:**

* Quantized for reduced file size (GGUF format)
* Optimized for use with llama.cpp
* Compatible with llama-server for efficient serving

Refer to the [original model card](https://huggingface.co/dwikitheduck/gen-try1) for more details on the base model.

## Usage with llama.cpp

**1. Install llama.cpp:**

```bash
brew install llama.cpp # For macOS/Linux
```
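If Homebrew is not available, llama.cpp can also be built from source. The following is a sketch based on the upstream CMake build instructions (requires git and CMake installed):

```shell
# Clone and build llama.cpp from source (alternative to Homebrew).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# The resulting binaries (llama-cli, llama-server, ...) are placed in build/bin/.
```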

**2. Run Inference:**

**CLI:**

```bash
llama-cli --hf-repo dwikitheduck/gen-try1-Q4_K_M-GGUF --hf-file gen-try1-q4_k_m.gguf -p "Your prompt here"
```

**Server:**

```bash
llama-server --hf-repo dwikitheduck/gen-try1-Q4_K_M-GGUF --hf-file gen-try1-q4_k_m.gguf -c 2048
```
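Once llama-server is running, it exposes an HTTP API (on port 8080 by default). A minimal sketch of querying its `/completion` endpoint with curl; the prompt text and `n_predict` value here are placeholders:

```shell
# Send a completion request to a running llama-server instance
# (assumes the default host/port; adjust if you passed --host/--port).
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your prompt here", "n_predict": 64}'
```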

For more advanced usage, refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp).