dwikitheduck committed
Commit d88f81c • Parent(s): 9b74df7
Create/update model card (README.md)

README.md (added)
---
tags:
- gguf
- llama.cpp
- quantized
- dwikitheduck/gen-try1
license: apache-2.0
---

# dwikitheduck/gen-try1-Q4_K_M-GGUF

This model was converted to GGUF format from [`dwikitheduck/gen-try1`](https://huggingface.co/dwikitheduck/gen-try1) using llama.cpp via
[Convert Model to GGUF](https://github.com/ruslanmv/convert-model-to-GGUF).
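If you want to reproduce or update the conversion yourself, the sketch below shows a typical manual workflow using llama.cpp's own tooling. It is illustrative only, not the exact procedure used for this repo: the `convert_hf_to_gguf.py` script, the `llama-quantize` binary, and the intermediate file names are assumptions based on recent llama.cpp releases and may differ in your checkout.

```bash
# Illustrative sketch (assumed tooling from recent llama.cpp releases)
# 1. Download the original Hugging Face model
huggingface-cli download dwikitheduck/gen-try1 --local-dir gen-try1

# 2. Convert the HF checkpoint to a full-precision GGUF file
python convert_hf_to_gguf.py ./gen-try1 --outfile gen-try1-f16.gguf --outtype f16

# 3. Quantize to Q4_K_M
llama-quantize gen-try1-f16.gguf gen-try1-q4_k_m.gguf Q4_K_M
```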
**Key Features:**

* Quantized for reduced file size (GGUF format)
* Optimized for use with llama.cpp
* Compatible with llama-server for efficient serving

Refer to the [original model card](https://huggingface.co/dwikitheduck/gen-try1) for more details on the base model.
## Usage with llama.cpp

**1. Install llama.cpp:**

```bash
brew install llama.cpp # For macOS/Linux
```
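If Homebrew is not available, llama.cpp can also be built from source with CMake; a minimal sketch of the standard build follows (see the llama.cpp README for platform-specific options such as GPU backends):

```bash
# Standard CMake build of llama.cpp (CPU-only defaults)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# Binaries such as llama-cli and llama-server are placed under build/bin
```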
**2. Run Inference:**

**CLI:**

```bash
llama-cli --hf-repo dwikitheduck/gen-try1-Q4_K_M-GGUF --hf-file gen-try1-q4_k_m.gguf -p "Your prompt here"
```
**Server:**

```bash
llama-server --hf-repo dwikitheduck/gen-try1-Q4_K_M-GGUF --hf-file gen-try1-q4_k_m.gguf -c 2048
```
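Once llama-server is running, it exposes an OpenAI-compatible HTTP API. A minimal request sketch, assuming the default address `http://localhost:8080` (adjust if you pass `--host`/`--port`):

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Your prompt here"}
    ]
  }'
```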
For more advanced usage, refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp).