---
base_model: RWKV/rwkv-6-world-3b-v2.1
library_name: gguf
license: apache-2.0
quantized_by: Lyte
tags:
  - text-generation
  - rwkv
  - rwkv-6
---

# RWKV-6-World-3B-v2.1-GGUF

This repo contains RWKV-6-World-3B-v2.1, newly re-quantized with the latest llama.cpp [b3771](https://github.com/ggerganov/llama.cpp/releases/tag/b3771).

# **Note:**

* The notebook used to convert this model is included; feel free to use it in Colab or Kaggle to quantize future models.

## How to run the model

* Get the latest llama.cpp:
```
git clone https://github.com/ggerganov/llama.cpp
```
* Download the GGUF file into a new `model` folder inside llama.cpp (choose your quant):
```
cd llama.cpp
mkdir model
git clone https://huggingface.co/Lyte/RWKV-6-World-3B-v2.1-GGUF
mv RWKV-6-World-3B-v2.1-GGUF/RWKV-6-World-3B-v2.1-GGUF-Q4_K_M.gguf model/
rm -r RWKV-6-World-3B-v2.1-GGUF
```
* On Windows, instead of git-cloning the repo, create the `model` folder inside the llama.cpp folder, then open "Files and versions" on this page and download the quant you want into that folder.
* Now run the model with the following command (note: `--temp` sets the sampling temperature; llama.cpp's `-t` flag sets the thread count instead):
```
./llama-cli -m ./model/RWKV-6-World-3B-v2.1-GGUF-Q4_K_M.gguf --in-suffix "Assistant:" --interactive-first -c 1024 --temp 0.7 --top-k 50 --top-p 0.95 -n 128 -p "Assistant: Hello, what can I help you with today?\nUser:" -r "User:"
```
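As an alternative to cloning the whole repo (handy on Windows, or when you only want a single quant), individual files can be fetched directly using Hugging Face's standard `resolve/main` URL pattern. This is a sketch, not part of the original instructions; the filename shown is the Q4_K_M quant from this repo, and the actual `curl` download is left commented out since it needs network access:

```shell
# Build the direct-download URL for one quant file.
# Hugging Face serves raw repo files at huggingface.co/<repo>/resolve/main/<file>.
REPO="Lyte/RWKV-6-World-3B-v2.1-GGUF"
FILE="RWKV-6-World-3B-v2.1-GGUF-Q4_K_M.gguf"
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"

# Create the destination folder expected by the run command above.
mkdir -p model
echo "$URL"

# To actually download (requires network; -L follows the CDN redirect):
# curl -L -o "model/${FILE}" "$URL"
```

After the download finishes, `./llama-cli -m ./model/${FILE} ...` works exactly as in the run command above.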