---
license: cc-by-sa-4.0
---

# StableLM-3B-4E1T #

* Model Creator: [Stability AI](https://huggingface.co/stabilityai)
* Original Model: [StableLM-3B-4E1T](https://huggingface.co/stabilityai/stablelm-3b-4e1t)

## Description ##

This repository contains the most relevant quantizations of Stability AI's [StableLM-3B-4E1T](https://huggingface.co/stabilityai/stablelm-3b-4e1t) model in GGUF format - ready to be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar applications.

## About StableLM-3B-4E1T ##

Stability AI claims: "_StableLM-3B-4E1T achieves state-of-the-art performance (September 2023) at the 3B parameter scale for open-source models and is competitive with many of the popular contemporary 7B models, even outperforming our most recent 7B StableLM-Base-Alpha-v2._"

According to them, "_The model is intended to be used as a foundational base model for application-specific fine-tuning. Developers must evaluate and fine-tune the model for safe performance in downstream applications._"

## Files ##

Right now, the following quantizations are available:

* [stablelm-3b-4e1t-Q3_K_M](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q3_K_M.bin)
* [stablelm-3b-4e1t-Q4_K_M](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q4_K_M.bin)
* [stablelm-3b-4e1t-Q5_K_M](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q5_K_M.bin)
* [stablelm-3b-4e1t-Q6_K](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q6_K.bin)
* [stablelm-3b-4e1t-Q8_K](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q8_K.bin)

(tell me if you need more)

These files are presented here with the written permission of Stability AI (although access to the model itself is still "gated").

## Usage Details ##

Any technical details can be found on the [original model card](https://huggingface.co/stabilityai/stablelm-3b-4e1t) and in a technical report on [StableLM-3B-4E1T](https://stability.wandb.io/stability-llm/stable-lm/reports/StableLM-3B-4E1T--VmlldzoyMjU4?accessToken=u3zujipenkx5g7rtcj9qojjgxpconyjktjkli2po09nffrffdhhchq045vp0wyfo). The most important ones for using this model are:

* the context length is 4096 tokens
* there does not seem to be a specific prompt structure - just provide the text you want to be completed (see the example at the end of this section)

### Text Completion with LLaMA.cpp ###

For simple inference, use a command similar to

```
./main -m stablelm-3b-4e1t-Q8_0.bin --temp 0 --top-k 4 --prompt "who was Joseph Weizenbaum?"
```

### Text Tokenization with LLaMA.cpp ###

To get a list of tokens, use a command similar to

```
./tokenization -m stablelm-3b-4e1t-Q8_0.bin --prompt "who was Joseph Weizenbaum?"
```

### Embeddings Calculation with LLaMA.cpp ###

Text embeddings are calculated with a command similar to

```
./embedding -m stablelm-3b-4e1t-Q8_0.bin --prompt "who was Joseph Weizenbaum?"
```
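Since StableLM-3B-4E1T is a plain base model without a prompt template, the commands shown above may be extended with explicit context and generation settings. The following sketch makes the 4096-token context length from the list above explicit - it assumes the flag names of a late-2023 llama.cpp build, so check `./main --help` for yours:

```
# plain text completion using the full 4096-token context window
# (-c sets the context size, -n limits the number of generated tokens)
./main -m stablelm-3b-4e1t-Q8_0.bin \
  -c 4096 -n 256 \
  --temp 0 --top-k 4 \
  --prompt "who was Joseph Weizenbaum?"
```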
## Conversion Details ##

Conversion was done using a Docker container based on `python:3.10.13-slim-bookworm`.

After downloading the original model files into a separate directory, the container was started with

```
docker run --interactive \
  --mount type=bind,src=<path-to-model>,dst=/llm \
  python:3.10.13-slim-bookworm
```

where `<path-to-model>` was the path to the folder containing the downloaded model.

Within the container's terminal, the following commands were issued:

```
apt-get update
apt-get install build-essential git -y

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

## Important: uncomment the make command that fits your host computer!

## on Apple Silicon machines: (see https://github.com/ggerganov/llama.cpp/issues/1655)
# UNAME_M=arm64 UNAME_p=arm LLAMA_NO_METAL=1 make
## otherwise
# make

python3 -m pip install -r requirements.txt
pip install torch transformers # see https://github.com/ggerganov/llama.cpp/issues/3344

python3 convert-hf-to-gguf.py /llm
mv /llm/ggml-model-f16.gguf /llm/stablelm-3b-4e1t.gguf

# the following command is just an example, modify it as needed
./quantize /llm/stablelm-3b-4e1t.gguf /llm/stablelm-3b-4e1t_Q3_K_M.gguf q3_k_m
```

The `quantize` call above produces a single variant; a sketch that generates all of the quantizations listed under "Files" in one go can be found at the end of this card.

After conversion, the mounted folder (which originally contained only the model) also holds all the conversions. The container itself may then be safely deleted - the converted files will remain on disk.

## License ##

The original model card states: "_Model checkpoints are licensed under the Creative Commons license ([CC BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/)). Under this license, you must give [credit](https://creativecommons.org/licenses/by/4.0/#) to Stability AI, provide a link to the license, and [indicate if changes were made](https://creativecommons.org/licenses/by/4.0/#). You may do so in any reasonable manner, but not in any way that suggests the Stability AI endorses you or your use._"

So, in order to be fair and give credit to whom it belongs:

* the original model was created and published by [Stability AI](https://huggingface.co/stabilityai)
* besides quantization, no changes were applied to the model itself
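As mentioned in the "Conversion Details" section, here is a sketch for producing all of the quantizations listed under "Files" in one go. It assumes the same container setup and the f16 GGUF file created above; the type names passed to `./quantize` are those of a late-2023 llama.cpp build (8-bit quantization is exposed there as `q8_0`):

```
# quantize the f16 model into all variants offered in this repository
# (adjust the output names and extensions as needed)
for TYPE in q3_k_m q4_k_m q5_k_m q6_k q8_0; do
  ./quantize /llm/stablelm-3b-4e1t.gguf \
    "/llm/stablelm-3b-4e1t-${TYPE^^}.gguf" "$TYPE"
done
```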