Felladrin/Llama-68M-Chat-v1-GGUF
Quantized GGUF model files for Llama-68M-Chat-v1 from Felladrin
Name | Quant method | Size |
---|---|---|
llama-68m-chat-v1.fp16.gguf | fp16 | 136.79 MB |
llama-68m-chat-v1.q2_k.gguf | q2_k | 35.88 MB |
llama-68m-chat-v1.q3_k_m.gguf | q3_k_m | 40.66 MB |
llama-68m-chat-v1.q4_k_m.gguf | q4_k_m | 46.10 MB |
llama-68m-chat-v1.q5_k_m.gguf | q5_k_m | 51.16 MB |
llama-68m-chat-v1.q6_k.gguf | q6_k | 56.54 MB |
llama-68m-chat-v1.q8_0.gguf | q8_0 | 73.02 MB |
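
To compare the files above, here is a minimal sketch that computes how much smaller each quantized file is than the fp16 file. The sizes are copied from the table. The dictionary and function names are hypothetical, and the ratios describe file size only, not output quality.

```python
# File sizes in MB, copied from the table above.
SIZES_MB = {
    "fp16": 136.79,
    "q2_k": 35.88,
    "q3_k_m": 40.66,
    "q4_k_m": 46.10,
    "q5_k_m": 51.16,
    "q6_k": 56.54,
    "q8_0": 73.02,
}


def compression_ratios(sizes_mb: dict) -> dict:
    """Return fp16 size divided by each file's size (higher means smaller file)."""
    fp16_size = sizes_mb["fp16"]
    return {name: fp16_size / size for name, size in sizes_mb.items()}


if __name__ == "__main__":
    for name, ratio in compression_ratios(SIZES_MB).items():
        print(f"{name:8s} {SIZES_MB[name]:7.2f} MB  {ratio:.2f}x smaller than fp16")
```
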
Original Model Card:
A Llama Chat Model of 68M Parameters
- Base model: JackFram/llama-68m
- Datasets:
- Availability in other ML formats:
Recommended Prompt Format
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
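
The template above can be filled in with a small helper. This is a sketch rather than the author's own tooling: the function name `build_prompt` is hypothetical, and the trailing newline after the final `<|im_start|>assistant` line is an assumption.

```python
def build_prompt(system_message: str, user_message: str) -> str:
    """Fill the recommended prompt format with a system and a user message."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        # Assumption: end with a newline so the model continues as the assistant.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "Name three colors."))
```

The returned string is what would be passed to the model as the raw prompt; the model's reply is expected to continue after the final `assistant` line.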
Recommended Inference Parameters
penalty_alpha: 0.5
top_k: 4
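
A sketch of how these two values might be bundled for a generation call. The names `penalty_alpha` and `top_k` match the arguments that Hugging Face `transformers` accepts in `generate` for contrastive search; that they are meant for that library, and for the original (non-GGUF) model, is an assumption. `max_new_tokens` and the helper name are hypothetical additions for illustration.

```python
def recommended_generation_kwargs(max_new_tokens: int = 64) -> dict:
    """Return the recommended inference parameters as keyword arguments."""
    return {
        "penalty_alpha": 0.5,  # from the model card
        "top_k": 4,            # from the model card
        # Assumption: not part of the recommendation, only a length limit.
        "max_new_tokens": max_new_tokens,
    }


if __name__ == "__main__":
    kwargs = recommended_generation_kwargs()
    print(kwargs)
    # With a transformers model and tokenized inputs already loaded, the call
    # would look roughly like: model.generate(**inputs, **kwargs)
```

Whether a given GGUF runtime exposes an equivalent of `penalty_alpha` is not stated in the card, so check the runtime's sampling options before relying on it.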