# Felladrin/Llama-160M-Chat-v1-GGUF
Quantized GGUF model files for Llama-160M-Chat-v1 from Felladrin
| Name | Quant method | Size |
| --- | --- | --- |
| llama-160m-chat-v1.fp16.gguf | fp16 | 326.58 MB |
| llama-160m-chat-v1.q2_k.gguf | q2_k | 77.23 MB |
| llama-160m-chat-v1.q3_k_m.gguf | q3_k_m | 87.54 MB |
| llama-160m-chat-v1.q4_k_m.gguf | q4_k_m | 104.03 MB |
| llama-160m-chat-v1.q5_k_m.gguf | q5_k_m | 119.04 MB |
| llama-160m-chat-v1.q6_k.gguf | q6_k | 135.00 MB |
| llama-160m-chat-v1.q8_0.gguf | q8_0 | 174.33 MB |
## Original Model Card
A Llama Chat Model of 160M Parameters
- Base model: JackFram/llama-160m
- Datasets:
- Availability in other ML formats:
### Recommended Prompt Format
The recommended prompt format is as follows:
```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
```
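As a minimal sketch, the template above can be filled in with plain string formatting; the `format_chatml` helper below is illustrative, not part of the model's tooling:

```python
def format_chatml(system_message: str, user_message: str) -> str:
    """Build a ChatML-style prompt matching the recommended template."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# The resulting string is passed to the model as-is; generation should
# stop when the model emits the <|im_end|> token.
prompt = format_chatml("You are a helpful assistant.", "What is GGUF?")
print(prompt)
```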
### Recommended Inference Parameters
For best results, prefer contrastive search for inference:

```yaml
penalty_alpha: 0.5
top_k: 5
```