- **vLLM error: "Blockwise quantization only supports 16/32-bit floats, but got torch.uint8"** · 6 comments · #3 opened 9 days ago by ChloeHuang1
- **How to convert this model to GGUF?** · 2 comments · #2 opened 9 days ago by degot
- **The `tokenizer_config.json` is missing the `chat_template` jinja?** · 1 comment · #1 opened 20 days ago by ubergarm