30B 4-bit LoRA please
#1 opened by Gumibit
Hi, it would be great if we could have the 4-bit flavor of this model.
Thank you.
Hey, I'll work on it.
Currently I am benchmarking the models. The ones trained without LoRA perform much better, so maybe load them in 8-bit instead? medalpaca-7b quantized for inference should outperform the 30b model.
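If it helps, here is a minimal sketch of 8-bit loading with transformers and bitsandbytes; the model ID `medalpaca/medalpaca-7b` and the prompt are illustrative assumptions, not part of the thread:

```python
# Minimal sketch: load medalpaca-7b in 8-bit for inference.
# Assumes `transformers`, `accelerate`, and `bitsandbytes` are installed;
# the model ID and prompt below are placeholders for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "medalpaca/medalpaca-7b"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # bitsandbytes int8 quantization
    device_map="auto",   # place layers on available GPUs automatically
)

prompt = "What are the symptoms of diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```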
In the meantime, please raise an issue on GitHub so I won't forget to do it.
Thank you, I will try your suggestions.
Gumibit changed discussion status to closed