4 bit version
#9
by
KnutJaegersberg
- opened
This is a huge download; I'd like a 4-bit version. I've already done this with bitsandbytes (bnb) on the SFT version.
Could you save the weights with double quantization in bnb and upload them?
Transformers now supports saving bitsandbytes 4-bit weights: simply load the model in 4-bit and then call model.save_pretrained("folder").
That way one can run the model with 48 GB of VRAM.