2-bit quantization and 128 groupsize for LLaMA 7B

This is a Chinese instruction-tuned LoRA checkpoint based on llama-13B from this repo's work. It consumes approximately 4 GB of GPU memory.

"input":the mean of life is
"output":the mean of life is a good., and it’s not to be worth in your own homework for an individual who traveling on my back with me our localities that you can do some work at this point as well known by us online gaming sites are more than 10 years old when i was going out there around here we had been written about his time were over all sited down after being spent from most days while reading between two weeks since I would have gone before its age site;...