Merging Deltas
How do you merge the deltas with the original Llama weights?
(This one is done already, but I would like to do the 13B myself since I have the files.)
I've updated the README.md with details of how I did the merge, and links to the merged weights in HF format.
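For anyone wanting the general idea before opening the README: a minimal sketch, assuming the deltas are additive (merged = original + delta for every parameter), which is a common scheme but not confirmed here for this model. `merge_deltas` and the toy parameter names are hypothetical, and plain lists stand in for real tensors; the actual steps used are the ones in the README.

```python
def merge_deltas(base, delta):
    """Add each delta tensor to the matching base tensor.

    Hypothetical helper: assumes additive deltas and that both
    checkpoints use identical parameter names and shapes.
    """
    if base.keys() != delta.keys():
        mismatched = sorted(base.keys() ^ delta.keys())
        raise KeyError(f"parameter names differ: {mismatched}")

    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise sum recovers the fine-tuned weights.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy stand-ins for the original weights and the published deltas.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -1.0], "layer.bias": [0.5]}
merged = merge_deltas(base, delta)
```

With real checkpoints the same loop would run over the model's state dict, and the name/shape check is worth keeping, since a mismatched base model otherwise produces garbage weights silently.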
I'm going to be doing 13B myself a bit later, so if you can wait a few hours then I'll have it uploaded as well.
Have you been able to get these GPTQ files working yet?
I haven't experimented with this one yet! I just saw that you had merged the weights and thought I would ask how it was done ^^;
Thanks for uploading, and thanks for sharing how you did it =]
You're welcome!
Unfortunately I cannot get the GPTQ files working at all at the moment. I'm still working on that.
But the unquantized GGML version works very well in llama.cpp and produces very good output.
I'll ping when I've been able to put the 13B up - won't be too long.