Request: Quants for 3 new dolphin 2.8 7B merges
Just made 3 merged models. All of them are dolphin models, and 2 are merges with the new WizardLM 2 7B but at different strengths/bases. Would love your help to quantize them if you've got the time. thanks!!
https://huggingface.co/Noodlz/WizardLaker-7B
https://huggingface.co/Noodlz/DolphinLake-7B
https://huggingface.co/Noodlz/Dolph-Lund-Wizard-7B
Hmm, I already had dolphinlake in my queue tonight but deleted it because of some problem I can't recall.
I'll add all of them and see what happens :)
Maybe I added dolphinlake before it was uploaded, because it survived gguf conversion fine so far this time. They are assigned to a server and will show up within the next few hours probably. Normally the quants should just work, but in rare cases the vocabulary detection of llama.cpp fails and it generates garbage. If that happens, just poke me and I will redo them. Also, I normally don't do imatrix quants for 7bs by default, but if you want them, I'll be happy to provide them.
awesome thanks man! yea we probably don’t need the imats since they’re small enough that most people can run them ok. thanks again man!
No problem, we are all in it together. If you want something quantized in the future, don't hesitate to ask :)
oh hey, actually i just decided to try out a self-merge of Llama 3. problem is i can't seem to quantize it myself. not sure what i'm doing wrong. uploading now but it's like 243 gigs so might take a bit. any tips so i can do a local one to test it out first?
I have almost zero practical experience with llama 3 yet. I have quantized a few already, but not tested any.
But running convert.py, quantize, and then e.g. "main -m xxx.gguf -p hi," will give some confidence - if that works, it's probably correct (some models need better prompts, but most give a good output). If it survives those three, any remaining problems must be subtle. For llama 3, no arcane switches or anything should be needed.
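The three steps above can be chained in a small Python helper. This is only a rough sketch under assumptions: it assumes a llama.cpp checkout from around this time with convert.py, quantize, and main all sitting in one directory, and the directory names, the Q4_K_M quant type, the "-n 32" token limit, and the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def smoke_test_commands(llama_dir, model_dir, work_dir, quant_type="Q4_K_M"):
    """Build the convert -> quantize -> main commands as argument lists.

    All paths and the quant type are illustrative assumptions.
    """
    llama_dir, work_dir = Path(llama_dir), Path(work_dir)
    f16 = work_dir / "model-f16.gguf"
    quant = work_dir / f"model-{quant_type}.gguf"
    return [
        # 1. HF checkpoint -> f16 GGUF
        ["python", str(llama_dir / "convert.py"), str(model_dir),
         "--outfile", str(f16), "--outtype", "f16"],
        # 2. f16 GGUF -> quantized GGUF
        [str(llama_dir / "quantize"), str(f16), str(quant), quant_type],
        # 3. Generate a few tokens from a trivial prompt
        [str(llama_dir / "main"), "-m", str(quant), "-p", "hi,", "-n", "32"],
    ]


def run_smoke_test(commands):
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling run_smoke_test(smoke_test_commands("llama.cpp", "models/my-merge", "out")) would then stop at the first step that exits with an error. The text printed by the last step still has to be read by a human.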
ah got it. looks like this fixed it for me. https://github.com/ggerganov/llama.cpp/issues/6690
basically i had to add --vocab-type bpe to the command line.
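For anyone hitting the same thing, here is roughly where that flag goes, written as a tiny Python helper that assembles the convert.py call. It's a sketch, not the exact command that was run: the model path, the output filename, and the helper name are placeholders, and only the --vocab-type bpe part comes from the linked issue.

```python
def convert_command(model_dir, outfile, vocab_type=None):
    """Assemble a convert.py invocation (paths are placeholders)."""
    cmd = ["python", "convert.py", model_dir, "--outfile", outfile]
    if vocab_type is not None:
        # Force the vocabulary type instead of relying on auto-detection.
        cmd += ["--vocab-type", vocab_type]
    return cmd


# Print the command line with the BPE vocabulary forced.
print(" ".join(convert_command("path/to/llama3-merge", "merge-f16.gguf", vocab_type="bpe")))
```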
the 70B self merge didnt turn out well though. so scrapping that merge for now. may do another this weekend =)
Well, the easy cases are where convert.py stops with an error message. The bad cases are where it runs without issue and just converts garbage.
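For the silent cases, a crude automated check on the generated text can catch the most obvious failures. The sketch below is only a heuristic I'm making up for illustration (the function name and both thresholds are arbitrary assumptions): it flags empty output, output full of unprintable or replacement characters, and output that is mostly one repeated word. Passing it does not prove the quant is good.

```python
from collections import Counter


def looks_like_garbage(text, max_bad_ratio=0.2, max_repeat_ratio=0.6):
    """Crude heuristic; thresholds are arbitrary assumptions."""
    stripped = text.strip()
    if not stripped:
        return True  # no output at all
    # Count replacement characters and other unprintable characters.
    bad = sum(
        1 for ch in stripped
        if ch == "\ufffd" or not (ch.isprintable() or ch.isspace())
    )
    if bad / len(stripped) > max_bad_ratio:
        return True
    # Flag outputs dominated by a single repeated word.
    words = stripped.split()
    if len(words) >= 8:
        top_count = Counter(words).most_common(1)[0][1]
        if top_count / len(words) > max_repeat_ratio:
            return True
    return False
```

Something like looks_like_garbage(output_of_main) could run after the test prompt, but fluent-looking nonsense will still slip through, so reading a few generations by hand remains the real test.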
yea... so far it's the latter, sorta. like i get stopped with an error message, i power through with a workaround, and then i get garbage lol. gonna take another route though and experiment with maybe finetuning first, then merging