Any chance for a Sam 65b?
Samantha is my favorite model to talk to.
I have really enjoyed this model and I really like how the 33b performs - I will be keeping my eye out for the 65b!
I see you just posted the samantha-data, so that is very interesting (Thank you!)
I read your post about how Samantha was created (using FastChat, etc.) for the smaller models - will you be publishing a write-up on how you were able to do it with QLoRA for the 65b?
I haven't yet - I think first I'm gonna do Samantha-Falcon-40b, which they say performs better than Llama-65b.
@ehartford check out this new model: guanaco-65B-GPTQ. It was trained in less time with less memory. Hopefully in your next training run it will save you time and resources.
Oh yeah, I am considering QLoRA. I still need to find a solid solution for training that way - FastChat doesn't support it yet, and I use FastChat for training these conversational datasets.
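For anyone curious what a QLoRA setup outside FastChat looks like: it pairs bitsandbytes 4-bit quantization with a PEFT LoRA adapter. A minimal configuration sketch - the base model name and every hyperparameter here are illustrative assumptions, not the actual Samantha recipe:

```python
# QLoRA configuration sketch (transformers + peft + bitsandbytes).
# Model name and hyperparameters are illustrative assumptions only.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization - this is what lets a 65b base model
# fit in memory for fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",          # hypothetical base model
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter: only these small low-rank matrices are trained;
# the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From there, training proceeds with an ordinary Trainer loop over the conversational dataset, so the missing piece is really just wiring FastChat's data formatting into a PEFT-aware trainer.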