Any chance for a Sam 65b?
Samantha is my favorite model to talk to.
I have really enjoyed this model and I really like how the 33b performs - I will be keeping my eye out for the 65b!
I see you just posted the samantha-data, so that is very interesting (Thank you!)
I read your post about how Samantha was created (using FastChat, etc.) for the smaller models - will you be publishing a write-up on how you were able to do it with QLoRA for the 65b?
I haven't yet - I think first I'm gonna do Samantha-Falcon-40b, which they say performs better than Llama-65b.
@ehartford check out this new model: guanaco-65B-GPTQ. It was trained in less time with less memory. Hopefully in your next training run it will save you time and resources.
Oh yeah, I am considering QLoRA. I still need to find a solid solution for training that way - FastChat doesn't support it yet, and I use FastChat for training these conversational datasets.
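For anyone curious what a QLoRA setup outside FastChat looks like: it pairs bitsandbytes 4-bit quantization with a PEFT LoRA adapter. A minimal configuration sketch - the base model name and every hyperparameter here are illustrative assumptions, not the actual Samantha recipe:

```python
# QLoRA configuration sketch (transformers + peft + bitsandbytes).
# Model name and hyperparameters are illustrative assumptions only.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization - this is what lets a 65b base model
# fit in memory for fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",          # hypothetical base model
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter: only these small low-rank matrices are trained;
# the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From there, training proceeds with an ordinary Trainer loop over the conversational dataset, so the missing piece is really just wiring FastChat's data formatting into a PEFT-aware trainer.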