You should try training a model with 2B parameters and context length 32000.

#3 by win10 - opened

Wishing you a happy and successful new year.

I am more interested in why @PY007 settled on 1.1B. How did they arrive at this number?
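
For what it's worth, the 1.1B figure falls straight out of the architecture's hyperparameters. Here is a minimal back-of-the-envelope sketch, assuming a Llama-style config matching what the TinyLlama repo publishes (22 layers, hidden size 2048, SwiGLU FFN size 5632, 32 query heads with 4 KV heads, 32000-token vocabulary) — check the repo's config.json for the exact values before relying on these:

```python
def llama_param_count(layers, hidden, ffn, n_heads, n_kv_heads, vocab):
    """Rough parameter count for a Llama-style decoder (biases omitted)."""
    head_dim = hidden // n_heads
    kv_dim = n_kv_heads * head_dim
    # Attention: Q and O projections are hidden x hidden; with grouped-query
    # attention, K and V project to n_kv_heads * head_dim instead.
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim
    # SwiGLU MLP: gate and up projections (hidden -> ffn) plus down (ffn -> hidden).
    mlp = 3 * hidden * ffn
    # Two RMSNorm weight vectors per layer.
    per_layer = attn + mlp + 2 * hidden
    # Token embeddings plus an untied LM head, and the final RMSNorm.
    return layers * per_layer + 2 * vocab * hidden + hidden

total = llama_param_count(layers=22, hidden=2048, ffn=5632,
                          n_heads=32, n_kv_heads=4, vocab=32000)
print(f"{total / 1e9:.2f}B parameters")  # -> 1.10B
```

So the architecture itself accounts for the number; why the authors targeted that size in the first place (hardware budget, training-token budget, or something else) is a question only they can answer.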
