Training code?

by flashvenom - opened Jun 7, 2023

Jun 7, 2023

If you don't mind sharing, what code was used to train the model, both for building the dataset and for increasing the context length? On context length, have you tested how well it works beyond 2k tokens?

jondurbin

Owner Jun 7, 2023

Responded to your other issue. No, I goofed and didn't fully test. Contexts up to about 2200-2300 tokens seem to work, but yeah, total fail on longer ones.

jondurbin changed discussion status to closed Jun 7, 2023
