A friendly reminder: change max_seq_len in text-generation-webui, otherwise you'll get a CUDA out-of-memory error.
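
For a rough feel of why max_seq_len matters, here is a back-of-the-envelope sketch. It assumes the loader allocates a key/value cache that grows linearly with max_seq_len, which may not match every backend. The function name `kv_cache_bytes` and the model dimensions (32 layers, 32 KV heads, head size 128, fp16) are illustrative assumptions, not values read from text-generation-webui.

```python
def kv_cache_bytes(max_seq_len, n_layers, n_kv_heads, head_dim,
                   bytes_per_elem=2, batch_size=1):
    """Rough KV cache size in bytes (hypothetical helper).

    Counts one key and one value vector per layer, per KV head,
    per position. Ignores model weights, activations and any
    framework overhead, so real usage will be higher.
    """
    return (2 * batch_size * n_layers * n_kv_heads
            * head_dim * max_seq_len * bytes_per_elem)


if __name__ == "__main__":
    # Assumed dimensions for illustration only (fp16 = 2 bytes per element).
    for seq_len in (2048, 4096, 16384):
        gib = kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32,
                             head_dim=128) / 1024**3
        print(f"max_seq_len={seq_len}: ~{gib:.1f} GiB for the cache alone")
```

Under these assumptions the cache costs about 0.5 MiB per token, so a larger max_seq_len eats VRAM quickly on top of the weights.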