nroggendorff posted an update Nov 5, 2024
I still think whitespace in tokenizers is so dumb.
Congrats, you just doubled your vocab size for no reason.
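
A rough sketch of the complaint, assuming a GPT-2-style byte-level BPE vocab where `Ġ` marks a leading space. The toy vocab and the `count_space_duplicates` helper are made up for illustration and are not taken from any real tokenizer:

```python
# Hypothetical toy vocabulary. "Ġ" is the leading-space marker used by
# GPT-2-style byte-level BPE; the entries themselves are invented.
toy_vocab = ["the", "Ġthe", "cat", "Ġcat", "ing", "Ġsat", "s"]


def count_space_duplicates(vocab, marker="Ġ"):
    """Count tokens that exist both with and without the leading-space marker."""
    # Tokens stored without the marker.
    bare = {tok for tok in vocab if not tok.startswith(marker)}
    # Tokens stored with the marker, with the marker stripped off.
    spaced = {tok[len(marker):] for tok in vocab if tok.startswith(marker)}
    # A word in both sets occupies two vocab slots.
    return len(bare & spaced)


duplicates = count_space_duplicates(toy_vocab)
print(f"{duplicates} of {len(toy_vocab)} entries are space-prefixed copies of another entry")
```

In a real vocab the overlap is probably well short of a literal 2x, since many word-initial tokens only ever show up with the space attached, so take "doubled" as a rant and not a measurement.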

Any alternative ideas?🤔

merges.txt :spinnyhat: