Tokenisers trained on MiniPile. The `_raw_tokenisers` folder contains the original tokenisers, trained with a vocabulary size of 320k. Every other folder contains a `transformers`-compatible tokeniser with a smaller vocabulary.
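
As a rough sketch of how the layout described above might be used, the snippet below lists the reduced-size tokeniser folders in a local checkout and loads one with `AutoTokenizer.from_pretrained`. The helper names `list_tokeniser_dirs` and `load_tokeniser` are hypothetical, and the code assumes the repository has been cloned locally and that every folder other than `_raw_tokenisers` holds one tokeniser; the actual folder names are not stated here, so none are hard-coded.

```python
from pathlib import Path

RAW_DIR = "_raw_tokenisers"  # folder name taken from the description above


def list_tokeniser_dirs(root):
    """Return the names of sub-folders of `root` assumed to hold reduced-size tokenisers."""
    root = Path(root)
    return sorted(
        p.name
        for p in root.iterdir()
        # Skip plain files, the raw 320k tokenisers, and hidden folders such as .git.
        if p.is_dir() and p.name != RAW_DIR and not p.name.startswith(".")
    )


def load_tokeniser(root, name):
    """Load one reduced-size tokeniser from a local checkout (assumed layout)."""
    # Imported lazily so the listing helper works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(str(Path(root) / name))
```

A typical call would be `load_tokeniser(checkout_path, list_tokeniser_dirs(checkout_path)[0])`, where `checkout_path` is a placeholder for wherever the repository was cloned.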