This is a Unigram tokenizer trained on the Wikitext dataset. See the train_unigram.py
script in this repository for details on how it was trained.
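
For readers unfamiliar with Unigram tokenization, the sketch below illustrates the general idea of how such a model segments text: each vocabulary piece carries a log-probability, and the segmentation with the highest total score is chosen (here via a simple Viterbi search). This is only an illustration in plain Python, not the code in train_unigram.py. The vocabulary `TOY_VOCAB`, its probabilities, and the function name `viterbi_segment` are hypothetical and are not taken from the trained tokenizer.

```python
import math

# Hypothetical toy vocabulary: piece -> log-probability.
# These values are invented for illustration only.
TOY_VOCAB = {
    "token": math.log(0.05),
    "izer": math.log(0.04),
    "iz": math.log(0.02),
    "er": math.log(0.03),
    "t": math.log(0.01),
    "o": math.log(0.01),
    "k": math.log(0.01),
    "e": math.log(0.01),
    "n": math.log(0.01),
    "i": math.log(0.01),
    "z": math.log(0.01),
    "r": math.log(0.01),
}


def viterbi_segment(text, vocab):
    """Return (pieces, score) for the highest-scoring segmentation of text."""
    n = len(text)
    max_len = max(len(piece) for piece in vocab)
    # best_score[i] is the best total log-probability of any segmentation of text[:i].
    best_score = [-math.inf] * (n + 1)
    best_score[0] = 0.0
    # back[i] is the start index of the last piece in that best segmentation.
    back = [0] * (n + 1)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in vocab and best_score[start] > -math.inf:
                score = best_score[start] + vocab[piece]
                if score > best_score[end]:
                    best_score[end] = score
                    back[end] = start

    if best_score[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Walk the back-pointers from the end of the text to recover the pieces.
    pieces = []
    pos = n
    while pos > 0:
        start = back[pos]
        pieces.append(text[start:pos])
        pos = start
    pieces.reverse()
    return pieces, best_score[n]


pieces, score = viterbi_segment("tokenizer", TOY_VOCAB)
print(pieces)  # ['token', 'izer']
```

With this toy vocabulary, "token" + "izer" wins because its combined probability (0.05 × 0.04) is higher than that of any segmentation built from shorter pieces. A real Unigram tokenizer also handles unknown characters and pre-tokenization, which this sketch leaves out.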