This repository contains only the model file. The tokenizer is separate from the model and is not included. You can use any compatible tokenizer from Transformers, such as DistilBertTokenizer or BertTokenizer.

---
language:
- English
tags:
- Text
- Sequence-Classification
- Sarcasm
- DistilBert
datasets:
- Kaggle Dataset on News Headline Sarcasm
- https://www.kaggle.com/rmisra/news-headlines-dataset-for-sarcasm-detection
---
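
Because the tokenizer has to be loaded separately from the weights, a minimal sketch of pairing the two might look like the following. It rests on assumptions not stated in this card: that the checkpoint was fine-tuned from an uncased DistilBERT (so the stock `distilbert-base-uncased` vocabulary fits), and that you supply this repository's id yourself. The function names are illustrative only.

```python
from typing import Sequence

# Assumption: the checkpoint was fine-tuned from an uncased DistilBERT, so the
# stock "distilbert-base-uncased" vocabulary matches. Verify before relying on it.
DEFAULT_TOKENIZER_ID = "distilbert-base-uncased"


def load_model_and_tokenizer(model_id: str, tokenizer_id: str = DEFAULT_TOKENIZER_ID):
    """Load the classifier weights and a separately sourced tokenizer."""
    # Imported lazily so the pure helper below works without transformers installed.
    from transformers import (
        DistilBertForSequenceClassification,
        DistilBertTokenizer,
    )

    tokenizer = DistilBertTokenizer.from_pretrained(tokenizer_id)
    model = DistilBertForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return model, tokenizer


def argmax_index(scores: Sequence[float]) -> int:
    """Return the position of the largest score (the predicted class index)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def classify(headline: str, model, tokenizer) -> int:
    """Return the predicted class index for a single headline."""
    import torch

    inputs = tokenizer(headline, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # Pass only the inputs DistilBERT accepts; a BertTokenizer may also
        # return token_type_ids, which this model does not take.
        logits = model(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
        ).logits  # shape: (1, num_labels)
    return argmax_index(logits[0].tolist())
```

To try it, call `load_model_and_tokenizer` with this repository's id and pass the result to `classify` along with a headline. The card does not say which class index means "sarcastic", so check the model config's `id2label` mapping before interpreting the result.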