tmanabe committed
Commit c2feee1 · verified · 1 Parent(s): 93bd481

Make `tokenizer_class` the same as the actual class name


Hi. Recent versions of the transformers library log a warning if `tokenizer_class` in `tokenizer_config.json` differs from the name of the class actually used to load the tokenizer.
Unless there is a specific reason for the mismatch, could you make them equal? Thanks.

For reference: https://github.com/huggingface/transformers/blob/v4.44.0/src/transformers/tokenization_utils_base.py#L2398-L2404
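
To illustrate the kind of comparison behind that warning, here is a minimal sketch. It is not the library's code: `class_name_mismatch` is a hypothetical helper, and treating the "Fast" and slow variants of a class as equivalent is an assumption about how the check behaves.

```python
import json


def class_name_mismatch(config_text: str, loaded_class_name: str) -> bool:
    """Return True if `tokenizer_class` in the config names a different class.

    Hypothetical helper for illustration only; not part of transformers.
    """
    config = json.loads(config_text)
    configured = config.get("tokenizer_class")
    if configured is None:
        return False  # nothing to compare against
    # Assumption: fast and slow variants count as the same class,
    # so the "Fast" marker is stripped before comparing.
    return configured.replace("Fast", "") != loaded_class_name.replace("Fast", "")


# The value before and after this commit, reduced to the one relevant key.
before = '{"tokenizer_class": "BertJapaneseTokenizer"}'
after = '{"tokenizer_class": "DistilBertJapaneseTokenizer"}'

print(class_name_mismatch(before, "DistilBertJapaneseTokenizer"))  # True
print(class_name_mismatch(after, "DistilBertJapaneseTokenizer"))   # False
```

With the old value the names differ, which is the situation that triggers the warning; with the new value they match.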

Files changed (1)
  1. tokenizer_config.json +1 -1
--- tokenizer_config.json
+++ tokenizer_config.json
@@ -17,7 +17,7 @@
   "__type":"AddedToken"
   },
   "tokenize_chinese_chars":false,
-  "tokenizer_class": "BertJapaneseTokenizer",
+  "tokenizer_class": "DistilBertJapaneseTokenizer",
   "word_tokenizer_type": "mecab",
   "subword_tokenizer_type": "sentencepiece",
   "mecab_kwargs": {