Make `tokenizer_class` the same as the actual class name
Hi. Recent versions of the transformers library log a warning if `tokenizer_class` in `tokenizer_config.json` differs from the name of the tokenizer class actually used to load it.
Unless there is a specific reason for the difference, could you make them equal? Thanks.
For reference: https://github.com/huggingface/transformers/blob/v4.44.0/src/transformers/tokenization_utils_base.py#L2398-L2404
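To check a config before loading it, a minimal sketch along these lines may help. It is loosely modeled on the check linked above, not copied from it: the helper name `tokenizer_class_matches` is hypothetical, and ignoring a `Fast` suffix on both names is an assumption about how the library compares them.

```python
import json
import tempfile
from pathlib import Path


def tokenizer_class_matches(config_path, actual_class_name):
    """Hypothetical helper: report whether `tokenizer_class` in a
    tokenizer_config.json agrees with the class name actually in use."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    configured = config.get("tokenizer_class")
    if configured is None:
        # No class is declared, so there is nothing to mismatch.
        return True
    # Assumption: a "Fast" suffix is ignored on both sides of the comparison.
    return configured.replace("Fast", "") == actual_class_name.replace("Fast", "")
```

Calling `tokenizer_class_matches("tokenizer_config.json", "DistilBertJapaneseTokenizer")` on the file as changed by this PR should return `True`; a `False` result would indicate a config that is likely to trigger the warning.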
tokenizer_config.json CHANGED (+1 -1)
@@ -17,7 +17,7 @@
     "__type":"AddedToken"
   },
   "tokenize_chinese_chars":false,
-  "tokenizer_class": "
+  "tokenizer_class": "DistilBertJapaneseTokenizer",
   "word_tokenizer_type": "mecab",
   "subword_tokenizer_type": "sentencepiece",
   "mecab_kwargs": {