schnell committed
Commit b7a24db
1 Parent(s): efbbd44

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
```diff
@@ -50,7 +50,7 @@ output = model(**encoded_input)
 
 ### Preprocessing
 
-The texts are normalized using [neologdn](https://github.com/ikegami-yukino/neologdn), segmented into words using Juman++, and tokenized using BPE. Juman++ 2.0.0-rc3 was used for pretraining.
+The texts are normalized using [neologdn](https://github.com/ikegami-yukino/neologdn), segmented into words using [Juman++](https://github.com/ku-nlp/jumanpp), and tokenized by [BPE](https://huggingface.co/docs/tokenizers/api/models#tokenizers.models.BPE). Juman++ 2.0.0-rc3 was used for pretraining.
 
 The model was trained on 8 NVIDIA A100 GPUs.
```
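The changed line links the Hugging Face `tokenizers` BPE docs. As a rough illustration of how BPE applies learned merges to a word, here is a self-contained sketch; the merge table below is hypothetical and is not the model's actual tokenizer or vocabulary:

```python
def bpe_encode(word, merges):
    """Apply BPE merges to a word split into characters.

    `merges` is an ordered list of symbol pairs; earlier pairs have
    higher priority, mirroring how learned merges are applied.
    """
    symbols = list(word)
    rank = {pair: i for i, pair in enumerate(merges)}
    while len(symbols) > 1:
        # Find the highest-priority adjacent pair present in the word.
        pairs = [(rank.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(symbols, symbols[1:]))]
        best_rank, i = min(pairs)
        if best_rank == float("inf"):
            break  # no learned merge applies anymore
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols

# Toy merge table for illustration only.
merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(bpe_encode("lower", merges))  # ['low', 'er']
```

In the actual pipeline this step runs last, after neologdn normalization and Juman++ word segmentation, so BPE splits each segmented word into subword units.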