Update README.md
README.md CHANGED
@@ -36,10 +36,10 @@ model = AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt2-medium")
 A 24-layer, 1024-hidden-size transformer-based language model.
 
 # Training
-The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) to optimize a traditional language modelling objective on 8
+The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) to optimize a traditional language modelling objective on 8\*V100 GPUs for around 30 days. It reaches around 18 perplexity on a chosen validation set from the same data.
 
 # Tokenization
-The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer, the vocabulary
+The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer; the vocabulary was trained on the Japanese Wikipedia using the official sentencepiece training script.
 
 # License
 [The MIT license](https://opensource.org/licenses/MIT)
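
The diff touches the Training and Tokenization sections, and only the hunk header shows how the checkpoint is loaded. A minimal usage sketch follows, assuming the `transformers` library: the `AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt2-medium")` call is taken from the hunk header, while the `T5Tokenizer` class, the sample sentences, and the generation settings are illustrative assumptions, not part of the commit.

```python
# Minimal sketch: load the checkpoint named in the hunk header and exercise
# the sentencepiece-based tokenizer described in the Tokenization section.
# T5Tokenizer and the generation settings are assumptions, not from the diff.
from transformers import T5Tokenizer, AutoModelForCausalLM

tokenizer = T5Tokenizer.from_pretrained("rinna/japanese-gpt2-medium")
model = AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt2-medium")

# Inspect the sentencepiece segmentation of a Japanese sentence.
print(tokenizer.tokenize("こんにちは、世界。"))

# Generate a short continuation from a Japanese prompt.
inputs = tokenizer("日本で一番高い山は", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```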
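
The new Training line cites around 18 perplexity on a held-out split of the training data. Perplexity for a causal language model is the exponential of the mean next-token cross-entropy, which the sketch below computes; the sample text is a placeholder, since the actual validation set is not specified in the diff.

```python
# Sketch of the perplexity figure cited in the Training section:
# perplexity = exp(mean next-token cross-entropy). The sample text is a
# placeholder for the unspecified validation split.
import torch
from transformers import T5Tokenizer, AutoModelForCausalLM

tokenizer = T5Tokenizer.from_pretrained("rinna/japanese-gpt2-medium")
model = AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt2-medium")
model.eval()

enc = tokenizer("吾輩は猫である。名前はまだ無い。", return_tensors="pt")
with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # cross-entropy loss over the (internally shifted) token sequence.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.1f}")
```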