tianyuz committed on
Commit
2c51d12
·
1 Parent(s): 1954765

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -22,6 +22,12 @@ This repository provides a base-sized Japanese RoBERTa model. The model is provi
22
 
23
  # How to use the model
24
 
 
 
 
 
 
 
25
  *NOTE:* Use `T5Tokenizer` to instantiate the tokenizer.
26
 
27
  ~~~~
@@ -37,7 +43,7 @@ model = RobertaModel.from_pretrained("rinna/japanese-roberta-base", use_auth_tok
37
  A 12-layer, 768-hidden-size transformer-based masked language model.
38
 
39
  # Training
40
- The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/jawiki/) to optimize a masked language modelling objective on 8*V100 GPUs for around 15 days.
41
 
42
  # Tokenization
43
  The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer; the vocabulary was trained on Japanese Wikipedia using the official sentencepiece training script.
 
22
 
23
  # How to use the model
24
 
25
+ Since this is a private repo, first log in to your Hugging Face account from the command line:
26
+
27
+ ~~~
28
+ transformers-cli login
29
+ ~~~
30
+
31
  *NOTE:* Use `T5Tokenizer` to instantiate the tokenizer.
32
 
33
  ~~~~
 
43
  A 12-layer, 768-hidden-size transformer-based masked language model.
44
 
45
  # Training
46
+ The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/jawiki/) to optimize a masked language modelling objective on 8*V100 GPUs for around 15 days. It reaches ~3.9 perplexity on a dev set sampled from CC-100.
47
 
48
  # Tokenization
49
  The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer; the vocabulary was trained on Japanese Wikipedia using the official sentencepiece training script.
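
The perplexity figure added to the Training section can be read as follows. This is a minimal sketch, assuming perplexity here means the exponential of the mean negative log-likelihood (natural log) over the scored tokens; the README does not say exactly how the ~3.9 value was computed. The `perplexity` helper and the example log-probabilities are hypothetical, for illustration only.

~~~python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for a list of natural-log probabilities.

    Assumption: each entry is the log-probability the model assigned to the
    correct token at one scored (e.g. masked) position.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical values, not taken from the model or its dev set.
example_log_probs = [math.log(0.5), math.log(0.25), math.log(0.125)]
print(perplexity(example_log_probs))  # approximately 4

# Under the same assumption, a perplexity of 3.9 corresponds to a mean
# cross-entropy loss of ln(3.9), roughly 1.36 nats per scored token.
print(math.log(3.9))
~~~

Under this reading, lower is better: a perplexity near 3.9 means the model is, on average, about as uncertain as a uniform choice among roughly four candidate tokens.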