izumilab committed on
Commit
058fed1
·
1 Parent(s): ca3a700

add README.md

Files changed (1)
  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ language: ja
+ license: cc-by-sa-4.0
+ datasets:
+ - wikipedia
+ widget:
+ - text: 東京大学で[MASK]の研究をしています。
+ ---
+
+ # BERT small Japanese finance
+
+ This is a [BERT](https://github.com/google-research/bert) model pretrained on Japanese text.
+
+ The code for pretraining is available at [retarfi/language-pretraining](https://github.com/retarfi/language-pretraining/tree/v1.0).
+
+ ## Model architecture
+
+ The model architecture is the same as BERT small in the [original ELECTRA paper](https://arxiv.org/abs/2003.10555): 12 layers, 256-dimensional hidden states, and 4 attention heads.
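As a rough illustration of the hyperparameters above, the following sketch records them in a small Python dataclass and derives the per-head dimension. The class name `BertSmallConfig` and its field names are hypothetical, chosen for this example, and are not taken from the pretraining code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BertSmallConfig:
    """Hypothetical container for the hyperparameters stated in this README."""

    num_layers: int = 12      # Transformer encoder layers
    hidden_size: int = 256    # dimensions of hidden states
    num_heads: int = 4        # attention heads per layer

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits the hidden state evenly across heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


config = BertSmallConfig()
print(config.head_dim)  # 64
```

With 256 hidden dimensions shared by 4 heads, each head works on a 64-dimensional slice.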
+
+ ## Training Data
+
+ The model is trained on the Japanese version of Wikipedia.
+
+ The training corpus is generated from the Wikipedia dump file as of June 1, 2021.
+
+ The corpus file is 2.9 GB and consists of approximately 20M sentences.
+
+ ## Tokenization
+
+ The texts are first tokenized by MeCab with the IPA dictionary and then split into subwords by the WordPiece algorithm.
+
+ The vocabulary size is 32768.
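To illustrate the second stage, here is a minimal sketch of greedy longest-match-first WordPiece splitting, applied to words that a morphological analyzer such as MeCab has already separated. The toy vocabulary, the `##` continuation prefix, the `[UNK]` token, and the function name `wordpiece_split` are assumptions made for this example; the actual 32768-entry vocabulary and tokenizer settings are not shown here.

```python
def wordpiece_split(word, vocab, unk_token="[UNK]"):
    """Split one pre-tokenized word into subwords by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark a word-internal piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position, so the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (hypothetical); the real one has 32768 entries.
toy_vocab = {"東京", "##大学", "研究", "大学", "##所"}

# Words as a morphological analyzer might emit them (assumed segmentation).
words = ["東京大学", "研究所", "金融"]
tokens = [piece for w in words for piece in wordpiece_split(w, toy_vocab)]
print(tokens)  # ['東京', '##大学', '研究', '##所', '[UNK]']
```

A word with no matching pieces, such as `金融` under this toy vocabulary, maps to the unknown token as a whole.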
+
+ ## Training
+
+ The model is trained with the same configuration as BERT small in the [original ELECTRA paper](https://arxiv.org/abs/2003.10555): 128 tokens per instance, 128 instances per batch, and 1.45M training steps.
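The batch configuration above implies a simple upper bound on the number of tokens processed during pretraining. The short calculation below assumes every instance is filled to the full 128 tokens, which this README does not state, so the result is an upper bound rather than a measured figure.

```python
tokens_per_instance = 128
instances_per_batch = 128
training_steps = 1_450_000  # 1.45M steps

# Tokens in one batch if every instance is filled to its maximum length.
tokens_per_batch = tokens_per_instance * instances_per_batch

# Upper bound on tokens seen over the whole run (padding included).
total_tokens = tokens_per_batch * training_steps

print(tokens_per_batch)  # 16384
print(total_tokens)      # 23756800000
```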
+
+ ## Citation
+
+ **There will be another paper for this pretrained model. Be sure to check here again when you cite.**
+
+ ```
+ @inproceedings{bert_electra_japanese,
+   title = {Construction and Validation of a Pre-Trained Language Model
+            Using Financial Documents},
+   author = {Masahiro Suzuki and Hiroki Sakaji and Masanori Hirano and Kiyoshi Izumi},
+   month = {oct},
+   year = {2021},
+   booktitle = {Proceedings of JSAI Special Interest Group on Financial Informatics (SIG-FIN) 27}
+ }
+ ```
+
+ ## Licenses
+
+ The pretrained model is distributed under the terms of the [Creative Commons Attribution-ShareAlike 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license.
+
+ ## Acknowledgments
+
+ This work was supported by JSPS KAKENHI Grant Number JP21K12010.