Update README.md
Bioformer is a lightweight BERT model for biomedical text mining.
Bioformer has 8 layers (transformer blocks) with a hidden embedding size of 512 and 8 self-attention heads. Its total number of parameters is 42,820,610.
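
For concreteness, these hyperparameters map onto a standard `transformers` `BertConfig` roughly as in the sketch below. This is an illustration, not the official configuration: `intermediate_size`, `max_position_embeddings`, and `type_vocab_size` are assumed BERT defaults rather than values stated in this README. Under those assumptions, a `BertForPreTraining` model (i.e., including the MLM and NSP heads) reproduces the parameter count quoted above.

```python
from transformers import BertConfig, BertForPreTraining

# Sketch of a config matching the numbers above. intermediate_size,
# max_position_embeddings and type_vocab_size are assumed standard-BERT
# defaults, not values taken from this README.
config = BertConfig(
    vocab_size=32768,             # 2**15, see "Vocabulary of Bioformer" below
    hidden_size=512,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=2048,       # assumed 4 x hidden_size (BERT convention)
    max_position_embeddings=512,  # assumed BERT default
    type_vocab_size=2,            # assumed BERT default
)

# BertForPreTraining includes the MLM and NSP heads; with the assumptions
# above, the total should match the 42,820,610 figure quoted here.
model = BertForPreTraining(config)
print(sum(p.numel() for p in model.parameters()))
```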
The usage of Bioformer is the same as that of a standard BERT model. The documentation of BERT can be found [here](https://huggingface.co/docs/transformers/model_doc/bert).
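
The snippet below is a minimal sketch of that usage; the model identifier `bioformers/bioformer-8L` is an assumed placeholder and should be replaced with the actual name of this checkpoint on the Hugging Face Hub.

```python
from transformers import AutoModel, AutoTokenizer

# "bioformers/bioformer-8L" is an assumed placeholder identifier; replace it
# with the actual name of this checkpoint on the Hugging Face Hub.
model_name = "bioformers/bioformer-8L"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a biomedical sentence exactly as with any BERT checkpoint.
inputs = tokenizer("Aspirin irreversibly inhibits cyclooxygenase.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 512)
```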
## Vocabulary of Bioformer
Bioformer uses a cased WordPiece vocabulary trained from a biomedical corpus, which included all PubMed abstracts (33 million, as of Feb 1, 2021) and 1 million PMC full-text articles. PMC has 3.6 million articles, but we down-sampled them to 1 million so that the total sizes of the PubMed abstracts and the PMC full-text articles are approximately equal. To mitigate the out-of-vocabulary issue and include special symbols (e.g., male and female symbols) found in biomedical literature, we trained Bioformer's vocabulary from the Unicode text of the two resources. The vocabulary size of Bioformer is 32,768 (2^15), which is similar to that of the original BERT.
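
A short sketch of how to inspect this vocabulary with `transformers` (again, the checkpoint identifier is an assumed placeholder, not a name confirmed by this README):

```python
from transformers import AutoTokenizer

# Assumed placeholder identifier; replace with the actual checkpoint name.
tokenizer = AutoTokenizer.from_pretrained("bioformers/bioformer-8L")

print(tokenizer.vocab_size)  # expected: 32768

# The vocabulary is cased, so casing is preserved in the WordPiece pieces.
print(tokenizer.tokenize("Cyclooxygenase-2 (COX-2) inhibitors"))
```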