Charangan committed
Commit 315cdfc
1 Parent(s): 6aef6fc

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -9,7 +9,7 @@ tags:

# MedBERT Model

- MedBERT is a newly pre-trained transformer-based language model for biomedical named entity recognition: initialized with [Bio_ClinicalBERT](https://arxiv.org/abs/1904.03323) & pre-trained on N2C2, BioNLP, and CRAFT community datasets.
+ **MedBERT** is a newly pre-trained transformer-based language model for biomedical named entity recognition: initialized with [Bio_ClinicalBERT](https://arxiv.org/abs/1904.03323) & pre-trained on N2C2, BioNLP, and CRAFT community datasets.

## Pretraining

@@ -28,7 +28,7 @@ The `MedBERT` model was trained on N2C2, BioNLP, and CRAFT community datasets.
The model was trained using code from [Google's BERT repository](https://github.com/google-research/bert). Model parameters were initialized with Bio_ClinicalBERT.

### Hyperparameters
- We used a batch size of 32, a maximum sequence length of 128, and a learning rate of 1·10⁻⁴ for pre-training our models. The models trained for 200,000 steps. The dup factor for duplicating input data with different masks was set to 5. All other default parameters were used (specifically, masked language model probability = 0.15
+ We used a batch size of 32, a maximum sequence length of 256, and a learning rate of 1·10⁻⁴ for pre-training our models. The models trained for 200,000 steps. The dup factor for duplicating input data with different masks was set to 5. All other default parameters were used (specifically, masked language model probability = 0.15
and max predictions per sequence = 22).

## How to use
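
For context on the hyperparameters hunk above, here is a minimal sketch of how those values map onto the pretraining scripts in [Google's BERT repository](https://github.com/google-research/bert), which the README cites. The flag names come from that repository's `create_pretraining_data.py` and `run_pretraining.py`; every file path and the Bio_ClinicalBERT checkpoint location below are hypothetical placeholders rather than values from the commit, and `--max_seq_length=256` reflects the corrected figure this commit introduces.

```bash
# Illustrative sketch only, not the authors' actual commands. Flag names are
# real flags in google-research/bert; all paths are hypothetical placeholders.

# Step 1: build masked-LM training examples. dupe_factor=5 duplicates each
# input with different masks; masked_lm_prob=0.15 and
# max_predictions_per_seq=22 match the README's stated defaults.
python create_pretraining_data.py \
  --input_file=corpus.txt \
  --output_file=pretraining_examples.tfrecord \
  --vocab_file=vocab.txt \
  --max_seq_length=256 \
  --max_predictions_per_seq=22 \
  --masked_lm_prob=0.15 \
  --dupe_factor=5

# Step 2: pre-train for 200,000 steps at learning rate 1e-4 with batch size 32,
# initializing the weights from a Bio_ClinicalBERT checkpoint.
python run_pretraining.py \
  --input_file=pretraining_examples.tfrecord \
  --output_dir=medbert_output \
  --bert_config_file=bert_config.json \
  --init_checkpoint=bio_clinicalbert/model.ckpt \
  --do_train=True \
  --train_batch_size=32 \
  --max_seq_length=256 \
  --max_predictions_per_seq=22 \
  --num_train_steps=200000 \
  --learning_rate=1e-4
```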