mmukh commited on
Commit
a1c09bd
·
1 Parent(s): 453e405

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -2,7 +2,7 @@
2
 
3
  ## Model Description
4
 
5
- SOBertBase is a 109M parameter BERT models trained on 27 billion tokens of SO data StackOverflow answer and comment text using the Megatron Toolkit.
6
 
7
  SOBert is pre-trained with 19 GB data presented as 15 million samples where each sample contains an entire post and all its corresponding comments. We also include
8
  all code in each answer so that our model is bimodal in nature. We use a SentencePiece tokenizer trained with BytePair Encoding, which has the benefit over WordPiece of never labeling tokens as “unknown".
 
2
 
3
  ## Model Description
4
 
5
+ SOBertBase is a 109M parameter BERT model trained on 27 billion tokens of SO data StackOverflow answer and comment text using the Megatron Toolkit.
6
 
7
  SOBert is pre-trained with 19 GB data presented as 15 million samples where each sample contains an entire post and all its corresponding comments. We also include
8
  all code in each answer so that our model is bimodal in nature. We use a SentencePiece tokenizer trained with BytePair Encoding, which has the benefit over WordPiece of never labeling tokens as “unknown".