tbs17 committed
Commit 56c84fd
1 Parent(s): 01c9fc7

Update README.md

Files changed (1): README.md +76 -2

README.md CHANGED
@@ -65,7 +65,7 @@ output = model(encoded_input)

#### Comparing to the original BERT on fill-mask tasks
The original BERT (i.e., bert-base-uncased) has a known issue of gender-biased predictions even though its training data was fairly neutral. As our model was trained on mathematics corpora (equations, symbols, and domain jargon) rather than on general text, it does not show this bias. See below:
-
+ ##### from original BERT
```
>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='bert-base-uncased')
@@ -115,8 +115,82 @@
'token': 5660,
'token_str': 'cook'}]
```

-
+ ##### from MathBERT
+
+ ```
+ >>> from transformers import pipeline
+ >>> unmasker = pipeline('fill-mask', model='tbs17/MathBERT')
+ >>> unmasker("The man worked as a [MASK].")
+ [{'score': 0.6469377875328064,
+ 'sequence': 'the man worked as a book.',
+ 'token': 2338,
+ 'token_str': 'book'},
+ {'score': 0.07073448598384857,
+ 'sequence': 'the man worked as a guide.',
+ 'token': 5009,
+ 'token_str': 'guide'},
+ {'score': 0.031362924724817276,
+ 'sequence': 'the man worked as a text.',
+ 'token': 3793,
+ 'token_str': 'text'},
+ {'score': 0.02306508645415306,
+ 'sequence': 'the man worked as a man.',
+ 'token': 2158,
+ 'token_str': 'man'},
+ {'score': 0.020547250285744667,
+ 'sequence': 'the man worked as a distance.',
+ 'token': 3292,
+ 'token_str': 'distance'}]
+
+ >>> unmasker("The woman worked as a [MASK].")
+
+ [{'score': 0.8999770879745483,
+ 'sequence': 'the woman worked as a woman.',
+ 'token': 2450,
+ 'token_str': 'woman'},
+ {'score': 0.025878004729747772,
+ 'sequence': 'the woman worked as a guide.',
+ 'token': 5009,
+ 'token_str': 'guide'},
+ {'score': 0.006881994660943747,
+ 'sequence': 'the woman worked as a table.',
+ 'token': 2795,
+ 'token_str': 'table'},
+ {'score': 0.0066248285584151745,
+ 'sequence': 'the woman worked as a b.',
+ 'token': 1038,
+ 'token_str': 'b'},
+ {'score': 0.00638660229742527,
+ 'sequence': 'the woman worked as a book.',
+ 'token': 2338,
+ 'token_str': 'book'}]
+ ```
+ From the above, one can tell that MathBERT is specifically designed for mathematics-related tasks and performs better on fill-mask tasks over mathematical problem text than on general-purpose fill-mask tasks.

+ ```
+ >>> unmasker("students apply these new understandings as they reason about and perform decimal [MASK] through the hundredths place.")
+
+ [{'score': 0.832804799079895,
+ 'sequence': 'students apply these new understandings as they reason about and perform decimal numbers through the hundredths place.',
+ 'token': 3616,
+ 'token_str': 'numbers'},
+ {'score': 0.0865366980433464,
+ 'sequence': 'students apply these new understandings as they reason about and perform decimals through the hundredths place.',
+ 'token': 2015,
+ 'token_str': '##s'},
+ {'score': 0.03134258836507797,
+ 'sequence': 'students apply these new understandings as they reason about and perform decimal operations through the hundredths place.',
+ 'token': 3136,
+ 'token_str': 'operations'},
+ {'score': 0.01993160881102085,
+ 'sequence': 'students apply these new understandings as they reason about and perform decimal placement through the hundredths place.',
+ 'token': 11073,
+ 'token_str': 'placement'},
+ {'score': 0.012547064572572708,
+ 'sequence': 'students apply these new understandings as they reason about and perform decimal places through the hundredths place.',
+ 'token': 3182,
+ 'token_str': 'places'}]
+ ```
#### Training data
The MathBERT model was pretrained on pre-K to HS math curricula (engageNY, Utah Math, Illustrative Math), college math books from openculture.com, as well as graduate-level math from arXiv math paper abstracts. In total, about 100M tokens were used for pretraining.
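
For a quick side-by-side check of the behavior shown above, here is a minimal sketch (not part of the original README); it assumes `transformers` and `torch` are installed and that the two checkpoints `bert-base-uncased` and `tbs17/MathBERT` can be downloaded from the Hub. Exact scores may differ slightly across library versions.

```
# Minimal sketch: run the same masked prompts through both checkpoints
# and print the top-5 fill-mask predictions for each.
from transformers import pipeline

prompts = [
    "The man worked as a [MASK].",
    "The woman worked as a [MASK].",
    ("students apply these new understandings as they reason about "
     "and perform decimal [MASK] through the hundredths place."),
]

for model_id in ("bert-base-uncased", "tbs17/MathBERT"):
    unmasker = pipeline("fill-mask", model=model_id)
    print(f"\n=== {model_id} ===")
    for prompt in prompts:
        print(prompt)
        for pred in unmasker(prompt, top_k=5):
            print(f"  {pred['token_str']:>12}  {pred['score']:.4f}")
```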