Update README.md
README.md CHANGED
@@ -14,10 +14,9 @@ datasets:
 - vazish/autofill_dataset
 ---
 
-##
+## BERT Miniatures
 
-
-between self-attentions and feed-forward networks.
+This is the tiny version of the 24 BERT models referenced in Well-Read Students Learn Better: On the Importance of Pre-training Compact Models (English only, uncased, trained with WordPiece masking).
 
 This checkpoint is the original TinyBert Optimized Uncased English:
 [TinyBert](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2)
@@ -83,4 +82,13 @@ CC Expiration Month 0.972 0.972 0.972 36
 accuracy 0.967 1846
 macro avg 0.923 0.907 0.910 1846
 weighted avg 0.968 0.967 0.967 1846
+```
+
+```
+@article{turc2019,
+  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
+  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
+  journal={arXiv preprint arXiv:1908.08962v2},
+  year={2019}
+}
 ```
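
The accuracy, macro avg, and weighted avg rows in the hunk above have the shape of scikit-learn's `classification_report` output. A minimal sketch of producing such a table, with toy labels standing in for the card's actual evaluation data:

```python
# Hypothetical sketch: reproduce the layout of the card's metrics table.
# The labels/predictions below are toy stand-ins, not the real eval set.
from sklearn.metrics import classification_report

y_true = ["CC Expiration Month", "Other", "Other", "CC Expiration Month"]
y_pred = ["CC Expiration Month", "Other", "CC Expiration Month", "CC Expiration Month"]

# Prints per-class precision/recall/f1/support plus the accuracy,
# macro avg, and weighted avg rows, matching the card's table layout.
print(classification_report(y_true, y_pred))
```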
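
For reference, the base checkpoint linked in the card can be loaded with the `transformers` library. A minimal sketch, assuming only the model id shown in the diff (the fine-tuned autofill classifier lives in its own repo, whose id the diff does not show):

```python
# Load the TinyBERT-style base checkpoint referenced in the card.
# The model id comes from the diff; everything else is standard transformers usage.
from transformers import AutoTokenizer, AutoModel

model_id = "google/bert_uncased_L-2_H-128_A-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Card expiration month", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 128]); H=128 for this config
```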