---
language: en
thumbnail: https://huggingface.co/front/thumbnails/google.png
license: apache-2.0
base_model:
- google/bert_uncased_L-2_H-128_A-2
pipeline_tag: text-classification
library_name: transformers
metrics:
- f1
- precision
- recall
datasets:
- Mozilla/autofill_dataset
---
## BERT Miniatures
This model is based on the tiny variant (2 layers, hidden size 128, 2 attention heads) of the 24 BERT models referenced in [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962) (English only, uncased, trained with WordPiece masking).
The base checkpoint is the original uncased English [TinyBert](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2) checkpoint.
This model was fine-tuned on HTML tags and labels using [Fathom](https://mozilla.github.io/fathom/commands/label.html).
## How to use TinyBert in `transformers`
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)

print(
    classifier('<input class="cc-number" placeholder="Enter credit card number..." />')
)
```
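When classifying many form fields at once, it can help to keep only predictions above a confidence cutoff. The sketch below assumes the pipeline returns one `{"label": ..., "score": ...}` dict per input string, which is the default for `text-classification` pipelines given a list of strings. The helper names `confident_predictions` and `classify_fields` and the `0.9` cutoff are illustrative choices, not part of this model's API.

```python
def confident_predictions(fields, predictions, min_score=0.9):
    """Pair each HTML field with its prediction and drop low-confidence ones."""
    return [
        (field, pred["label"], pred["score"])
        for field, pred in zip(fields, predictions)
        if pred["score"] >= min_score
    ]


def classify_fields(fields, min_score=0.9):
    """Classify a list of HTML snippets with the fine-tuned model."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model="Mozilla/tinybert-uncased-autofill",
    )
    # Passing a list yields one {"label", "score"} dict per input.
    return confident_predictions(fields, classifier(fields), min_score)
```

Calling `classify_fields(['<input type="email" name="email" />'])` downloads the model on first use and returns a list of `(field, label, score)` tuples for the fields that pass the cutoff.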
## Model Training Info
```python
# Hyperparameters used for fine-tuning
hyperparameters = {
    'learning_rate': 0.000082,
    'num_train_epochs': 59,
    'weight_decay': 0.1,
    'per_device_train_batch_size': 32,
}
```
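These keys match argument names of `transformers.TrainingArguments`. As a rough sketch of how the reported values could be passed to it (the `output_dir` path and the `build_training_arguments` helper are hypothetical, and this is not necessarily how the original training script is organized):

```python
# Hyperparameters as reported in this model card.
HYPERPARAMETERS = {
    "learning_rate": 8.2e-5,
    "num_train_epochs": 59,
    "weight_decay": 0.1,
    "per_device_train_batch_size": 32,
}


def build_training_arguments(output_dir="./tinybert-autofill"):
    """Build TrainingArguments from the reported values.

    output_dir is a placeholder path chosen for this example.
    """
    # Imported lazily; TrainingArguments requires transformers and torch.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

The resulting object would then be handed to a `Trainer` together with the model and tokenized dataset.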
More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill
## Model Performance
```
Test Performance:
Precision: 0.96778
Recall: 0.96696
F1: 0.9668

                     precision    recall  f1-score   support

CC Expiration            1.000     0.750     0.857        16
CC Expiration Month      0.972     0.972     0.972        36
CC Expiration Year       0.946     0.946     0.946        37
CC Name                  0.882     0.968     0.923        31
CC Number                0.942     0.980     0.961        50
CC Payment Type          0.918     0.893     0.905        75
CC Security Code         0.950     0.927     0.938        41
CC Type                  0.917     0.786     0.846        14
Confirm Password         0.961     0.860     0.907        57
Email                    0.909     0.959     0.933        73
First Name               0.800     0.800     0.800         5
Form                     0.974     0.974     0.974        39
Last Name                0.714     1.000     0.833         5
New Password             0.913     0.979     0.945        97
Other                    0.986     0.983     0.985      1235
Phone                    1.000     0.667     0.800         3
Zip Code                 0.912     0.969     0.939        32

accuracy                                     0.967      1846
macro avg                0.923     0.907     0.910      1846
weighted avg             0.968     0.967     0.967      1846
```
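The gap between the macro and weighted averages comes from class imbalance. As a minimal sketch, the snippet below recomputes both F1 averages from the rounded per-class rows of the report above; the names `REPORT`, `macro_avg`, and `weighted_avg` are introduced here for illustration only.

```python
# (precision, recall, f1, support) per class, copied from the report above.
REPORT = {
    "CC Expiration": (1.000, 0.750, 0.857, 16),
    "CC Expiration Month": (0.972, 0.972, 0.972, 36),
    "CC Expiration Year": (0.946, 0.946, 0.946, 37),
    "CC Name": (0.882, 0.968, 0.923, 31),
    "CC Number": (0.942, 0.980, 0.961, 50),
    "CC Payment Type": (0.918, 0.893, 0.905, 75),
    "CC Security Code": (0.950, 0.927, 0.938, 41),
    "CC Type": (0.917, 0.786, 0.846, 14),
    "Confirm Password": (0.961, 0.860, 0.907, 57),
    "Email": (0.909, 0.959, 0.933, 73),
    "First Name": (0.800, 0.800, 0.800, 5),
    "Form": (0.974, 0.974, 0.974, 39),
    "Last Name": (0.714, 1.000, 0.833, 5),
    "New Password": (0.913, 0.979, 0.945, 97),
    "Other": (0.986, 0.983, 0.985, 1235),
    "Phone": (1.000, 0.667, 0.800, 3),
    "Zip Code": (0.912, 0.969, 0.939, 32),
}

F1, SUPPORT = 2, 3  # column indices into each row


def macro_avg(column):
    """Unweighted mean of one metric column: every class counts equally."""
    values = [row[column] for row in REPORT.values()]
    return sum(values) / len(values)


def weighted_avg(column):
    """Mean of one metric column weighted by each class's support."""
    total = sum(row[SUPPORT] for row in REPORT.values())
    return sum(row[column] * row[SUPPORT] for row in REPORT.values()) / total


print(f"macro F1:    {macro_avg(F1):.3f}")     # 0.910
print(f"weighted F1: {weighted_avg(F1):.3f}")  # 0.967
```

The weighted average is dominated by the `Other` class (1235 of 1846 test examples), while the macro average gives equal weight to small classes such as `Phone` (3 examples) and `First Name` (5 examples), which score lower.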
```
@article{turc2019,
title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
journal={arXiv preprint arXiv:1908.08962v2},
year={2019}
}
```