---
language: en
thumbnail: https://huggingface.co/front/thumbnails/google.png
license: apache-2.0
base_model:
- cross-encoder/ms-marco-TinyBERT-L-2-v2
pipeline_tag: text-classification
library_name: transformers
metrics:
- f1
- precision
- recall
datasets:
- Mozilla/autofill_dataset
---

## Cross-Encoder for MS MARCO with TinyBERT

This model is a fine-tuned version of [cross-encoder/ms-marco-TinyBERT-L-2-v2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2-v2).

It was fine-tuned on HTML tags and labels generated using [Fathom](https://mozilla.github.io/fathom/commands/label.html), so that it classifies HTML form fields into autofill field types such as `cc-number` or `email`.

## How to use this model in `transformers`

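```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill",
)

# Classify a single HTML input element; the result is a list of
# {"label": ..., "score": ...} dicts.
print(
    classifier('<input class="cc-number" placeholder="Enter credit card number..." />')
)
```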
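
When several form fields need to be classified at once, the pipeline can be called with a list of strings, and it returns one `{"label": ..., "score": ...}` dict per input. The sketch below wraps that call in a small helper. The helper name `label_fields`, the `min_score` threshold of 0.5, and the choice to fall back to `other` for low-confidence predictions are illustrative assumptions rather than part of the model, and the sketch assumes the label strings match the class names in the performance table.

```python
from typing import Callable, Dict, List, Tuple


def label_fields(
    fields: List[str],
    classifier: Callable[[List[str]], List[Dict]],
    min_score: float = 0.5,   # hypothetical threshold, tune for your use case
    fallback: str = "other",  # assumed label for low-confidence predictions
) -> List[Tuple[str, str]]:
    """Pair each HTML field with its predicted autofill label.

    `classifier` is any callable that takes a list of strings and returns
    one {"label": str, "score": float} dict per input, which is the shape a
    text-classification pipeline produces for a list of inputs.
    """
    predictions = classifier(fields)
    labelled = []
    for field, prediction in zip(fields, predictions):
        # Keep the predicted label only when the model is confident enough.
        if prediction["score"] >= min_score:
            labelled.append((field, prediction["label"]))
        else:
            labelled.append((field, fallback))
    return labelled
```

Passing the `classifier` object created in the snippet above as the second argument runs the real model; any stand-in callable with the same output shape works for testing.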

## Model Training Info

```python
# Hyperparameters used for fine-tuning.
hyperparameters = {
    'learning_rate': 0.000082,
    'num_train_epochs': 71,
    'weight_decay': 0.1,
    'per_device_train_batch_size': 32,
}
```

More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill

## Model Performance
```
Test Performance (macro average):
Precision: 0.913
Recall: 0.872
F1: 0.887

             precision    recall  f1-score   support

      cc-csc      0.943     0.950     0.946       139
      cc-exp      1.000     0.883     0.938        60
cc-exp-month      0.954     0.922     0.938        90
 cc-exp-year      0.904     0.934     0.919        91
     cc-name      0.835     0.989     0.905        92
   cc-number      0.953     0.970     0.961       167
     cc-type      0.920     0.940     0.930       183
       email      0.918     0.927     0.922       205
  given-name      0.727     0.421     0.533        19
   last-name      0.833     0.588     0.690        17
       other      0.994     0.994     0.994      8000
 postal-code      0.980     0.951     0.965       102

    accuracy                          0.985      9165
   macro avg      0.913     0.872     0.887      9165
weighted avg      0.986     0.985     0.985      9165
```
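
The headline precision, recall, and F1 figures match the `macro avg` row, which is the unweighted mean of the per-class scores. As a quick sanity check, the sketch below recomputes those averages from the rounded values in the report. The names `report`, `macro_average`, and `f1_score` are illustrative, and because the inputs are already rounded to three decimals, the results agree with the table only to within about 0.001.

```python
# Per-class scores copied from the classification report above:
# label -> (precision, recall, f1, support)
report = {
    "cc-csc":       (0.943, 0.950, 0.946, 139),
    "cc-exp":       (1.000, 0.883, 0.938, 60),
    "cc-exp-month": (0.954, 0.922, 0.938, 90),
    "cc-exp-year":  (0.904, 0.934, 0.919, 91),
    "cc-name":      (0.835, 0.989, 0.905, 92),
    "cc-number":    (0.953, 0.970, 0.961, 167),
    "cc-type":      (0.920, 0.940, 0.930, 183),
    "email":        (0.918, 0.927, 0.922, 205),
    "given-name":   (0.727, 0.421, 0.533, 19),
    "last-name":    (0.833, 0.588, 0.690, 17),
    "other":        (0.994, 0.994, 0.994, 8000),
    "postal-code":  (0.980, 0.951, 0.965, 102),
}


def macro_average(rows, index):
    """Unweighted mean of one metric column across all classes."""
    values = [row[index] for row in rows.values()]
    return sum(values) / len(values)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


macro_precision = macro_average(report, 0)
macro_recall = macro_average(report, 1)
macro_f1 = macro_average(report, 2)
total_support = sum(row[3] for row in report.values())

print(f"macro precision: {macro_precision:.4f}")
print(f"macro recall:    {macro_recall:.4f}")
print(f"macro F1:        {macro_f1:.4f}")
print(f"total support:   {total_support}")

# Per-class F1 can be recovered from precision and recall, e.g. given-name.
print(f"given-name F1:   {f1_score(0.727, 0.421):.4f}")
```

Because every class counts equally in a macro average, the small `given-name` (19 examples) and `last-name` (17 examples) classes pull the macro scores well below the 0.985 accuracy, which is dominated by the 8000 `other` examples.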