Cross-Encoder for MS Marco with TinyBert
This model is a fine-tuned version of the cross-encoder/ms-marco-MiniLM-L-4-v2 checkpoint.
It was fine-tuned on HTML tags and labels generated using Fathom.
How to use this model in transformers
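from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)

# Classify a string built from the HTML tags and labels of form fields.
print(
    classifier('Card information input Card number cc-number <SEP> <SEP> input First name <SEP> <SEP>')
)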
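A text-classification pipeline returns a list of dictionaries with `label` and `score` keys. The sketch below shows one way a caller might fall back to the `other` class when the top score is low. The `pick_label` helper, the `MIN_SCORE` cutoff, and the sample scores are assumptions made for illustration; none of them come from the model or its repository.

```python
from typing import Dict, List

# Hypothetical cutoff for illustration; it is not a value from the model card.
MIN_SCORE = 0.5


def pick_label(predictions: List[Dict[str, object]], min_score: float = MIN_SCORE) -> str:
    """Return the top-scoring label, or "other" when its score is below min_score."""
    if not predictions:
        return "other"
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else "other"


# Stand-in for a pipeline result (made-up score), so the sketch runs
# without downloading the model.
example_output = [{"label": "cc-number", "score": 0.97}]
print(pick_label(example_output))
```

In practice, `example_output` would be replaced by the value returned from `classifier(...)`, and the cutoff would need tuning against real form data.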
Model Training Info
HyperParameters = {
    'learning_rate': 2.3878733582558547e-05,
    'num_train_epochs': 21,
    'weight_decay': 0.0005288040458920454,
    'per_device_train_batch_size': 32
}
More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill
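The hyperparameter names match fields of `TrainingArguments` in the transformers library, so a fine-tuning run could plausibly be configured as sketched below. This is an assumption rather than the authors' confirmed setup; the linked repository is the authoritative source. The `build_training_arguments` helper and the `autofill-finetune` output directory are hypothetical names.

```python
# Values copied from the HyperParameters listing above.
HYPERPARAMETERS = {
    "learning_rate": 2.3878733582558547e-05,
    "num_train_epochs": 21,
    "weight_decay": 0.0005288040458920454,
    "per_device_train_batch_size": 32,
}


def build_training_arguments(output_dir: str = "autofill-finetune"):
    """Build TrainingArguments from the listed values.

    output_dir is a hypothetical path chosen for this sketch.
    """
    # Imported lazily so the dictionary can be inspected without
    # transformers (and its training dependencies) installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

The returned object would then be passed to a `Trainer` together with the model and datasets, which this sketch leaves out.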
Model Performance
Test Performance (macro average over the 12 classes):
Precision: 0.913
Recall: 0.872
F1: 0.887
             precision    recall  f1-score   support

      cc-csc     0.943     0.950     0.946       139
      cc-exp     1.000     0.883     0.938        60
cc-exp-month     0.954     0.922     0.938        90
 cc-exp-year     0.904     0.934     0.919        91
     cc-name     0.835     0.989     0.905        92
   cc-number     0.953     0.970     0.961       167
     cc-type     0.920     0.940     0.930       183
       email     0.918     0.927     0.922       205
  given-name     0.727     0.421     0.533        19
   last-name     0.833     0.588     0.690        17
       other     0.994     0.994     0.994      8000
 postal-code     0.980     0.951     0.965       102

    accuracy                         0.985      9165
   macro avg     0.913     0.872     0.887      9165
weighted avg     0.986     0.985     0.985      9165
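The headline precision, recall, and F1 figures equal the macro avg row, which is the unweighted mean of the per-class scores. As a quick check, the sketch below recomputes that row from the rounded per-class values in the report; the `macro_average` helper is a name introduced here for illustration.

```python
# Per-class (precision, recall, f1) copied from the report above.
PER_CLASS = {
    "cc-csc": (0.943, 0.950, 0.946),
    "cc-exp": (1.000, 0.883, 0.938),
    "cc-exp-month": (0.954, 0.922, 0.938),
    "cc-exp-year": (0.904, 0.934, 0.919),
    "cc-name": (0.835, 0.989, 0.905),
    "cc-number": (0.953, 0.970, 0.961),
    "cc-type": (0.920, 0.940, 0.930),
    "email": (0.918, 0.927, 0.922),
    "given-name": (0.727, 0.421, 0.533),
    "last-name": (0.833, 0.588, 0.690),
    "other": (0.994, 0.994, 0.994),
    "postal-code": (0.980, 0.951, 0.965),
}


def macro_average(rows):
    """Return the unweighted mean of each metric across classes, rounded to 3 places."""
    n = len(rows)
    sums = [sum(row[i] for row in rows.values()) for i in range(3)]
    return tuple(round(s / n, 3) for s in sums)


precision, recall, f1 = macro_average(PER_CLASS)
print(precision, recall, f1)
```

Because every class counts equally in a macro average, the low scores for given-name (19 examples) and last-name (17 examples) carry the same weight as other (8000 examples). That is why the macro F1 of 0.887 sits well below the weighted F1 of 0.985.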