---
language: en
thumbnail: https://huggingface.co/front/thumbnails/google.png
license: apache-2.0
base_model:
- cross-encoder/ms-marco-MiniLM-L-4-v2
pipeline_tag: text-classification
library_name: transformers
metrics:
- f1
- precision
- recall
datasets:
- Mozilla/autofill_dataset
---

## Cross-Encoder for MS Marco with TinyBert

This is a fine-tuned version of [cross-encoder/ms-marco-MiniLM-L-4-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-4-v2). It was fine-tuned on HTML tags and labels generated using [Fathom](https://mozilla.github.io/fathom/commands/label.html).

## How to use this model in `transformers`

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill",
)

print(classifier("Card information input Card number cc-number input First name "))
```

## Model Training Info

```python
HyperParameters = {
    'learning_rate': 2.3878733582558547e-05,
    'num_train_epochs': 21,
    'weight_decay': 0.0005288040458920454,
    'per_device_train_batch_size': 32
}
```

More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill

## Model Performance

```
Test Performance:
Precision: 0.913
Recall: 0.872
F1: 0.887

              precision    recall  f1-score   support

      cc-csc      0.943     0.950     0.946       139
      cc-exp      1.000     0.883     0.938        60
cc-exp-month      0.954     0.922     0.938        90
 cc-exp-year      0.904     0.934     0.919        91
     cc-name      0.835     0.989     0.905        92
   cc-number      0.953     0.970     0.961       167
     cc-type      0.920     0.940     0.930       183
       email      0.918     0.927     0.922       205
  given-name      0.727     0.421     0.533        19
   last-name      0.833     0.588     0.690        17
       other      0.994     0.994     0.994      8000
 postal-code      0.980     0.951     0.965       102

    accuracy                          0.985      9165
   macro avg      0.913     0.872     0.887      9165
weighted avg      0.986     0.985     0.985      9165
```
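
## Fine-tuning Sketch

The snippet below is a minimal sketch of how the hyperparameters listed above could be wired into the Hugging Face `Trainer`. It is not the actual training pipeline (see https://github.com/mozilla/smart_autofill for that); the `"train"` split name and the `"text"`/`"label"` column names are assumptions about `Mozilla/autofill_dataset`.

```python
# A minimal sketch, not the actual Mozilla training pipeline
# (see https://github.com/mozilla/smart_autofill for that).
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_checkpoint = "cross-encoder/ms-marco-MiniLM-L-4-v2"
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)

# Assumes the dataset has a "train" split with "text" and "label" columns.
dataset = load_dataset("Mozilla/autofill_dataset")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

# Assumes "label" is a ClassLabel feature, so the class count can be read off it.
num_labels = tokenized["train"].features["label"].num_classes

model = AutoModelForSequenceClassification.from_pretrained(
    base_checkpoint, num_labels=num_labels
)

# Hyperparameters taken from the Model Training Info section above.
training_args = TrainingArguments(
    output_dir="tinybert-uncased-autofill",
    learning_rate=2.3878733582558547e-05,
    num_train_epochs=21,
    weight_decay=0.0005288040458920454,
    per_device_train_batch_size=32,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
```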