categorization-finetuned-20220721-164940-distilled-20220811-132317

This model is a fine-tuned version of carted-nlp/categorization-finetuned-20220721-164940. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.1522
  • Accuracy: 0.8783
  • F1: 0.8779
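
The card does not say how F1 is averaged across categories. The sketch below shows how accuracy and a support-weighted F1 could be computed from predictions, assuming a single-label classification setup. Both the weighted averaging and the label names are assumptions for illustration, not details confirmed by this card.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Hypothetical labels, for illustration only (not taken from the evaluation set).
y_true = ["shoes", "shoes", "bags", "bags", "hats"]
y_pred = ["shoes", "bags", "bags", "bags", "hats"]

print(f"accuracy={accuracy(y_true, y_pred):.4f}")
print(f"weighted_f1={weighted_f1(y_true, y_pred):.4f}")
```

With weighted averaging, F1 tends to sit close to accuracy when errors are spread evenly across classes, which would be consistent with the two values reported above, though it does not prove that this averaging was used.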

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 64
  • eval_batch_size: 128
  • seed: 314
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 2000
  • num_epochs: 30.0
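
As a rough illustration of how these settings interact, the sketch below reproduces the total batch size (64 × 4 = 256) and traces the learning rate over training. It assumes a single training device, linear warmup followed by a half-cosine decay to zero, and a total step count extrapolated from the training log (step 132500 at epoch 29.8). None of these three assumptions is stated in the card.

```python
import math

LEARNING_RATE = 4e-05
WARMUP_STEPS = 2000
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 4
NUM_EPOCHS = 30

NUM_DEVICES = 1  # assumption: the card does not state the device count
# Assumption: total optimizer steps, extrapolated from the log (step 132500 at epoch 29.8).
TOTAL_STEPS = round(132500 / 29.8 * NUM_EPOCHS)


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Number of examples contributing to each optimizer update."""
    return per_device * accum * devices


def lr_at(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print("effective batch size:", effective_batch_size())
for step in (0, 1000, 2000, 40000, 75000, 132500):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate is already very small in the last few epochs, which fits the nearly flat metrics at the end of the training log.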

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy | F1
0.5212 | 0.56 | 2500 | 0.2564 | 0.7953 | 0.7921
0.243 | 1.12 | 5000 | 0.2110 | 0.8270 | 0.8249
0.2105 | 1.69 | 7500 | 0.1925 | 0.8409 | 0.8391
0.1939 | 2.25 | 10000 | 0.1837 | 0.8476 | 0.8465
0.1838 | 2.81 | 12500 | 0.1771 | 0.8528 | 0.8517
0.1729 | 3.37 | 15000 | 0.1722 | 0.8564 | 0.8555
0.1687 | 3.94 | 17500 | 0.1684 | 0.8593 | 0.8576
0.1602 | 4.5 | 20000 | 0.1653 | 0.8614 | 0.8604
0.1572 | 5.06 | 22500 | 0.1629 | 0.8648 | 0.8638
0.1507 | 5.62 | 25000 | 0.1605 | 0.8654 | 0.8646
0.1483 | 6.19 | 27500 | 0.1602 | 0.8661 | 0.8653
0.1431 | 6.75 | 30000 | 0.1597 | 0.8669 | 0.8663
0.1393 | 7.31 | 32500 | 0.1581 | 0.8691 | 0.8687
0.1374 | 7.87 | 35000 | 0.1556 | 0.8704 | 0.8697
0.1321 | 8.43 | 37500 | 0.1558 | 0.8707 | 0.8700
0.1328 | 9.0 | 40000 | 0.1536 | 0.8719 | 0.8711
0.1261 | 9.56 | 42500 | 0.1544 | 0.8716 | 0.8708
0.1256 | 10.12 | 45000 | 0.1541 | 0.8731 | 0.8725
0.122 | 10.68 | 47500 | 0.1520 | 0.8741 | 0.8734
0.1196 | 11.25 | 50000 | 0.1529 | 0.8734 | 0.8728
0.1182 | 11.81 | 52500 | 0.1510 | 0.8758 | 0.8751
0.1145 | 12.37 | 55000 | 0.1526 | 0.8746 | 0.8737
0.1141 | 12.93 | 57500 | 0.1512 | 0.8765 | 0.8759
0.1094 | 13.5 | 60000 | 0.1517 | 0.8760 | 0.8753
0.1098 | 14.06 | 62500 | 0.1513 | 0.8771 | 0.8764
0.1058 | 14.62 | 65000 | 0.1506 | 0.8775 | 0.8768
0.1048 | 15.18 | 67500 | 0.1521 | 0.8774 | 0.8768
0.1028 | 15.74 | 70000 | 0.1520 | 0.8778 | 0.8773
0.1006 | 16.31 | 72500 | 0.1517 | 0.8780 | 0.8774
0.1001 | 16.87 | 75000 | 0.1505 | 0.8794 | 0.8790
0.0971 | 17.43 | 77500 | 0.1520 | 0.8784 | 0.8778
0.0973 | 17.99 | 80000 | 0.1514 | 0.8796 | 0.8790
0.0938 | 18.56 | 82500 | 0.1516 | 0.8795 | 0.8789
0.0942 | 19.12 | 85000 | 0.1522 | 0.8794 | 0.8789
0.0918 | 19.68 | 87500 | 0.1518 | 0.8799 | 0.8793
0.0909 | 20.24 | 90000 | 0.1528 | 0.8803 | 0.8796
0.0901 | 20.81 | 92500 | 0.1516 | 0.8799 | 0.8793
0.0882 | 21.37 | 95000 | 0.1519 | 0.8800 | 0.8794
0.088 | 21.93 | 97500 | 0.1517 | 0.8802 | 0.8798
0.086 | 22.49 | 100000 | 0.1530 | 0.8800 | 0.8795
0.0861 | 23.05 | 102500 | 0.1523 | 0.8806 | 0.8801
0.0846 | 23.62 | 105000 | 0.1524 | 0.8808 | 0.8802
0.0843 | 24.18 | 107500 | 0.1522 | 0.8805 | 0.8800
0.0836 | 24.74 | 110000 | 0.1525 | 0.8808 | 0.8803
0.083 | 25.3 | 112500 | 0.1528 | 0.8810 | 0.8803
0.0829 | 25.87 | 115000 | 0.1528 | 0.8808 | 0.8802
0.082 | 26.43 | 117500 | 0.1529 | 0.8808 | 0.8802
0.0818 | 26.99 | 120000 | 0.1525 | 0.8811 | 0.8805
0.0816 | 27.55 | 122500 | 0.1526 | 0.8811 | 0.8806
0.0809 | 28.12 | 125000 | 0.1528 | 0.8810 | 0.8805
0.0809 | 28.68 | 127500 | 0.1527 | 0.8810 | 0.8804
0.0814 | 29.24 | 130000 | 0.1528 | 0.8808 | 0.8802
0.0807 | 29.8 | 132500 | 0.1528 | 0.8808 | 0.8802
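
The log can be read programmatically to pick a checkpoint or to gauge overfitting. The sketch below uses five rows copied from the table above (including the rows with the lowest validation loss and the highest F1 in the full table). The `LogRow` type and the helper names are hypothetical, introduced only for this example.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# Five rows copied verbatim from the training results table.
ROWS = [
    LogRow(0.5212, 0.56, 2500, 0.2564, 0.7953, 0.7921),
    LogRow(0.1328, 9.0, 40000, 0.1536, 0.8719, 0.8711),
    LogRow(0.1001, 16.87, 75000, 0.1505, 0.8794, 0.8790),
    LogRow(0.0816, 27.55, 122500, 0.1526, 0.8811, 0.8806),
    LogRow(0.0807, 29.8, 132500, 0.1528, 0.8808, 0.8802),
]


def best_by_val_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_f1(rows):
    """Row with the highest F1 score."""
    return max(rows, key=lambda r: r.f1)


def generalization_gap(row):
    """Validation loss minus training loss; a growing gap suggests overfitting."""
    return row.val_loss - row.train_loss


print("lowest validation loss at step", best_by_val_loss(ROWS).step)
print("highest F1 at step", best_by_f1(ROWS).step)
for row in ROWS:
    print(row.step, f"gap={generalization_gap(row):+.4f}")
```

In these rows the validation loss bottoms out around step 75000 while the training loss keeps falling, yet accuracy and F1 continue to improve slightly. Which checkpoint counts as best therefore depends on the selection metric.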

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.11.6