categorization-finetuned-20220721-164940-distilled-20220811-074207

This model is a fine-tuned version of carted-nlp/categorization-finetuned-20220721-164940. The training dataset was not recorded (the auto-generated card lists it as "None"). It achieves the following results on the evaluation set:

  • Loss: 0.1499
  • Accuracy: 0.8771
  • F1: 0.8763

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 96
  • seed: 314
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1500
  • num_epochs: 30.0
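
To make the list above easier to read, here is a minimal sketch of how the effective batch size and the cosine schedule with warmup fit together. It is an illustration in plain Python, not the training script that produced this model. Two values are assumptions: a single training device, and `TOTAL_STEPS`, which the card does not state and is set here to a placeholder slightly above the last logged step.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 64
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 1500

# Assumption: the total number of optimizer steps is not stated in the card;
# 133_000 is a placeholder slightly above the last logged step (132500).
TOTAL_STEPS = 133_000


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to base_lr, then half-cosine decay to zero at `total`."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# 64 examples per batch, accumulated over 4 batches, on one device.
print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))

# Learning rate at the start, mid-warmup, end of warmup, mid-decay, and the end.
for step in (0, 750, 1500, 67_250, 133_000):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the product 64 × 4 matches the reported total_train_batch_size of 256, and the learning rate peaks at 2e-05 once the 1500 warmup steps finish.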

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 0.5644 | 0.56 | 2500 | 0.2739 | 0.7822 | 0.7774 |
| 0.2658 | 1.12 | 5000 | 0.2288 | 0.8159 | 0.8127 |
| 0.2307 | 1.69 | 7500 | 0.2082 | 0.8298 | 0.8273 |
| 0.2126 | 2.25 | 10000 | 0.1970 | 0.8389 | 0.8370 |
| 0.2012 | 2.81 | 12500 | 0.1888 | 0.8450 | 0.8433 |
| 0.1903 | 3.37 | 15000 | 0.1829 | 0.8496 | 0.8485 |
| 0.1846 | 3.94 | 17500 | 0.1783 | 0.8529 | 0.8511 |
| 0.1771 | 4.5 | 20000 | 0.1750 | 0.8548 | 0.8537 |
| 0.1726 | 5.06 | 22500 | 0.1727 | 0.8577 | 0.8564 |
| 0.1673 | 5.62 | 25000 | 0.1683 | 0.8602 | 0.8591 |
| 0.1648 | 6.19 | 27500 | 0.1675 | 0.8608 | 0.8597 |
| 0.1596 | 6.75 | 30000 | 0.1657 | 0.8630 | 0.8620 |
| 0.1563 | 7.31 | 32500 | 0.1635 | 0.8646 | 0.8639 |
| 0.154 | 7.87 | 35000 | 0.1613 | 0.8656 | 0.8647 |
| 0.1496 | 8.43 | 37500 | 0.1611 | 0.8666 | 0.8656 |
| 0.1496 | 9.0 | 40000 | 0.1598 | 0.8676 | 0.8669 |
| 0.1445 | 9.56 | 42500 | 0.1594 | 0.8681 | 0.8671 |
| 0.1435 | 10.12 | 45000 | 0.1588 | 0.8688 | 0.8679 |
| 0.1407 | 10.68 | 47500 | 0.1568 | 0.8703 | 0.8695 |
| 0.1382 | 11.25 | 50000 | 0.1564 | 0.8708 | 0.8700 |
| 0.1372 | 11.81 | 52500 | 0.1550 | 0.8720 | 0.8713 |
| 0.1344 | 12.37 | 55000 | 0.1559 | 0.8718 | 0.8708 |
| 0.1337 | 12.93 | 57500 | 0.1540 | 0.8735 | 0.8729 |
| 0.1303 | 13.5 | 60000 | 0.1541 | 0.8729 | 0.8721 |
| 0.1304 | 14.06 | 62500 | 0.1531 | 0.8735 | 0.8727 |
| 0.1274 | 14.62 | 65000 | 0.1535 | 0.8736 | 0.8727 |
| 0.1266 | 15.18 | 67500 | 0.1527 | 0.8750 | 0.8742 |
| 0.1251 | 15.74 | 70000 | 0.1525 | 0.8755 | 0.8748 |
| 0.1234 | 16.31 | 72500 | 0.1528 | 0.8753 | 0.8745 |
| 0.1229 | 16.87 | 75000 | 0.1516 | 0.8760 | 0.8753 |
| 0.121 | 17.43 | 77500 | 0.1523 | 0.8759 | 0.8752 |
| 0.1212 | 17.99 | 80000 | 0.1515 | 0.8760 | 0.8754 |
| 0.1185 | 18.56 | 82500 | 0.1514 | 0.8765 | 0.8757 |
| 0.1186 | 19.12 | 85000 | 0.1516 | 0.8766 | 0.8760 |
| 0.1172 | 19.68 | 87500 | 0.1506 | 0.8774 | 0.8767 |
| 0.1164 | 20.24 | 90000 | 0.1513 | 0.8770 | 0.8763 |
| 0.116 | 20.81 | 92500 | 0.1507 | 0.8774 | 0.8767 |
| 0.1145 | 21.37 | 95000 | 0.1507 | 0.8777 | 0.8770 |
| 0.1143 | 21.93 | 97500 | 0.1506 | 0.8776 | 0.8770 |
| 0.1131 | 22.49 | 100000 | 0.1507 | 0.8779 | 0.8772 |
| 0.1131 | 23.05 | 102500 | 0.1505 | 0.8779 | 0.8772 |
| 0.1123 | 23.62 | 105000 | 0.1506 | 0.8781 | 0.8774 |
| 0.1117 | 24.18 | 107500 | 0.1504 | 0.8783 | 0.8776 |
| 0.1118 | 24.74 | 110000 | 0.1503 | 0.8784 | 0.8777 |
| 0.1111 | 25.3 | 112500 | 0.1503 | 0.8783 | 0.8776 |
| 0.1111 | 25.87 | 115000 | 0.1502 | 0.8784 | 0.8777 |
| 0.1105 | 26.43 | 117500 | 0.1504 | 0.8783 | 0.8776 |
| 0.1105 | 26.99 | 120000 | 0.1502 | 0.8786 | 0.8779 |
| 0.1104 | 27.55 | 122500 | 0.1503 | 0.8786 | 0.8779 |
| 0.1096 | 28.12 | 125000 | 0.1502 | 0.8785 | 0.8779 |
| 0.1101 | 28.68 | 127500 | 0.1501 | 0.8786 | 0.8779 |
| 0.1101 | 29.24 | 130000 | 0.1502 | 0.8786 | 0.8779 |
| 0.1094 | 29.8 | 132500 | 0.1501 | 0.8786 | 0.8779 |
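
The Epoch and Step columns together hint at the size of the training set, which the card does not report. The sketch below is a rough back-of-the-envelope estimate, not a documented figure. It assumes that each logged step is one optimizer update over 256 examples (the total_train_batch_size above) and that the epoch values are rounded to two decimals, so the result is approximate.

```python
# (epoch, step) pairs copied from three rows of the results log.
LOGGED = [(9.0, 40000), (17.99, 80000), (29.8, 132500)]

# Assumption: one logged step consumes total_train_batch_size = 256 examples.
EXAMPLES_PER_STEP = 256


def steps_per_epoch(epoch, step):
    """Optimizer steps needed for one pass over the training data."""
    return step / epoch


def estimate(logged, examples_per_step=EXAMPLES_PER_STEP):
    """Average the per-row rates and convert them to an example count."""
    rates = [steps_per_epoch(epoch, step) for epoch, step in logged]
    mean_rate = sum(rates) / len(rates)
    return mean_rate, mean_rate * examples_per_step


rate, n_examples = estimate(LOGGED)
print(f"~{rate:.0f} steps per epoch, ~{n_examples / 1e6:.2f}M training examples")
```

If the single-device assumption is wrong, the example count scales with the number of devices, so treat the figure as an order-of-magnitude guide only.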

Framework versions

  • Transformers 4.17.0
  • PyTorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.11.6