# categorization-finetuned-20220721-164940-distilled-20220810-185342
This model is a fine-tuned version of [carted-nlp/categorization-finetuned-20220721-164940](https://huggingface.co/carted-nlp/categorization-finetuned-20220721-164940) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0639
- Accuracy: 0.87
- F1: 0.8690
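As a toy illustration of the reported metrics, the sketch below computes accuracy and F1 in pure Python. The card does not state which F1 averaging was used; macro-averaging over classes is shown here as one plausible choice, and the labels are invented examples.

```python
# Hypothetical labels for illustration only; not from the model's dataset.
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (one plausible averaging)."""
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

y_true = ["shoes", "shoes", "toys", "books"]
y_pred = ["shoes", "toys", "toys", "books"]
print(accuracy(y_true, y_pred))  # 0.75
```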
## Model description

More information needed

## Intended uses & limitations

More information needed
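The card does not document intended uses, but as a hedged sketch the checkpoint can presumably be loaded through the Transformers `pipeline` API. The `text-classification` task type is an assumption (the card never states the head type), and downloading the checkpoint requires network access, so the call is shown only in a comment.

```python
# Assumption: the model carries a sequence-classification head.
MODEL_ID = "carted-nlp/categorization-finetuned-20220721-164940-distilled-20220810-185342"

def classify(texts):
    """Return predicted category labels for a list of texts (hypothetical helper)."""
    from transformers import pipeline  # deferred import; requires `transformers`
    clf = pipeline("text-classification", model=MODEL_ID)
    return clf(texts)

# Example (downloads the checkpoint on first use):
# classify(["stainless steel water bottle 750ml"])
```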
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 314
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1500
- num_epochs: 30.0
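The hyperparameters above combine as sketched below: the effective batch size is the per-step batch size times the accumulation steps, and `lr_scheduler_type: linear` warms up to the peak learning rate over 1,500 steps and then decays linearly to zero. The total step count of roughly 133,000 is an estimate read off the results table, not a value stated in the card, and single-device training is assumed.

```python
# Effective batch size, assuming a single device (not stated in the card).
train_batch_size = 64
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 256

def linear_schedule_lr(step, peak_lr=1e-5, warmup_steps=1500, num_training_steps=133_000):
    """Linear warmup to peak_lr, then linear decay to 0, mirroring `lr_scheduler_type: linear`."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (num_training_steps - step) / (num_training_steps - warmup_steps))

print(linear_schedule_lr(750))   # halfway through warmup: 5e-06
print(linear_schedule_lr(1500))  # peak: 1e-05
```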
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
0.269 | 0.56 | 2500 | 0.1280 | 0.7547 | 0.7461 |
0.125 | 1.12 | 5000 | 0.1052 | 0.7960 | 0.7916 |
0.1079 | 1.69 | 7500 | 0.0950 | 0.8132 | 0.8102 |
0.0992 | 2.25 | 10000 | 0.0898 | 0.8216 | 0.8188 |
0.0938 | 2.81 | 12500 | 0.0859 | 0.8294 | 0.8268 |
0.0891 | 3.37 | 15000 | 0.0828 | 0.8349 | 0.8329 |
0.0863 | 3.94 | 17500 | 0.0806 | 0.8391 | 0.8367 |
0.0834 | 4.5 | 20000 | 0.0788 | 0.8417 | 0.8400 |
0.081 | 5.06 | 22500 | 0.0774 | 0.8449 | 0.8430 |
0.0792 | 5.62 | 25000 | 0.0754 | 0.8475 | 0.8460 |
0.0778 | 6.19 | 27500 | 0.0749 | 0.8489 | 0.8474 |
0.0758 | 6.75 | 30000 | 0.0738 | 0.8517 | 0.8502 |
0.0745 | 7.31 | 32500 | 0.0729 | 0.8531 | 0.8519 |
0.0733 | 7.87 | 35000 | 0.0720 | 0.8544 | 0.8528 |
0.072 | 8.43 | 37500 | 0.0714 | 0.8559 | 0.8546 |
0.0716 | 9.0 | 40000 | 0.0707 | 0.8565 | 0.8554 |
0.0701 | 9.56 | 42500 | 0.0704 | 0.8574 | 0.8558 |
0.0693 | 10.12 | 45000 | 0.0700 | 0.8581 | 0.8569 |
0.0686 | 10.68 | 47500 | 0.0690 | 0.8600 | 0.8588 |
0.0675 | 11.25 | 50000 | 0.0690 | 0.8605 | 0.8593 |
0.0673 | 11.81 | 52500 | 0.0682 | 0.8614 | 0.8603 |
0.0663 | 12.37 | 55000 | 0.0682 | 0.8619 | 0.8606 |
0.0657 | 12.93 | 57500 | 0.0675 | 0.8634 | 0.8624 |
0.0648 | 13.5 | 60000 | 0.0674 | 0.8636 | 0.8625 |
0.0647 | 14.06 | 62500 | 0.0668 | 0.8644 | 0.8633 |
0.0638 | 14.62 | 65000 | 0.0669 | 0.8648 | 0.8635 |
0.0634 | 15.18 | 67500 | 0.0665 | 0.8654 | 0.8643 |
0.063 | 15.74 | 70000 | 0.0663 | 0.8664 | 0.8654 |
0.0623 | 16.31 | 72500 | 0.0662 | 0.8663 | 0.8652 |
0.0622 | 16.87 | 75000 | 0.0657 | 0.8669 | 0.8660 |
0.0615 | 17.43 | 77500 | 0.0658 | 0.8670 | 0.8660 |
0.0616 | 17.99 | 80000 | 0.0655 | 0.8676 | 0.8667 |
0.0608 | 18.56 | 82500 | 0.0653 | 0.8683 | 0.8672 |
0.0606 | 19.12 | 85000 | 0.0653 | 0.8679 | 0.8669 |
0.0602 | 19.68 | 87500 | 0.0648 | 0.8690 | 0.8680 |
0.0599 | 20.24 | 90000 | 0.0650 | 0.8688 | 0.8677 |
0.0598 | 20.81 | 92500 | 0.0647 | 0.8689 | 0.8680 |
0.0592 | 21.37 | 95000 | 0.0647 | 0.8692 | 0.8681 |
0.0591 | 21.93 | 97500 | 0.0646 | 0.8698 | 0.8688 |
0.0587 | 22.49 | 100000 | 0.0645 | 0.8699 | 0.8690 |
0.0586 | 23.05 | 102500 | 0.0644 | 0.8699 | 0.8690 |
0.0583 | 23.62 | 105000 | 0.0644 | 0.8699 | 0.8690 |
0.058 | 24.18 | 107500 | 0.0642 | 0.8703 | 0.8693 |
0.058 | 24.74 | 110000 | 0.0642 | 0.8704 | 0.8694 |
0.0578 | 25.3 | 112500 | 0.0641 | 0.8703 | 0.8693 |
0.0576 | 25.87 | 115000 | 0.0641 | 0.8708 | 0.8699 |
0.0573 | 26.43 | 117500 | 0.0641 | 0.8708 | 0.8698 |
0.0574 | 26.99 | 120000 | 0.0639 | 0.8711 | 0.8702 |
0.0571 | 27.55 | 122500 | 0.0640 | 0.8711 | 0.8701 |
0.0569 | 28.12 | 125000 | 0.0639 | 0.8711 | 0.8702 |
0.0569 | 28.68 | 127500 | 0.0639 | 0.8712 | 0.8703 |
0.057 | 29.24 | 130000 | 0.0639 | 0.8712 | 0.8703 |
0.0566 | 29.8 | 132500 | 0.0638 | 0.8713 | 0.8704 |
### Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu113
- Datasets 2.3.2
- Tokenizers 0.11.6