# categorization-finetuned-20220721-164940-distilled-20220810-185342
This model is a fine-tuned version of [carted-nlp/categorization-finetuned-20220721-164940](https://huggingface.co/carted-nlp/categorization-finetuned-20220721-164940) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0639
- Accuracy: 0.87
- F1: 0.8690
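As a toy illustration of the reported metrics, the sketch below computes accuracy and F1 in pure Python. The card does not state which F1 averaging was used; macro-averaging over classes is shown here as one plausible choice, and the labels are invented examples.

```python
# Hypothetical labels for illustration only; not from the model's dataset.
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (one plausible averaging)."""
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

y_true = ["shoes", "shoes", "toys", "books"]
y_pred = ["shoes", "toys", "toys", "books"]
print(accuracy(y_true, y_pred))  # 0.75
```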
## Model description

More information needed

## Intended uses & limitations

More information needed
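The card does not document intended uses, but as a hedged sketch the checkpoint can presumably be loaded through the Transformers `pipeline` API. The `text-classification` task type is an assumption (the card never states the head type), and downloading the checkpoint requires network access, so the call is shown only in a comment.

```python
# Assumption: the model carries a sequence-classification head.
MODEL_ID = "carted-nlp/categorization-finetuned-20220721-164940-distilled-20220810-185342"

def classify(texts):
    """Return predicted category labels for a list of texts (hypothetical helper)."""
    from transformers import pipeline  # deferred import; requires `transformers`
    clf = pipeline("text-classification", model=MODEL_ID)
    return clf(texts)

# Example (downloads the checkpoint on first use):
# classify(["stainless steel water bottle 750ml"])
```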
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 314
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1500
- num_epochs: 30.0
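The hyperparameters above combine as sketched below: the effective batch size is the per-step batch size times the accumulation steps, and `lr_scheduler_type: linear` warms up to the peak learning rate over 1,500 steps and then decays linearly to zero. The total step count of roughly 133,000 is an estimate read off the results table, not a value stated in the card, and single-device training is assumed.

```python
# Effective batch size, assuming a single device (not stated in the card).
train_batch_size = 64
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 256

def linear_schedule_lr(step, peak_lr=1e-5, warmup_steps=1500, num_training_steps=133_000):
    """Linear warmup to peak_lr, then linear decay to 0, mirroring `lr_scheduler_type: linear`."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (num_training_steps - step) / (num_training_steps - warmup_steps))

print(linear_schedule_lr(750))   # halfway through warmup: 5e-06
print(linear_schedule_lr(1500))  # peak: 1e-05
```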
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
0.269 | 0.56 | 2500 | 0.1280 | 0.7547 | 0.7461 |
0.125 | 1.12 | 5000 | 0.1052 | 0.7960 | 0.7916 |
0.1079 | 1.69 | 7500 | 0.0950 | 0.8132 | 0.8102 |
0.0992 | 2.25 | 10000 | 0.0898 | 0.8216 | 0.8188 |
0.0938 | 2.81 | 12500 | 0.0859 | 0.8294 | 0.8268 |
0.0891 | 3.37 | 15000 | 0.0828 | 0.8349 | 0.8329 |
0.0863 | 3.94 | 17500 | 0.0806 | 0.8391 | 0.8367 |
0.0834 | 4.5 | 20000 | 0.0788 | 0.8417 | 0.8400 |
0.081 | 5.06 | 22500 | 0.0774 | 0.8449 | 0.8430 |
0.0792 | 5.62 | 25000 | 0.0754 | 0.8475 | 0.8460 |
0.0778 | 6.19 | 27500 | 0.0749 | 0.8489 | 0.8474 |
0.0758 | 6.75 | 30000 | 0.0738 | 0.8517 | 0.8502 |
0.0745 | 7.31 | 32500 | 0.0729 | 0.8531 | 0.8519 |
0.0733 | 7.87 | 35000 | 0.0720 | 0.8544 | 0.8528 |
0.072 | 8.43 | 37500 | 0.0714 | 0.8559 | 0.8546 |
0.0716 | 9.0 | 40000 | 0.0707 | 0.8565 | 0.8554 |
0.0701 | 9.56 | 42500 | 0.0704 | 0.8574 | 0.8558 |
0.0693 | 10.12 | 45000 | 0.0700 | 0.8581 | 0.8569 |
0.0686 | 10.68 | 47500 | 0.0690 | 0.8600 | 0.8588 |
0.0675 | 11.25 | 50000 | 0.0690 | 0.8605 | 0.8593 |
0.0673 | 11.81 | 52500 | 0.0682 | 0.8614 | 0.8603 |
0.0663 | 12.37 | 55000 | 0.0682 | 0.8619 | 0.8606 |
0.0657 | 12.93 | 57500 | 0.0675 | 0.8634 | 0.8624 |
0.0648 | 13.5 | 60000 | 0.0674 | 0.8636 | 0.8625 |
0.0647 | 14.06 | 62500 | 0.0668 | 0.8644 | 0.8633 |
0.0638 | 14.62 | 65000 | 0.0669 | 0.8648 | 0.8635 |
0.0634 | 15.18 | 67500 | 0.0665 | 0.8654 | 0.8643 |
0.063 | 15.74 | 70000 | 0.0663 | 0.8664 | 0.8654 |
0.0623 | 16.31 | 72500 | 0.0662 | 0.8663 | 0.8652 |
0.0622 | 16.87 | 75000 | 0.0657 | 0.8669 | 0.8660 |
0.0615 | 17.43 | 77500 | 0.0658 | 0.8670 | 0.8660 |
0.0616 | 17.99 | 80000 | 0.0655 | 0.8676 | 0.8667 |
0.0608 | 18.56 | 82500 | 0.0653 | 0.8683 | 0.8672 |
0.0606 | 19.12 | 85000 | 0.0653 | 0.8679 | 0.8669 |
0.0602 | 19.68 | 87500 | 0.0648 | 0.8690 | 0.8680 |
0.0599 | 20.24 | 90000 | 0.0650 | 0.8688 | 0.8677 |
0.0598 | 20.81 | 92500 | 0.0647 | 0.8689 | 0.8680 |
0.0592 | 21.37 | 95000 | 0.0647 | 0.8692 | 0.8681 |
0.0591 | 21.93 | 97500 | 0.0646 | 0.8698 | 0.8688 |
0.0587 | 22.49 | 100000 | 0.0645 | 0.8699 | 0.8690 |
0.0586 | 23.05 | 102500 | 0.0644 | 0.8699 | 0.8690 |
0.0583 | 23.62 | 105000 | 0.0644 | 0.8699 | 0.8690 |
0.058 | 24.18 | 107500 | 0.0642 | 0.8703 | 0.8693 |
0.058 | 24.74 | 110000 | 0.0642 | 0.8704 | 0.8694 |
0.0578 | 25.3 | 112500 | 0.0641 | 0.8703 | 0.8693 |
0.0576 | 25.87 | 115000 | 0.0641 | 0.8708 | 0.8699 |
0.0573 | 26.43 | 117500 | 0.0641 | 0.8708 | 0.8698 |
0.0574 | 26.99 | 120000 | 0.0639 | 0.8711 | 0.8702 |
0.0571 | 27.55 | 122500 | 0.0640 | 0.8711 | 0.8701 |
0.0569 | 28.12 | 125000 | 0.0639 | 0.8711 | 0.8702 |
0.0569 | 28.68 | 127500 | 0.0639 | 0.8712 | 0.8703 |
0.057 | 29.24 | 130000 | 0.0639 | 0.8712 | 0.8703 |
0.0566 | 29.8 | 132500 | 0.0638 | 0.8713 | 0.8704 |
### Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu113
- Datasets 2.3.2
- Tokenizers 0.11.6