distilbert-base-uncased-finetuned-legal_data

This model is a fine-tuned version of distilbert-base-uncased on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 6.9101
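
The card does not document the task head. Since only a raw loss is reported and the base model is distilbert-base-uncased, a masked-language-modeling objective is a plausible assumption; below is a minimal loading sketch under that assumption, with a hypothetical Hub repo id:

```python
from transformers import pipeline

# Hypothetical repo id -- replace with this model's actual Hub path.
fill_mask = pipeline(
    "fill-mask",
    model="your-username/distilbert-base-uncased-finetuned-legal_data",
)

# DistilBERT uses [MASK] as its mask token.
print(fill_mask("The tenant shall [MASK] the premises upon termination."))
```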

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
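
For reference, here is a minimal sketch of how these hyperparameters map onto transformers.TrainingArguments and Trainer (API as of Transformers 4.11). The model head, dataset objects, and data collator are placeholders, since neither the training data nor the task is documented in this card:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Assumption: an MLM objective; the card does not state the task.
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

args = TrainingArguments(
    output_dir="distilbert-base-uncased-finetuned-legal_data",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,            # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # matches the per-epoch eval in the results table
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: the training data is undocumented
    eval_dataset=eval_dataset,    # placeholder
    data_collator=DataCollatorForLanguageModeling(tokenizer),
)
trainer.train()
```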

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 26 | 5.3529 |
| No log | 2.0 | 52 | 5.4226 |
| No log | 3.0 | 78 | 5.2550 |
| No log | 4.0 | 104 | 5.1011 |
| No log | 5.0 | 130 | 5.1857 |
| No log | 6.0 | 156 | 5.5119 |
| No log | 7.0 | 182 | 5.4480 |
| No log | 8.0 | 208 | 5.6993 |
| No log | 9.0 | 234 | 5.9614 |
| No log | 10.0 | 260 | 5.6987 |
| No log | 11.0 | 286 | 5.6679 |
| No log | 12.0 | 312 | 5.9850 |
| No log | 13.0 | 338 | 5.6065 |
| No log | 14.0 | 364 | 5.3162 |
| No log | 15.0 | 390 | 5.7856 |
| No log | 16.0 | 416 | 5.5786 |
| No log | 17.0 | 442 | 5.6028 |
| No log | 18.0 | 468 | 5.7649 |
| No log | 19.0 | 494 | 5.5382 |
| 1.8345 | 20.0 | 520 | 6.3654 |
| 1.8345 | 21.0 | 546 | 5.3575 |
| 1.8345 | 22.0 | 572 | 5.3808 |
| 1.8345 | 23.0 | 598 | 5.9340 |
| 1.8345 | 24.0 | 624 | 6.1475 |
| 1.8345 | 25.0 | 650 | 6.2188 |
| 1.8345 | 26.0 | 676 | 5.7651 |
| 1.8345 | 27.0 | 702 | 6.2629 |
| 1.8345 | 28.0 | 728 | 6.1356 |
| 1.8345 | 29.0 | 754 | 5.9255 |
| 1.8345 | 30.0 | 780 | 6.4252 |
| 1.8345 | 31.0 | 806 | 5.6967 |
| 1.8345 | 32.0 | 832 | 6.4324 |
| 1.8345 | 33.0 | 858 | 6.5087 |
| 1.8345 | 34.0 | 884 | 6.1113 |
| 1.8345 | 35.0 | 910 | 6.7443 |
| 1.8345 | 36.0 | 936 | 6.6970 |
| 1.8345 | 37.0 | 962 | 6.5578 |
| 1.8345 | 38.0 | 988 | 6.1963 |
| 0.2251 | 39.0 | 1014 | 6.4893 |
| 0.2251 | 40.0 | 1040 | 6.6347 |
| 0.2251 | 41.0 | 1066 | 6.7106 |
| 0.2251 | 42.0 | 1092 | 6.8129 |
| 0.2251 | 43.0 | 1118 | 6.6386 |
| 0.2251 | 44.0 | 1144 | 6.4134 |
| 0.2251 | 45.0 | 1170 | 6.6883 |
| 0.2251 | 46.0 | 1196 | 6.6406 |
| 0.2251 | 47.0 | 1222 | 6.3065 |
| 0.2251 | 48.0 | 1248 | 7.0281 |
| 0.2251 | 49.0 | 1274 | 7.3646 |
| 0.2251 | 50.0 | 1300 | 7.1086 |
| 0.2251 | 51.0 | 1326 | 6.4749 |
| 0.2251 | 52.0 | 1352 | 6.3303 |
| 0.2251 | 53.0 | 1378 | 6.2919 |
| 0.2251 | 54.0 | 1404 | 6.3855 |
| 0.2251 | 55.0 | 1430 | 6.9501 |
| 0.2251 | 56.0 | 1456 | 6.8714 |
| 0.2251 | 57.0 | 1482 | 6.9856 |
| 0.0891 | 58.0 | 1508 | 6.9910 |
| 0.0891 | 59.0 | 1534 | 6.9293 |
| 0.0891 | 60.0 | 1560 | 7.3493 |
| 0.0891 | 61.0 | 1586 | 7.1834 |
| 0.0891 | 62.0 | 1612 | 7.0479 |
| 0.0891 | 63.0 | 1638 | 6.7674 |
| 0.0891 | 64.0 | 1664 | 6.7553 |
| 0.0891 | 65.0 | 1690 | 7.3074 |
| 0.0891 | 66.0 | 1716 | 6.8071 |
| 0.0891 | 67.0 | 1742 | 7.6622 |
| 0.0891 | 68.0 | 1768 | 6.9555 |
| 0.0891 | 69.0 | 1794 | 7.0153 |
| 0.0891 | 70.0 | 1820 | 7.2085 |
| 0.0891 | 71.0 | 1846 | 6.7582 |
| 0.0891 | 72.0 | 1872 | 6.7989 |
| 0.0891 | 73.0 | 1898 | 6.7012 |
| 0.0891 | 74.0 | 1924 | 7.0088 |
| 0.0891 | 75.0 | 1950 | 7.1024 |
| 0.0891 | 76.0 | 1976 | 6.6968 |
| 0.058 | 77.0 | 2002 | 7.5249 |
| 0.058 | 78.0 | 2028 | 6.9199 |
| 0.058 | 79.0 | 2054 | 7.1995 |
| 0.058 | 80.0 | 2080 | 6.9349 |
| 0.058 | 81.0 | 2106 | 7.4025 |
| 0.058 | 82.0 | 2132 | 7.4199 |
| 0.058 | 83.0 | 2158 | 6.8081 |
| 0.058 | 84.0 | 2184 | 7.4777 |
| 0.058 | 85.0 | 2210 | 7.1990 |
| 0.058 | 86.0 | 2236 | 7.0062 |
| 0.058 | 87.0 | 2262 | 7.5724 |
| 0.058 | 88.0 | 2288 | 6.9362 |
| 0.058 | 89.0 | 2314 | 7.1368 |
| 0.058 | 90.0 | 2340 | 7.2183 |
| 0.058 | 91.0 | 2366 | 6.8684 |
| 0.058 | 92.0 | 2392 | 7.1433 |
| 0.058 | 93.0 | 2418 | 7.2161 |
| 0.058 | 94.0 | 2444 | 7.1442 |
| 0.058 | 95.0 | 2470 | 7.3098 |
| 0.058 | 96.0 | 2496 | 7.1264 |
| 0.0512 | 97.0 | 2522 | 6.9424 |
| 0.0512 | 98.0 | 2548 | 6.9155 |
| 0.0512 | 99.0 | 2574 | 6.9038 |
| 0.0512 | 100.0 | 2600 | 6.9101 |
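
Validation loss bottoms out at 5.1011 after epoch 4 while training loss keeps shrinking (1.8345 down to 0.0512), a typical overfitting pattern over the remaining 96 epochs. A hedged sketch of capping such a run with transformers.EarlyStoppingCallback, reusing the placeholder objects from the sketch above:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="distilbert-base-uncased-finetuned-legal_data",
    evaluation_strategy="epoch",      # evaluate once per epoch, as in the table above
    save_strategy="epoch",
    load_best_model_at_end=True,      # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,          # lower eval_loss is better
    num_train_epochs=100,
)

trainer = Trainer(
    model=model,                      # placeholders as in the earlier sketch
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    # Stop once eval_loss fails to improve for 3 consecutive epochs.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```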

Framework versions

  • Transformers 4.11.3
  • Pytorch 1.9.0+cu102
  • Datasets 1.12.1
  • Tokenizers 0.10.3
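
A pinned install matching these versions might look like the following; the +cu102 build of PyTorch is platform-specific, so adjust for your CUDA setup:

```bash
pip install transformers==4.11.3 datasets==1.12.1 tokenizers==0.10.3 torch==1.9.0
```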