impossible-llms-dutch-natural-3

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.3664

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 123
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 60
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
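As a rough illustration (not taken from the actual training script), the effective batch size and the cosine-with-warmup learning-rate schedule implied by these hyperparameters can be sketched in plain Python. The total step count (2640) is read off the final row of the results table below; the schedule formula mirrors what the Hugging Face Trainer uses for `lr_scheduler_type="cosine"` with `lr_scheduler_warmup_ratio=0.1`:

```python
import math

# Hyperparameters from the card
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 8

# Effective (total) train batch size: per-device batch x accumulation steps
total_train_batch_size = train_batch_size * gradient_accumulation_steps
assert total_train_batch_size == 64  # matches the card's total_train_batch_size

# Cosine decay with linear warmup over the first 10% of optimizer steps
total_steps = 2640                      # final step in the results table
warmup_steps = int(0.1 * total_steps)   # 264

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + cosine decay."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))             # 0.0 (start of warmup)
print(lr_at(warmup_steps))  # 1e-4 (peak, end of warmup)
print(lr_at(total_steps))   # decayed back to ~0.0
```

This is only a sketch of the schedule shape; the Trainer computes the same curve internally via its scheduler.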

Training results

Training Loss Epoch Step Validation Loss
84.6738 0.2254 10 9.6849
75.2212 0.4507 20 9.0059
71.4022 0.6761 30 8.7309
69.142 0.9014 40 8.4243
67.3085 1.1352 50 8.1240
64.394 1.3606 60 7.8235
61.7536 1.5859 70 7.5267
59.5653 1.8113 80 7.2309
57.6555 2.0451 90 6.9380
54.7632 2.2704 100 6.6369
52.2221 2.4958 110 6.3715
50.2304 2.7211 120 6.1346
48.8427 2.9465 130 5.9647
48.0248 3.1803 140 5.8598
46.8309 3.4056 150 5.7895
46.165 3.6310 160 5.7349
45.6975 3.8563 170 5.6961
46.1271 4.0901 180 5.6595
45.3585 4.3155 190 5.6229
44.7282 4.5408 200 5.5928
44.5899 4.7662 210 5.5585
44.4591 4.9915 220 5.5333
44.5908 5.2254 230 5.5203
43.9283 5.4507 240 5.4937
43.6853 5.6761 250 5.4737
43.6807 5.9014 260 5.4533
43.8913 6.1352 270 5.4340
43.2255 6.3606 280 5.4128
43.0267 6.5859 290 5.3955
42.9429 6.8113 300 5.3788
43.3161 7.0451 310 5.3671
42.4849 7.2704 320 5.3500
42.2961 7.4958 330 5.3368
42.1095 7.7211 340 5.3232
42.3612 7.9465 350 5.3187
42.1893 8.1803 360 5.3016
41.6822 8.4056 370 5.2851
41.7326 8.6310 380 5.2732
41.7149 8.8563 390 5.2618
41.6805 9.0901 400 5.2446
40.9705 9.3155 410 5.2347
41.1708 9.5408 420 5.2196
41.1171 9.7662 430 5.2069
40.765 9.9915 440 5.1916
40.8325 10.2254 450 5.1781
40.548 10.4507 460 5.1667
40.2597 10.6761 470 5.1509
40.1867 10.9014 480 5.1418
40.6037 11.1352 490 5.1274
40.0224 11.3606 500 5.1148
39.4417 11.5859 510 5.1036
39.6558 11.8113 520 5.0871
39.6803 12.0451 530 5.0778
39.1524 12.2704 540 5.0646
39.0993 12.4958 550 5.0520
39.0383 12.7211 560 5.0397
38.9589 12.9465 570 5.0269
39.0936 13.1803 580 5.0223
38.4753 13.4056 590 5.0096
38.4053 13.6310 600 4.9954
38.3536 13.8563 610 4.9801
38.5044 14.0901 620 4.9739
37.819 14.3155 630 4.9666
37.9272 14.5408 640 4.9524
37.9232 14.7662 650 4.9494
37.6142 14.9915 660 4.9300
37.7402 15.2254 670 4.9269
37.2276 15.4507 680 4.9237
37.2481 15.6761 690 4.9107
37.1197 15.9014 700 4.9031
37.3689 16.1352 710 4.8937
36.5883 16.3606 720 4.8933
36.6394 16.5859 730 4.8834
36.6653 16.8113 740 4.8765
37.0723 17.0451 750 4.8695
36.1312 17.2704 760 4.8689
35.9925 17.4958 770 4.8631
36.0532 17.7211 780 4.8589
36.1537 17.9465 790 4.8538
35.9778 18.1803 800 4.8478
35.6045 18.4056 810 4.8485
35.7288 18.6310 820 4.8420
35.4722 18.8563 830 4.8380
36.0354 19.0901 840 4.8347
34.7944 19.3155 850 4.8359
35.0414 19.5408 860 4.8362
35.2014 19.7662 870 4.8250
35.2055 19.9915 880 4.8211
34.8006 20.2254 890 4.8283
34.5988 20.4507 900 4.8288
34.4991 20.6761 910 4.8233
34.7292 20.9014 920 4.8164
34.7012 21.1352 930 4.8197
34.1004 21.3606 940 4.8243
34.0364 21.5859 950 4.8193
34.1717 21.8113 960 4.8151
34.4908 22.0451 970 4.8217
33.3615 22.2704 980 4.8290
33.668 22.4958 990 4.8206
33.641 22.7211 1000 4.8220
33.6943 22.9465 1010 4.8172
33.6806 23.1803 1020 4.8296
32.8408 23.4056 1030 4.8311
33.2998 23.6310 1040 4.8295
33.2505 23.8563 1050 4.8245
33.4135 24.0901 1060 4.8297
32.8855 24.3155 1070 4.8402
32.5233 24.5408 1080 4.8419
32.5512 24.7662 1090 4.8402
32.785 24.9915 1100 4.8381
32.6117 25.2254 1110 4.8497
31.9971 25.4507 1120 4.8523
32.3718 25.6761 1130 4.8521
32.3031 25.9014 1140 4.8483
32.2409 26.1352 1150 4.8605
31.6713 26.3606 1160 4.8663
31.7719 26.5859 1170 4.8670
31.8051 26.8113 1180 4.8640
32.3006 27.0451 1190 4.8702
31.3598 27.2704 1200 4.8845
31.296 27.4958 1210 4.8879
31.3315 27.7211 1220 4.8850
31.408 27.9465 1230 4.8866
31.4039 28.1803 1240 4.8976
30.7473 28.4056 1250 4.9051
31.0986 28.6310 1260 4.9048
30.8876 28.8563 1270 4.9001
31.1611 29.0901 1280 4.9132
30.427 29.3155 1290 4.9258
30.3484 29.5408 1300 4.9265
30.5664 29.7662 1310 4.9249
30.7325 29.9915 1320 4.9243
30.4933 30.2254 1330 4.9460
29.9895 30.4507 1340 4.9492
30.1401 30.6761 1350 4.9455
30.1644 30.9014 1360 4.9495
30.2682 31.1352 1370 4.9574
29.5472 31.3606 1380 4.9684
29.7077 31.5859 1390 4.9740
29.8152 31.8113 1400 4.9677
30.2051 32.0451 1410 4.9807
29.3022 32.2704 1420 4.9904
29.3693 32.4958 1430 4.9998
29.3703 32.7211 1440 4.9933
29.3662 32.9465 1450 4.9938
29.2136 33.1803 1460 5.0175
29.015 33.4056 1470 5.0158
29.06 33.6310 1480 5.0249
29.0414 33.8563 1490 5.0248
29.162 34.0901 1500 5.0354
28.4905 34.3155 1510 5.0445
28.5755 34.5408 1520 5.0498
28.6749 34.7662 1530 5.0440
28.872 34.9915 1540 5.0478
28.5365 35.2254 1550 5.0688
28.2136 35.4507 1560 5.0708
28.2986 35.6761 1570 5.0712
28.5098 35.9014 1580 5.0688
28.5359 36.1352 1590 5.0875
27.7739 36.3606 1600 5.0965
27.901 36.5859 1610 5.0979
28.1265 36.8113 1620 5.0926
28.4815 37.0451 1630 5.1091
27.6902 37.2704 1640 5.1179
27.4692 37.4958 1650 5.1246
27.6252 37.7211 1660 5.1194
27.9282 37.9465 1670 5.1118
27.7281 38.1803 1680 5.1396
27.4473 38.4056 1690 5.1475
27.3981 38.6310 1700 5.1473
27.4454 38.8563 1710 5.1447
27.8462 39.0901 1720 5.1554
27.0362 39.3155 1730 5.1640
27.0569 39.5408 1740 5.1690
27.1892 39.7662 1750 5.1601
27.2672 39.9915 1760 5.1652
27.1379 40.2254 1770 5.1813
26.8499 40.4507 1780 5.1848
27.0074 40.6761 1790 5.1880
26.975 40.9014 1800 5.1830
26.9335 41.1352 1810 5.2035
26.5708 41.3606 1820 5.2113
26.7212 41.5859 1830 5.2019
26.8303 41.8113 1840 5.2070
26.9664 42.0451 1850 5.2123
26.3525 42.2704 1860 5.2265
26.4464 42.4958 1870 5.2281
26.4199 42.7211 1880 5.2319
26.5224 42.9465 1890 5.2285
26.4886 43.1803 1900 5.2398
26.1526 43.4056 1910 5.2464
26.2738 43.6310 1920 5.2449
26.3004 43.8563 1930 5.2457
26.5239 44.0901 1940 5.2538
26.0768 44.3155 1950 5.2602
26.013 44.5408 1960 5.2649
26.1522 44.7662 1970 5.2603
26.0467 44.9915 1980 5.2613
26.2853 45.2254 1990 5.2714
25.8095 45.4507 2000 5.2763
25.8647 45.6761 2010 5.2803
25.8712 45.9014 2020 5.2794
26.074 46.1352 2030 5.2882
25.651 46.3606 2040 5.2933
25.8052 46.5859 2050 5.2948
25.7256 46.8113 2060 5.2963
26.06 47.0451 2070 5.2963
25.3492 47.2704 2080 5.3059
25.6168 47.4958 2090 5.3086
25.7018 47.7211 2100 5.3083
25.6243 47.9465 2110 5.3089
25.7128 48.1803 2120 5.3150
25.3346 48.4056 2130 5.3159
25.584 48.6310 2140 5.3205
25.5297 48.8563 2150 5.3237
25.7957 49.0901 2160 5.3214
25.1928 49.3155 2170 5.3249
25.3411 49.5408 2180 5.3280
25.4259 49.7662 2190 5.3288
25.3861 49.9915 2200 5.3291
25.5995 50.2254 2210 5.3361
25.2147 50.4507 2220 5.3370
25.2697 50.6761 2230 5.3388
25.2186 50.9014 2240 5.3386
25.5731 51.1352 2250 5.3430
25.0251 51.3606 2260 5.3427
25.2741 51.5859 2270 5.3454
25.1371 51.8113 2280 5.3459
25.471 52.0451 2290 5.3466
25.0118 52.2704 2300 5.3531
25.0553 52.4958 2310 5.3516
25.1054 52.7211 2320 5.3522
25.2015 52.9465 2330 5.3506
25.2773 53.1803 2340 5.3546
25.1119 53.4056 2350 5.3560
25.0583 53.6310 2360 5.3583
24.9569 53.8563 2370 5.3556
25.4228 54.0901 2380 5.3564
25.0493 54.3155 2390 5.3604
24.9059 54.5408 2400 5.3586
24.9301 54.7662 2410 5.3605
25.0125 54.9915 2420 5.3607
25.2703 55.2254 2430 5.3615
24.8681 55.4507 2440 5.3632
25.0404 55.6761 2450 5.3644
24.9964 55.9014 2460 5.3623
25.2473 56.1352 2470 5.3641
24.9676 56.3606 2480 5.3656
24.8686 56.5859 2490 5.3647
24.9229 56.8113 2500 5.3655
25.1986 57.0451 2510 5.3654
25.0148 57.2704 2520 5.3656
24.8668 57.4958 2530 5.3663
25.0365 57.7211 2540 5.3667
24.8396 57.9465 2550 5.3665
25.0727 58.1803 2560 5.3661
24.8652 58.4056 2570 5.3662
24.9151 58.6310 2580 5.3663
24.9295 58.8563 2590 5.3663
25.2173 59.0901 2600 5.3663
24.9239 59.3155 2610 5.3663
24.9107 59.5408 2620 5.3664
24.9084 59.7662 2630 5.3664
24.8394 59.9915 2640 5.3664
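For intuition (this figure is not reported in the card itself), a cross-entropy loss in nats converts to perplexity via exp(loss); the final validation loss of 5.3664 corresponds to a perplexity of roughly 214. Note that the label smoothing factor of 0.1 inflates the raw loss, so this somewhat overstates the model's true perplexity:

```python
import math

final_val_loss = 5.3664  # final validation loss from the table above
perplexity = math.exp(final_val_loss)
print(f"{perplexity:.1f}")  # ≈ 214.1
```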

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0
Model size

  • 126M parameters (F32, Safetensors)
