---
license: gemma
base_model: google/gemma-2-2b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: collapse_gemma-2-2b_hs2_accumulate_iter17_sftsd0
  results: []
---

# collapse_gemma-2-2b_hs2_accumulate_iter17_sftsd0

This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.1003
- Num Input Tokens Seen: 89019216

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 0
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log | 0 | 0 | 1.3909 | 0 |
| 1.6994 | 0.0031 | 5 | 1.3905 | 272200 |
| 1.6801 | 0.0061 | 10 | 1.3827 | 546088 |
| 1.6443 | 0.0092 | 15 | 1.3569 | 810672 |
| 1.6204 | 0.0122 | 20 | 1.3238 | 1081712 |
| 1.546 | 0.0153 | 25 | 1.2782 | 1354680 |
| 1.4506 | 0.0184 | 30 | 1.2439 | 1627472 |
| 1.2996 | 0.0214 | 35 | 1.2234 | 1898952 |
| 1.3043 | 0.0245 | 40 | 1.1991 | 2175064 |
| 1.1955 | 0.0275 | 45 | 1.2064 | 2448608 |
| 0.9258 | 0.0306 | 50 | 1.2280 | 2729664 |
| 0.8256 | 0.0337 | 55 | 1.2796 | 3002736 |
| 0.7169 | 0.0367 | 60 | 1.3149 | 3273720 |
| 0.5948 | 0.0398 | 65 | 1.3494 | 3540936 |
| 0.6068 | 0.0428 | 70 | 1.3754 | 3806408 |
| 0.3883 | 0.0459 | 75 | 1.3697 | 4081624 |
| 0.3017 | 0.0490 | 80 | 1.3294 | 4349184 |
| 0.3503 | 0.0520 | 85 | 1.2649 | 4621264 |
| 0.3387 | 0.0551 | 90 | 1.2668 | 4901192 |
| 0.2368 | 0.0581 | 95 | 1.2385 | 5177112 |
| 0.2975 | 0.0612 | 100 | 1.2361 | 5448624 |
| 0.18 | 0.0642 | 105 | 1.2237 | 5721304 |
| 0.2429 | 0.0673 | 110 | 1.2294 | 5991928 |
| 0.2351 | 0.0704 | 115 | 1.2112 | 6264232 |
| 0.1608 | 0.0734 | 120 | 1.2197 | 6543976 |
| 0.269 | 0.0765 | 125 | 1.2032 | 6811672 |
| 0.197 | 0.0795 | 130 | 1.2026 | 7090240 |
| 0.2021 | 0.0826 | 135 | 1.1991 | 7358920 |
| 0.1978 | 0.0857 | 140 | 1.2094 | 7630456 |
| 0.1944 | 0.0887 | 145 | 1.1948 | 7901616 |
| 0.1594 | 0.0918 | 150 | 1.2021 | 8169472 |
| 0.1696 | 0.0948 | 155 | 1.1932 | 8444928 |
| 0.2287 | 0.0979 | 160 | 1.1883 | 8712632 |
| 0.2309 | 0.1010 | 165 | 1.1872 | 8978376 |
| 0.2072 | 0.1040 | 170 | 1.1848 | 9247560 |
| 0.147 | 0.1071 | 175 | 1.1808 | 9517088 |
| 0.1585 | 0.1101 | 180 | 1.1850 | 9789888 |
| 0.1815 | 0.1132 | 185 | 1.1801 | 10062672 |
| 0.1777 | 0.1163 | 190 | 1.1825 | 10341688 |
| 0.1807 | 0.1193 | 195 | 1.1813 | 10619320 |
| 0.197 | 0.1224 | 200 | 1.1792 | 10901712 |
| 0.1497 | 0.1254 | 205 | 1.1742 | 11176912 |
| 0.1282 | 0.1285 | 210 | 1.1784 | 11449256 |
| 0.1578 | 0.1316 | 215 | 1.1712 | 11722528 |
| 0.1074 | 0.1346 | 220 | 1.1716 | 11996296 |
| 0.1979 | 0.1377 | 225 | 1.1746 | 12265456 |
| 0.1555 | 0.1407 | 230 | 1.1719 | 12538840 |
| 0.218 | 0.1438 | 235 | 1.1700 | 12808296 |
| 0.1469 | 0.1469 | 240 | 1.1704 | 13086440 |
| 0.2141 | 0.1499 | 245 | 1.1669 | 13359200 |
| 0.213 | 0.1530 | 250 | 1.1656 | 13632880 |
| 0.1467 | 0.1560 | 255 | 1.1617 | 13912552 |
| 0.1346 | 0.1591 | 260 | 1.1643 | 14183944 |
| 0.1543 | 0.1621 | 265 | 1.1630 | 14452264 |
| 0.0921 | 0.1652 | 270 | 1.1594 | 14729560 |
| 0.1302 | 0.1683 | 275 | 1.1595 | 15004368 |
| 0.2288 | 0.1713 | 280 | 1.1618 | 15275312 |
| 0.0788 | 0.1744 | 285 | 1.1602 | 15547512 |
| 0.1082 | 0.1774 | 290 | 1.1604 | 15816448 |
| 0.1323 | 0.1805 | 295 | 1.1579 | 16094736 |
| 0.159 | 0.1836 | 300 | 1.1605 | 16364712 |
| 0.1364 | 0.1866 | 305 | 1.1591 | 16637896 |
| 0.1182 | 0.1897 | 310 | 1.1598 | 16917544 |
| 0.0775 | 0.1927 | 315 | 1.1575 | 17192592 |
| 0.146 | 0.1958 | 320 | 1.1566 | 17466600 |
| 0.1484 | 0.1989 | 325 | 1.1573 | 17733216 |
| 0.1802 | 0.2019 | 330 | 1.1535 | 18002032 |
| 0.0883 | 0.2050 | 335 | 1.1555 | 18268904 |
| 0.1609 | 0.2080 | 340 | 1.1583 | 18533480 |
| 0.1217 | 0.2111 | 345 | 1.1521 | 18807264 |
| 0.1252 | 0.2142 | 350 | 1.1528 | 19071992 |
| 0.1376 | 0.2172 | 355 | 1.1554 | 19351616 |
| 0.1337 | 0.2203 | 360 | 1.1483 | 19623440 |
| 0.204 | 0.2233 | 365 | 1.1497 | 19890352 |
| 0.1568 | 0.2264 | 370 | 1.1494 | 20163512 |
| 0.1557 | 0.2295 | 375 | 1.1478 | 20430144 |
| 0.1648 | 0.2325 | 380 | 1.1497 | 20708464 |
| 0.1458 | 0.2356 | 385 | 1.1516 | 20979448 |
| 0.1553 | 0.2386 | 390 | 1.1520 | 21250640 |
| 0.1362 | 0.2417 | 395 | 1.1485 | 21523488 |
| 0.1153 | 0.2448 | 400 | 1.1498 | 21796800 |
| 0.2017 | 0.2478 | 405 | 1.1516 | 22072712 |
| 0.1279 | 0.2509 | 410 | 1.1479 | 22348792 |
| 0.1776 | 0.2539 | 415 | 1.1448 | 22622112 |
| 0.1917 | 0.2570 | 420 | 1.1468 | 22885824 |
| 0.1634 | 0.2600 | 425 | 1.1453 | 23159224 |
| 0.1464 | 0.2631 | 430 | 1.1435 | 23435488 |
| 0.134 | 0.2662 | 435 | 1.1452 | 23703264 |
| 0.1725 | 0.2692 | 440 | 1.1409 | 23974640 |
| 0.1253 | 0.2723 | 445 | 1.1399 | 24246456 |
| 0.1326 | 0.2753 | 450 | 1.1436 | 24519176 |
| 0.156 | 0.2784 | 455 | 1.1403 | 24790624 |
| 0.119 | 0.2815 | 460 | 1.1401 | 25065648 |
| 0.1505 | 0.2845 | 465 | 1.1408 | 25342616 |
| 0.0667 | 0.2876 | 470 | 1.1396 | 25615824 |
| 0.1569 | 0.2906 | 475 | 1.1392 | 25890064 |
| 0.092 | 0.2937 | 480 | 1.1383 | 26153952 |
| 0.1095 | 0.2968 | 485 | 1.1383 | 26423264 |
| 0.1537 | 0.2998 | 490 | 1.1373 | 26692536 |
| 0.169 | 0.3029 | 495 | 1.1389 | 26954592 |
| 0.0957 | 0.3059 | 500 | 1.1392 | 27224864 |
| 0.1248 | 0.3090 | 505 | 1.1386 | 27494608 |
| 0.1513 | 0.3121 | 510 | 1.1384 | 27769840 |
| 0.1246 | 0.3151 | 515 | 1.1367 | 28035512 |
| 0.1398 | 0.3182 | 520 | 1.1353 | 28309112 |
| 0.0963 | 0.3212 | 525 | 1.1340 | 28578208 |
| 0.1483 | 0.3243 | 530 | 1.1355 | 28852600 |
| 0.0748 | 0.3274 | 535 | 1.1381 | 29124336 |
| 0.2016 | 0.3304 | 540 | 1.1377 | 29394824 |
| 0.1208 | 0.3335 | 545 | 1.1347 | 29666464 |
| 0.1483 | 0.3365 | 550 | 1.1319 | 29938200 |
| 0.1113 | 0.3396 | 555 | 1.1323 | 30208064 |
| 0.1464 | 0.3427 | 560 | 1.1322 | 30475408 |
| 0.1081 | 0.3457 | 565 | 1.1323 | 30745408 |
| 0.1735 | 0.3488 | 570 | 1.1325 | 31019536 |
| 0.108 | 0.3518 | 575 | 1.1317 | 31294192 |
| 0.1634 | 0.3549 | 580 | 1.1358 | 31560960 |
| 0.1299 | 0.3579 | 585 | 1.1331 | 31840160 |
| 0.1434 | 0.3610 | 590 | 1.1332 | 32117584 |
| 0.126 | 0.3641 | 595 | 1.1337 | 32393856 |
| 0.0881 | 0.3671 | 600 | 1.1322 | 32669032 |
| 0.151 | 0.3702 | 605 | 1.1326 | 32941680 |
| 0.1408 | 0.3732 | 610 | 1.1310 | 33215264 |
| 0.1956 | 0.3763 | 615 | 1.1342 | 33474496 |
| 0.1068 | 0.3794 | 620 | 1.1317 | 33752472 |
| 0.1597 | 0.3824 | 625 | 1.1269 | 34021928 |
| 0.1317 | 0.3855 | 630 | 1.1294 | 34294992 |
| 0.1734 | 0.3885 | 635 | 1.1318 | 34572208 |
| 0.1164 | 0.3916 | 640 | 1.1290 | 34843824 |
| 0.1683 | 0.3947 | 645 | 1.1281 | 35115216 |
| 0.1249 | 0.3977 | 650 | 1.1274 | 35387288 |
| 0.1857 | 0.4008 | 655 | 1.1277 | 35659536 |
| 0.0797 | 0.4038 | 660 | 1.1278 | 35931368 |
| 0.1369 | 0.4069 | 665 | 1.1249 | 36207624 |
| 0.1628 | 0.4100 | 670 | 1.1269 | 36484744 |
| 0.2372 | 0.4130 | 675 | 1.1280 | 36756872 |
| 0.1625 | 0.4161 | 680 | 1.1235 | 37037136 |
| 0.1845 | 0.4191 | 685 | 1.1244 | 37309056 |
| 0.1584 | 0.4222 | 690 | 1.1263 | 37583176 |
| 0.2048 | 0.4253 | 695 | 1.1242 | 37858456 |
| 0.1161 | 0.4283 | 700 | 1.1240 | 38130240 |
| 0.1396 | 0.4314 | 705 | 1.1224 | 38400424 |
| 0.0942 | 0.4344 | 710 | 1.1229 | 38676232 |
| 0.0872 | 0.4375 | 715 | 1.1264 | 38952824 |
| 0.1327 | 0.4406 | 720 | 1.1247 | 39227008 |
| 0.1342 | 0.4436 | 725 | 1.1217 | 39500168 |
| 0.1757 | 0.4467 | 730 | 1.1233 | 39772928 |
| 0.0874 | 0.4497 | 735 | 1.1240 | 40040976 |
| 0.0895 | 0.4528 | 740 | 1.1220 | 40307552 |
| 0.1787 | 0.4558 | 745 | 1.1239 | 40583328 |
| 0.1701 | 0.4589 | 750 | 1.1232 | 40856600 |
| 0.1388 | 0.4620 | 755 | 1.1193 | 41121728 |
| 0.1103 | 0.4650 | 760 | 1.1207 | 41385672 |
| 0.1341 | 0.4681 | 765 | 1.1235 | 41664216 |
| 0.1011 | 0.4711 | 770 | 1.1225 | 41933720 |
| 0.1166 | 0.4742 | 775 | 1.1206 | 42213088 |
| 0.1285 | 0.4773 | 780 | 1.1200 | 42481864 |
| 0.0745 | 0.4803 | 785 | 1.1217 | 42751744 |
| 0.1188 | 0.4834 | 790 | 1.1223 | 43025760 |
| 0.0654 | 0.4864 | 795 | 1.1213 | 43295168 |
| 0.116 | 0.4895 | 800 | 1.1204 | 43566272 |
| 0.0939 | 0.4926 | 805 | 1.1195 | 43842392 |
| 0.1418 | 0.4956 | 810 | 1.1203 | 44107224 |
| 0.1532 | 0.4987 | 815 | 1.1191 | 44380496 |
| 0.0976 | 0.5017 | 820 | 1.1189 | 44653648 |
| 0.1206 | 0.5048 | 825 | 1.1196 | 44919784 |
| 0.0796 | 0.5079 | 830 | 1.1208 | 45193112 |
| 0.189 | 0.5109 | 835 | 1.1180 | 45462016 |
| 0.1185 | 0.5140 | 840 | 1.1171 | 45732304 |
| 0.1109 | 0.5170 | 845 | 1.1191 | 46006088 |
| 0.1179 | 0.5201 | 850 | 1.1181 | 46282744 |
| 0.1294 | 0.5232 | 855 | 1.1180 | 46551840 |
| 0.1444 | 0.5262 | 860 | 1.1186 | 46832064 |
| 0.1618 | 0.5293 | 865 | 1.1165 | 47109448 |
| 0.107 | 0.5323 | 870 | 1.1169 | 47382488 |
| 0.1525 | 0.5354 | 875 | 1.1181 | 47656952 |
| 0.1116 | 0.5385 | 880 | 1.1167 | 47931560 |
| 0.1227 | 0.5415 | 885 | 1.1151 | 48192568 |
| 0.1386 | 0.5446 | 890 | 1.1146 | 48465184 |
| 0.1385 | 0.5476 | 895 | 1.1184 | 48740280 |
| 0.1574 | 0.5507 | 900 | 1.1162 | 49011192 |
| 0.0848 | 0.5537 | 905 | 1.1147 | 49279624 |
| 0.1111 | 0.5568 | 910 | 1.1138 | 49551568 |
| 0.1686 | 0.5599 | 915 | 1.1142 | 49825552 |
| 0.1197 | 0.5629 | 920 | 1.1131 | 50101384 |
| 0.1572 | 0.5660 | 925 | 1.1136 | 50372568 |
| 0.1398 | 0.5690 | 930 | 1.1120 | 50649456 |
| 0.105 | 0.5721 | 935 | 1.1126 | 50925600 |
| 0.0825 | 0.5752 | 940 | 1.1132 | 51197536 |
| 0.1326 | 0.5782 | 945 | 1.1124 | 51473456 |
| 0.0868 | 0.5813 | 950 | 1.1109 | 51745816 |
| 0.1149 | 0.5843 | 955 | 1.1138 | 52017544 |
| 0.1606 | 0.5874 | 960 | 1.1131 | 52291312 |
| 0.1145 | 0.5905 | 965 | 1.1119 | 52564056 |
| 0.0848 | 0.5935 | 970 | 1.1105 | 52842448 |
| 0.1031 | 0.5966 | 975 | 1.1112 | 53113168 |
| 0.1897 | 0.5996 | 980 | 1.1127 | 53389664 |
| 0.052 | 0.6027 | 985 | 1.1115 | 53665536 |
| 0.1245 | 0.6058 | 990 | 1.1102 | 53941184 |
| 0.1475 | 0.6088 | 995 | 1.1090 | 54208760 |
| 0.1118 | 0.6119 | 1000 | 1.1102 | 54483536 |
| 0.0723 | 0.6149 | 1005 | 1.1112 | 54759496 |
| 0.1014 | 0.6180 | 1010 | 1.1115 | 55022000 |
| 0.147 | 0.6211 | 1015 | 1.1103 | 55300336 |
| 0.1188 | 0.6241 | 1020 | 1.1092 | 55573112 |
| 0.1098 | 0.6272 | 1025 | 1.1090 | 55849696 |
| 0.0946 | 0.6302 | 1030 | 1.1116 | 56115808 |
| 0.1629 | 0.6333 | 1035 | 1.1105 | 56393488 |
| 0.1445 | 0.6364 | 1040 | 1.1097 | 56668760 |
| 0.147 | 0.6394 | 1045 | 1.1081 | 56940424 |
| 0.2138 | 0.6425 | 1050 | 1.1089 | 57207504 |
| 0.0522 | 0.6455 | 1055 | 1.1115 | 57476232 |
| 0.1012 | 0.6486 | 1060 | 1.1112 | 57747896 |
| 0.1184 | 0.6517 | 1065 | 1.1081 | 58023976 |
| 0.1483 | 0.6547 | 1070 | 1.1074 | 58296808 |
| 0.0588 | 0.6578 | 1075 | 1.1097 | 58571720 |
| 0.1079 | 0.6608 | 1080 | 1.1104 | 58842536 |
| 0.1279 | 0.6639 | 1085 | 1.1095 | 59118072 |
| 0.0599 | 0.6669 | 1090 | 1.1103 | 59393032 |
| 0.1112 | 0.6700 | 1095 | 1.1107 | 59665472 |
| 0.156 | 0.6731 | 1100 | 1.1090 | 59935336 |
| 0.0933 | 0.6761 | 1105 | 1.1068 | 60205600 |
| 0.1008 | 0.6792 | 1110 | 1.1084 | 60470144 |
| 0.1183 | 0.6822 | 1115 | 1.1113 | 60748184 |
| 0.1231 | 0.6853 | 1120 | 1.1081 | 61027336 |
| 0.2034 | 0.6884 | 1125 | 1.1074 | 61294416 |
| 0.1144 | 0.6914 | 1130 | 1.1085 | 61562648 |
| 0.0933 | 0.6945 | 1135 | 1.1082 | 61833312 |
| 0.1518 | 0.6975 | 1140 | 1.1067 | 62111584 |
| 0.1298 | 0.7006 | 1145 | 1.1075 | 62389808 |
| 0.0846 | 0.7037 | 1150 | 1.1068 | 62661992 |
| 0.0851 | 0.7067 | 1155 | 1.1066 | 62944208 |
| 0.1061 | 0.7098 | 1160 | 1.1082 | 63223688 |
| 0.178 | 0.7128 | 1165 | 1.1076 | 63499088 |
| 0.0979 | 0.7159 | 1170 | 1.1072 | 63769456 |
| 0.1446 | 0.7190 | 1175 | 1.1087 | 64036976 |
| 0.1286 | 0.7220 | 1180 | 1.1084 | 64308192 |
| 0.0925 | 0.7251 | 1185 | 1.1052 | 64579040 |
| 0.1432 | 0.7281 | 1190 | 1.1056 | 64856712 |
| 0.1416 | 0.7312 | 1195 | 1.1095 | 65132040 |
| 0.1412 | 0.7343 | 1200 | 1.1095 | 65402456 |
| 0.0876 | 0.7373 | 1205 | 1.1051 | 65678136 |
| 0.1592 | 0.7404 | 1210 | 1.1050 | 65955664 |
| 0.0731 | 0.7434 | 1215 | 1.1049 | 66226328 |
| 0.0975 | 0.7465 | 1220 | 1.1059 | 66501952 |
| 0.1141 | 0.7496 | 1225 | 1.1060 | 66774168 |
| 0.1432 | 0.7526 | 1230 | 1.1052 | 67040992 |
| 0.1219 | 0.7557 | 1235 | 1.1045 | 67319760 |
| 0.1056 | 0.7587 | 1240 | 1.1057 | 67581856 |
| 0.1276 | 0.7618 | 1245 | 1.1058 | 67854520 |
| 0.0811 | 0.7648 | 1250 | 1.1054 | 68128160 |
| 0.1243 | 0.7679 | 1255 | 1.1057 | 68402472 |
| 0.1134 | 0.7710 | 1260 | 1.1067 | 68675584 |
| 0.1946 | 0.7740 | 1265 | 1.1050 | 68942304 |
| 0.1222 | 0.7771 | 1270 | 1.1028 | 69217672 |
| 0.1139 | 0.7801 | 1275 | 1.1048 | 69476328 |
| 0.138 | 0.7832 | 1280 | 1.1060 | 69747400 |
| 0.0792 | 0.7863 | 1285 | 1.1045 | 70023216 |
| 0.1221 | 0.7893 | 1290 | 1.1025 | 70288136 |
| 0.1102 | 0.7924 | 1295 | 1.1021 | 70565600 |
| 0.222 | 0.7954 | 1300 | 1.1027 | 70830848 |
| 0.1042 | 0.7985 | 1305 | 1.1032 | 71104104 |
| 0.1141 | 0.8016 | 1310 | 1.1049 | 71385656 |
| 0.1217 | 0.8046 | 1315 | 1.1046 | 71662688 |
| 0.0718 | 0.8077 | 1320 | 1.1036 | 71934808 |
| 0.0963 | 0.8107 | 1325 | 1.1020 | 72202664 |
| 0.1232 | 0.8138 | 1330 | 1.1007 | 72472624 |
| 0.1192 | 0.8169 | 1335 | 1.1015 | 72739272 |
| 0.0919 | 0.8199 | 1340 | 1.1030 | 73014632 |
| 0.1162 | 0.8230 | 1345 | 1.1045 | 73283608 |
| 0.1404 | 0.8260 | 1350 | 1.1033 | 73555416 |
| 0.1128 | 0.8291 | 1355 | 1.1026 | 73824680 |
| 0.0925 | 0.8322 | 1360 | 1.1024 | 74093400 |
| 0.0875 | 0.8352 | 1365 | 1.1024 | 74365304 |
| 0.1258 | 0.8383 | 1370 | 1.1020 | 74636600 |
| 0.1905 | 0.8413 | 1375 | 1.1019 | 74900840 |
| 0.1921 | 0.8444 | 1380 | 1.1034 | 75169248 |
| 0.1621 | 0.8475 | 1385 | 1.1042 | 75445304 |
| 0.1512 | 0.8505 | 1390 | 1.1006 | 75718984 |
| 0.1222 | 0.8536 | 1395 | 1.0999 | 75984696 |
| 0.1156 | 0.8566 | 1400 | 1.1030 | 76256552 |
| 0.0796 | 0.8597 | 1405 | 1.1041 | 76529888 |
| 0.119 | 0.8627 | 1410 | 1.1015 | 76804312 |
| 0.1922 | 0.8658 | 1415 | 1.1002 | 77080216 |
| 0.1028 | 0.8689 | 1420 | 1.1013 | 77349808 |
| 0.1667 | 0.8719 | 1425 | 1.1012 | 77623800 |
| 0.1058 | 0.8750 | 1430 | 1.1005 | 77898520 |
| 0.0913 | 0.8780 | 1435 | 1.1016 | 78170936 |
| 0.1501 | 0.8811 | 1440 | 1.1032 | 78443040 |
| 0.0984 | 0.8842 | 1445 | 1.1032 | 78713960 |
| 0.1074 | 0.8872 | 1450 | 1.1021 | 78987160 |
| 0.1245 | 0.8903 | 1455 | 1.1007 | 79255992 |
| 0.1547 | 0.8933 | 1460 | 1.1015 | 79533904 |
| 0.1184 | 0.8964 | 1465 | 1.1012 | 79805032 |
| 0.1447 | 0.8995 | 1470 | 1.1001 | 80078384 |
| 0.111 | 0.9025 | 1475 | 1.0990 | 80347360 |
| 0.1149 | 0.9056 | 1480 | 1.1006 | 80618256 |
| 0.0942 | 0.9086 | 1485 | 1.1021 | 80888904 |
| 0.1132 | 0.9117 | 1490 | 1.1022 | 81164544 |
| 0.1052 | 0.9148 | 1495 | 1.1017 | 81443096 |
| 0.1775 | 0.9178 | 1500 | 1.1002 | 81708040 |
| 0.1521 | 0.9209 | 1505 | 1.0995 | 81992080 |
| 0.1106 | 0.9239 | 1510 | 1.1019 | 82272216 |
| 0.1474 | 0.9270 | 1515 | 1.1026 | 82546768 |
| 0.1077 | 0.9301 | 1520 | 1.1004 | 82815600 |
| 0.1148 | 0.9331 | 1525 | 1.0992 | 83086872 |
| 0.1579 | 0.9362 | 1530 | 1.0997 | 83355880 |
| 0.1228 | 0.9392 | 1535 | 1.1006 | 83628592 |
| 0.1274 | 0.9423 | 1540 | 1.1010 | 83896384 |
| 0.1058 | 0.9454 | 1545 | 1.0999 | 84171928 |
| 0.1579 | 0.9484 | 1550 | 1.1001 | 84445440 |
| 0.1341 | 0.9515 | 1555 | 1.0991 | 84715192 |
| 0.0876 | 0.9545 | 1560 | 1.0999 | 84989200 |
| 0.188 | 0.9576 | 1565 | 1.1016 | 85262800 |
| 0.1071 | 0.9606 | 1570 | 1.1011 | 85539672 |
| 0.1017 | 0.9637 | 1575 | 1.0990 | 85811640 |
| 0.1339 | 0.9668 | 1580 | 1.0985 | 86080896 |
| 0.1486 | 0.9698 | 1585 | 1.1002 | 86342768 |
| 0.0907 | 0.9729 | 1590 | 1.1008 | 86612632 |
| 0.1727 | 0.9759 | 1595 | 1.1008 | 86885728 |
| 0.1585 | 0.9790 | 1600 | 1.0996 | 87160880 |
| 0.1153 | 0.9821 | 1605 | 1.0994 | 87438576 |
| 0.0774 | 0.9851 | 1610 | 1.0990 | 87702840 |
| 0.0688 | 0.9882 | 1615 | 1.0974 | 87982136 |
| 0.1372 | 0.9912 | 1620 | 1.0987 | 88255968 |
| 0.1527 | 0.9943 | 1625 | 1.0992 | 88532024 |
| 0.1316 | 0.9974 | 1630 | 1.1000 | 88802400 |

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
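
The batch-size hyperparameters and the last row of the training results table above can be cross-checked with a little arithmetic. The sketch below is not part of the training code. It assumes that the total batch size is the per-device batch size times the gradient accumulation steps times the number of devices, and that "Input Tokens Seen" is a cumulative count over all optimizer steps. The variable names are illustrative.

```python
# Values copied from the hyperparameter list and the last table row of this card.
train_batch_size = 8                 # per-device batch size
gradient_accumulation_steps = 16
total_train_batch_size = 128
last_logged_step = 1630
tokens_at_last_logged_step = 88_802_400

# Assumption: total batch = per-device batch * accumulation steps * device count.
implied_devices = total_train_batch_size // (
    train_batch_size * gradient_accumulation_steps
)

# Assumption: the token counter is cumulative, so dividing by the step count
# gives the average number of input tokens consumed per optimizer step.
tokens_per_step = tokens_at_last_logged_step / last_logged_step

# Average input tokens per training sequence, as counted by the trainer.
tokens_per_sequence = tokens_per_step / total_train_batch_size

print(f"implied devices:      {implied_devices}")
print(f"tokens per step:      {tokens_per_step:.0f}")
print(f"tokens per sequence:  {tokens_per_sequence:.1f}")
```

Under these assumptions the listed values are consistent with training on a single device. The per-sequence figure is only an average, and whether it includes padding tokens depends on how the trainer counted them.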