|
2023-10-11 19:13:51,177 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,179 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-11 19:13:51,179 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,180 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-11 19:13:51,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,180 Train: 7142 sentences |
|
2023-10-11 19:13:51,180 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 19:13:51,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,180 Training Params: |
|
2023-10-11 19:13:51,180 - learning_rate: "0.00016" |
|
2023-10-11 19:13:51,180 - mini_batch_size: "4" |
|
2023-10-11 19:13:51,180 - max_epochs: "10" |
|
2023-10-11 19:13:51,180 - shuffle: "True" |
|
2023-10-11 19:13:51,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,180 Plugins: |
|
2023-10-11 19:13:51,180 - TensorboardLogger |
|
2023-10-11 19:13:51,180 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 19:13:51,181 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,181 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 19:13:51,181 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 19:13:51,181 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,181 Computation: |
|
2023-10-11 19:13:51,181 - compute on device: cuda:0 |
|
2023-10-11 19:13:51,181 - embedding storage: none |
|
2023-10-11 19:13:51,181 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,181 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-11 19:13:51,181 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,181 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:13:51,181 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-11 19:14:45,792 epoch 1 - iter 178/1786 - loss 2.81726663 - time (sec): 54.61 - samples/sec: 427.42 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-11 19:15:39,272 epoch 1 - iter 356/1786 - loss 2.63354128 - time (sec): 108.09 - samples/sec: 443.28 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 19:16:33,216 epoch 1 - iter 534/1786 - loss 2.35130578 - time (sec): 162.03 - samples/sec: 444.80 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 19:17:26,272 epoch 1 - iter 712/1786 - loss 2.04487747 - time (sec): 215.09 - samples/sec: 450.33 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-11 19:18:17,461 epoch 1 - iter 890/1786 - loss 1.77647319 - time (sec): 266.28 - samples/sec: 451.84 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 19:19:13,643 epoch 1 - iter 1068/1786 - loss 1.55935737 - time (sec): 322.46 - samples/sec: 453.84 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-11 19:20:09,974 epoch 1 - iter 1246/1786 - loss 1.37945532 - time (sec): 378.79 - samples/sec: 456.82 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 19:21:04,833 epoch 1 - iter 1424/1786 - loss 1.25092114 - time (sec): 433.65 - samples/sec: 456.93 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-11 19:21:59,519 epoch 1 - iter 1602/1786 - loss 1.14992315 - time (sec): 488.34 - samples/sec: 455.35 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-11 19:22:54,887 epoch 1 - iter 1780/1786 - loss 1.05807649 - time (sec): 543.70 - samples/sec: 456.19 - lr: 0.000159 - momentum: 0.000000 |
|
2023-10-11 19:22:56,573 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:22:56,574 EPOCH 1 done: loss 1.0556 - lr: 0.000159 |
|
2023-10-11 19:23:16,475 DEV : loss 0.1910761147737503 - f1-score (micro avg) 0.548 |
|
2023-10-11 19:23:16,504 saving best model |
|
2023-10-11 19:23:17,476 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:24:13,401 epoch 2 - iter 178/1786 - loss 0.19843727 - time (sec): 55.92 - samples/sec: 460.08 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-11 19:25:09,123 epoch 2 - iter 356/1786 - loss 0.19088934 - time (sec): 111.64 - samples/sec: 455.94 - lr: 0.000156 - momentum: 0.000000 |
|
2023-10-11 19:26:07,078 epoch 2 - iter 534/1786 - loss 0.17857709 - time (sec): 169.60 - samples/sec: 457.11 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-11 19:27:01,067 epoch 2 - iter 712/1786 - loss 0.17027634 - time (sec): 223.59 - samples/sec: 450.87 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-11 19:27:55,997 epoch 2 - iter 890/1786 - loss 0.16174542 - time (sec): 278.52 - samples/sec: 450.27 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-11 19:28:49,544 epoch 2 - iter 1068/1786 - loss 0.15308871 - time (sec): 332.07 - samples/sec: 451.00 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-11 19:29:43,254 epoch 2 - iter 1246/1786 - loss 0.14946536 - time (sec): 385.78 - samples/sec: 450.66 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 19:30:37,656 epoch 2 - iter 1424/1786 - loss 0.14427293 - time (sec): 440.18 - samples/sec: 450.49 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-11 19:31:31,781 epoch 2 - iter 1602/1786 - loss 0.13998247 - time (sec): 494.30 - samples/sec: 448.65 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-11 19:32:26,177 epoch 2 - iter 1780/1786 - loss 0.13658975 - time (sec): 548.70 - samples/sec: 451.31 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-11 19:32:28,111 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:32:28,112 EPOCH 2 done: loss 0.1363 - lr: 0.000142 |
|
2023-10-11 19:32:49,373 DEV : loss 0.11301875859498978 - f1-score (micro avg) 0.7643 |
|
2023-10-11 19:32:49,404 saving best model |
|
2023-10-11 19:32:54,507 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:33:49,062 epoch 3 - iter 178/1786 - loss 0.07443452 - time (sec): 54.55 - samples/sec: 454.66 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-11 19:34:45,096 epoch 3 - iter 356/1786 - loss 0.07792617 - time (sec): 110.58 - samples/sec: 451.54 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-11 19:35:39,332 epoch 3 - iter 534/1786 - loss 0.07368656 - time (sec): 164.82 - samples/sec: 449.84 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 19:36:35,943 epoch 3 - iter 712/1786 - loss 0.07163510 - time (sec): 221.43 - samples/sec: 443.25 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-11 19:37:33,213 epoch 3 - iter 890/1786 - loss 0.07448758 - time (sec): 278.70 - samples/sec: 440.47 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-11 19:38:28,862 epoch 3 - iter 1068/1786 - loss 0.07522810 - time (sec): 334.35 - samples/sec: 441.07 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 19:39:24,154 epoch 3 - iter 1246/1786 - loss 0.07393748 - time (sec): 389.64 - samples/sec: 441.29 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-11 19:40:21,470 epoch 3 - iter 1424/1786 - loss 0.07497568 - time (sec): 446.96 - samples/sec: 441.14 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-11 19:41:17,975 epoch 3 - iter 1602/1786 - loss 0.07401065 - time (sec): 503.46 - samples/sec: 442.08 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-11 19:42:14,212 epoch 3 - iter 1780/1786 - loss 0.07443437 - time (sec): 559.70 - samples/sec: 442.67 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-11 19:42:16,107 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:42:16,108 EPOCH 3 done: loss 0.0744 - lr: 0.000125 |
|
2023-10-11 19:42:37,828 DEV : loss 0.13155920803546906 - f1-score (micro avg) 0.7789 |
|
2023-10-11 19:42:37,858 saving best model |
|
2023-10-11 19:42:48,486 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:43:42,662 epoch 4 - iter 178/1786 - loss 0.05682850 - time (sec): 54.17 - samples/sec: 455.12 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 19:44:36,229 epoch 4 - iter 356/1786 - loss 0.05085291 - time (sec): 107.74 - samples/sec: 461.16 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-11 19:45:31,139 epoch 4 - iter 534/1786 - loss 0.05380353 - time (sec): 162.65 - samples/sec: 467.00 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 19:46:27,225 epoch 4 - iter 712/1786 - loss 0.05409746 - time (sec): 218.73 - samples/sec: 460.15 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-11 19:47:21,874 epoch 4 - iter 890/1786 - loss 0.05254482 - time (sec): 273.38 - samples/sec: 456.64 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-11 19:48:17,463 epoch 4 - iter 1068/1786 - loss 0.05090637 - time (sec): 328.97 - samples/sec: 455.59 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-11 19:49:12,569 epoch 4 - iter 1246/1786 - loss 0.05176189 - time (sec): 384.08 - samples/sec: 459.11 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 19:50:05,052 epoch 4 - iter 1424/1786 - loss 0.05266426 - time (sec): 436.56 - samples/sec: 457.32 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-11 19:50:59,830 epoch 4 - iter 1602/1786 - loss 0.05268196 - time (sec): 491.34 - samples/sec: 455.32 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-11 19:51:54,157 epoch 4 - iter 1780/1786 - loss 0.05210017 - time (sec): 545.67 - samples/sec: 454.53 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-11 19:51:55,885 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:51:55,885 EPOCH 4 done: loss 0.0520 - lr: 0.000107 |
|
2023-10-11 19:52:18,869 DEV : loss 0.14153534173965454 - f1-score (micro avg) 0.7719 |
|
2023-10-11 19:52:18,903 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:53:12,247 epoch 5 - iter 178/1786 - loss 0.03543846 - time (sec): 53.34 - samples/sec: 445.73 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 19:54:04,362 epoch 5 - iter 356/1786 - loss 0.03568755 - time (sec): 105.46 - samples/sec: 444.09 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-11 19:54:59,829 epoch 5 - iter 534/1786 - loss 0.03544231 - time (sec): 160.92 - samples/sec: 456.17 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-11 19:55:58,462 epoch 5 - iter 712/1786 - loss 0.03608462 - time (sec): 219.56 - samples/sec: 450.03 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 19:56:56,006 epoch 5 - iter 890/1786 - loss 0.03631887 - time (sec): 277.10 - samples/sec: 443.92 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-11 19:57:51,595 epoch 5 - iter 1068/1786 - loss 0.03584506 - time (sec): 332.69 - samples/sec: 440.11 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-11 19:58:52,566 epoch 5 - iter 1246/1786 - loss 0.03555371 - time (sec): 393.66 - samples/sec: 438.69 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-11 19:59:47,070 epoch 5 - iter 1424/1786 - loss 0.03537355 - time (sec): 448.16 - samples/sec: 438.91 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 20:00:42,523 epoch 5 - iter 1602/1786 - loss 0.03662827 - time (sec): 503.62 - samples/sec: 441.12 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-11 20:01:38,743 epoch 5 - iter 1780/1786 - loss 0.03694673 - time (sec): 559.84 - samples/sec: 442.55 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-11 20:01:40,649 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:01:40,650 EPOCH 5 done: loss 0.0370 - lr: 0.000089 |
|
2023-10-11 20:02:02,932 DEV : loss 0.1543937772512436 - f1-score (micro avg) 0.8069 |
|
2023-10-11 20:02:02,963 saving best model |
|
2023-10-11 20:02:34,808 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:03:31,794 epoch 6 - iter 178/1786 - loss 0.03407214 - time (sec): 56.98 - samples/sec: 440.90 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-11 20:04:27,201 epoch 6 - iter 356/1786 - loss 0.02838239 - time (sec): 112.39 - samples/sec: 441.41 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-11 20:05:22,131 epoch 6 - iter 534/1786 - loss 0.02705824 - time (sec): 167.32 - samples/sec: 440.23 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 20:06:17,967 epoch 6 - iter 712/1786 - loss 0.02835522 - time (sec): 223.15 - samples/sec: 442.11 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-11 20:07:13,247 epoch 6 - iter 890/1786 - loss 0.02711076 - time (sec): 278.43 - samples/sec: 441.74 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 20:08:09,742 epoch 6 - iter 1068/1786 - loss 0.02894506 - time (sec): 334.93 - samples/sec: 443.02 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-11 20:09:04,988 epoch 6 - iter 1246/1786 - loss 0.02842164 - time (sec): 390.18 - samples/sec: 444.19 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-11 20:10:01,812 epoch 6 - iter 1424/1786 - loss 0.02862201 - time (sec): 447.00 - samples/sec: 443.26 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-11 20:10:58,162 epoch 6 - iter 1602/1786 - loss 0.02770039 - time (sec): 503.35 - samples/sec: 442.70 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-11 20:11:52,537 epoch 6 - iter 1780/1786 - loss 0.02749648 - time (sec): 557.72 - samples/sec: 444.60 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-11 20:11:54,275 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:11:54,275 EPOCH 6 done: loss 0.0276 - lr: 0.000071 |
|
2023-10-11 20:12:15,480 DEV : loss 0.17768503725528717 - f1-score (micro avg) 0.8032 |
|
2023-10-11 20:12:15,511 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:13:10,371 epoch 7 - iter 178/1786 - loss 0.02194253 - time (sec): 54.86 - samples/sec: 440.84 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-11 20:14:05,705 epoch 7 - iter 356/1786 - loss 0.02239136 - time (sec): 110.19 - samples/sec: 442.59 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-11 20:15:03,230 epoch 7 - iter 534/1786 - loss 0.02455740 - time (sec): 167.72 - samples/sec: 434.51 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-11 20:15:58,719 epoch 7 - iter 712/1786 - loss 0.02161452 - time (sec): 223.21 - samples/sec: 441.61 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-11 20:16:55,364 epoch 7 - iter 890/1786 - loss 0.02185857 - time (sec): 279.85 - samples/sec: 442.72 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-11 20:17:53,035 epoch 7 - iter 1068/1786 - loss 0.02033916 - time (sec): 337.52 - samples/sec: 440.31 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-11 20:18:46,607 epoch 7 - iter 1246/1786 - loss 0.02021475 - time (sec): 391.09 - samples/sec: 442.63 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-11 20:19:41,538 epoch 7 - iter 1424/1786 - loss 0.02040982 - time (sec): 446.02 - samples/sec: 445.41 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-11 20:20:36,433 epoch 7 - iter 1602/1786 - loss 0.02004331 - time (sec): 500.92 - samples/sec: 446.50 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 20:21:30,020 epoch 7 - iter 1780/1786 - loss 0.02009444 - time (sec): 554.51 - samples/sec: 446.57 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-11 20:21:31,879 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:21:31,879 EPOCH 7 done: loss 0.0200 - lr: 0.000053 |
|
2023-10-11 20:21:52,842 DEV : loss 0.19214080274105072 - f1-score (micro avg) 0.8 |
|
2023-10-11 20:21:52,871 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:22:46,267 epoch 8 - iter 178/1786 - loss 0.01515433 - time (sec): 53.39 - samples/sec: 462.82 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 20:23:40,414 epoch 8 - iter 356/1786 - loss 0.01607011 - time (sec): 107.54 - samples/sec: 469.13 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-11 20:24:34,115 epoch 8 - iter 534/1786 - loss 0.01419008 - time (sec): 161.24 - samples/sec: 467.30 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 20:25:28,581 epoch 8 - iter 712/1786 - loss 0.01574404 - time (sec): 215.71 - samples/sec: 465.31 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-11 20:26:22,156 epoch 8 - iter 890/1786 - loss 0.01486595 - time (sec): 269.28 - samples/sec: 461.55 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 20:27:16,509 epoch 8 - iter 1068/1786 - loss 0.01436697 - time (sec): 323.64 - samples/sec: 459.27 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-11 20:28:11,431 epoch 8 - iter 1246/1786 - loss 0.01439559 - time (sec): 378.56 - samples/sec: 459.92 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-11 20:29:04,809 epoch 8 - iter 1424/1786 - loss 0.01444281 - time (sec): 431.94 - samples/sec: 455.36 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-11 20:30:00,610 epoch 8 - iter 1602/1786 - loss 0.01433488 - time (sec): 487.74 - samples/sec: 456.40 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-11 20:30:56,858 epoch 8 - iter 1780/1786 - loss 0.01428435 - time (sec): 543.99 - samples/sec: 456.02 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-11 20:30:58,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:30:58,579 EPOCH 8 done: loss 0.0143 - lr: 0.000036 |
|
2023-10-11 20:31:20,849 DEV : loss 0.20372258126735687 - f1-score (micro avg) 0.8021 |
|
2023-10-11 20:31:20,881 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:32:16,935 epoch 9 - iter 178/1786 - loss 0.00793109 - time (sec): 56.05 - samples/sec: 422.02 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-11 20:33:14,012 epoch 9 - iter 356/1786 - loss 0.01077629 - time (sec): 113.13 - samples/sec: 436.99 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 20:34:11,690 epoch 9 - iter 534/1786 - loss 0.01120412 - time (sec): 170.81 - samples/sec: 443.56 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-11 20:35:08,258 epoch 9 - iter 712/1786 - loss 0.00993767 - time (sec): 227.38 - samples/sec: 442.48 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-11 20:36:02,018 epoch 9 - iter 890/1786 - loss 0.00981936 - time (sec): 281.14 - samples/sec: 443.78 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-11 20:36:57,308 epoch 9 - iter 1068/1786 - loss 0.00964522 - time (sec): 336.42 - samples/sec: 445.06 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-11 20:37:51,844 epoch 9 - iter 1246/1786 - loss 0.00967314 - time (sec): 390.96 - samples/sec: 444.94 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-11 20:38:46,124 epoch 9 - iter 1424/1786 - loss 0.00944093 - time (sec): 445.24 - samples/sec: 445.21 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-11 20:39:40,185 epoch 9 - iter 1602/1786 - loss 0.00949793 - time (sec): 499.30 - samples/sec: 445.57 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-11 20:40:35,488 epoch 9 - iter 1780/1786 - loss 0.00978050 - time (sec): 554.61 - samples/sec: 446.96 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-11 20:40:37,245 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:40:37,245 EPOCH 9 done: loss 0.0098 - lr: 0.000018 |
|
2023-10-11 20:40:59,367 DEV : loss 0.21249088644981384 - f1-score (micro avg) 0.8021 |
|
2023-10-11 20:40:59,397 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:41:53,260 epoch 10 - iter 178/1786 - loss 0.00766748 - time (sec): 53.86 - samples/sec: 467.31 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-11 20:42:47,621 epoch 10 - iter 356/1786 - loss 0.00666879 - time (sec): 108.22 - samples/sec: 473.72 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 20:43:40,341 epoch 10 - iter 534/1786 - loss 0.00669677 - time (sec): 160.94 - samples/sec: 463.30 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-11 20:44:33,751 epoch 10 - iter 712/1786 - loss 0.00714241 - time (sec): 214.35 - samples/sec: 463.12 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-11 20:45:27,178 epoch 10 - iter 890/1786 - loss 0.00681352 - time (sec): 267.78 - samples/sec: 457.07 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-11 20:46:22,447 epoch 10 - iter 1068/1786 - loss 0.00770174 - time (sec): 323.05 - samples/sec: 461.61 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 20:47:15,081 epoch 10 - iter 1246/1786 - loss 0.00775296 - time (sec): 375.68 - samples/sec: 460.13 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 20:48:10,290 epoch 10 - iter 1424/1786 - loss 0.00785199 - time (sec): 430.89 - samples/sec: 459.46 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-11 20:49:06,235 epoch 10 - iter 1602/1786 - loss 0.00742856 - time (sec): 486.84 - samples/sec: 458.91 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 20:50:01,353 epoch 10 - iter 1780/1786 - loss 0.00753833 - time (sec): 541.95 - samples/sec: 457.97 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-11 20:50:02,958 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:50:02,959 EPOCH 10 done: loss 0.0075 - lr: 0.000000 |
|
2023-10-11 20:50:25,995 DEV : loss 0.21911536157131195 - f1-score (micro avg) 0.8046 |
|
2023-10-11 20:50:27,138 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 20:50:27,141 Loading model from best epoch ... |
|
2023-10-11 20:50:31,856 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-11 20:51:43,012 |
|
Results: |
|
- F-score (micro) 0.6934 |
|
- F-score (macro) 0.6106 |
|
- Accuracy 0.5471 |
|
|
|
By class: |
|
precision recall f1-score support |
|
|
|
LOC 0.7176 0.7242 0.7209 1095 |
|
PER 0.7850 0.7431 0.7635 1012 |
|
ORG 0.4150 0.5742 0.4818 357 |
|
HumanProd 0.3922 0.6061 0.4762 33 |
|
|
|
micro avg 0.6787 0.7089 0.6934 2497 |
|
macro avg 0.5774 0.6619 0.6106 2497 |
|
weighted avg 0.6974 0.7089 0.7007 2497 |
|
|
|
2023-10-11 20:51:43,012 ---------------------------------------------------------------------------------------------------- |
|
|