|
2023-10-18 23:11:51,022 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,023 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 23:11:51,023 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,023 MultiCorpus: 5777 train + 722 dev + 723 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl |
|
2023-10-18 23:11:51,023 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,023 Train: 5777 sentences |
|
2023-10-18 23:11:51,023 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 23:11:51,023 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,023 Training Params: |
|
2023-10-18 23:11:51,023 - learning_rate: "3e-05" |
|
2023-10-18 23:11:51,023 - mini_batch_size: "8" |
|
2023-10-18 23:11:51,023 - max_epochs: "10" |
|
2023-10-18 23:11:51,023 - shuffle: "True" |
|
2023-10-18 23:11:51,023 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,023 Plugins: |
|
2023-10-18 23:11:51,023 - TensorboardLogger |
|
2023-10-18 23:11:51,023 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 23:11:51,023 ---------------------------------------------------------------------------------------------------- |
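Note on the lr column: with learning_rate 3e-05, warmup_fraction 0.1, and 723 iterations per epoch over 10 epochs, the logged learning rates are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The snippet below is a minimal sketch of such a schedule, not the actual LinearScheduler implementation. The helper name `linear_warmup_lr` and the assumption that exactly 7230 optimizer steps are taken are illustrative.

```python
def linear_warmup_lr(step, total_steps=7230, peak_lr=3e-5, warmup_fraction=0.1):
    """Learning rate after `step` updates (hypothetical helper, assumed schedule)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    iters_per_epoch = 723  # taken from the "iter N/723" entries in this log
    for epoch, it in [(1, 72), (1, 720), (2, 144), (10, 720)]:
        step = (epoch - 1) * iters_per_epoch + it
        print(f"epoch {epoch} iter {it}: lr {linear_warmup_lr(step):.6f}")
```

Rounded to six decimals, the values for these four iterations agree with the lr column logged for the same iterations in this run.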
|
2023-10-18 23:11:51,023 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 23:11:51,024 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 23:11:51,024 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,024 Computation: |
|
2023-10-18 23:11:51,024 - compute on device: cuda:0 |
|
2023-10-18 23:11:51,024 - embedding storage: none |
|
2023-10-18 23:11:51,024 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,024 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-18 23:11:51,024 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,024 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:11:51,024 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 23:11:52,810 epoch 1 - iter 72/723 - loss 2.44216644 - time (sec): 1.79 - samples/sec: 9891.33 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 23:11:54,513 epoch 1 - iter 144/723 - loss 2.30997642 - time (sec): 3.49 - samples/sec: 9666.55 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 23:11:56,360 epoch 1 - iter 216/723 - loss 2.05393390 - time (sec): 5.34 - samples/sec: 9829.87 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 23:11:58,201 epoch 1 - iter 288/723 - loss 1.77045723 - time (sec): 7.18 - samples/sec: 9840.23 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 23:12:00,137 epoch 1 - iter 360/723 - loss 1.52008804 - time (sec): 9.11 - samples/sec: 9790.22 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 23:12:01,901 epoch 1 - iter 432/723 - loss 1.35013626 - time (sec): 10.88 - samples/sec: 9666.70 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 23:12:03,716 epoch 1 - iter 504/723 - loss 1.20226231 - time (sec): 12.69 - samples/sec: 9703.98 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 23:12:05,525 epoch 1 - iter 576/723 - loss 1.08283856 - time (sec): 14.50 - samples/sec: 9753.81 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 23:12:07,307 epoch 1 - iter 648/723 - loss 0.99626271 - time (sec): 16.28 - samples/sec: 9763.13 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 23:12:09,067 epoch 1 - iter 720/723 - loss 0.93060332 - time (sec): 18.04 - samples/sec: 9729.50 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 23:12:09,137 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:12:09,137 EPOCH 1 done: loss 0.9281 - lr: 0.000030 |
|
2023-10-18 23:12:10,425 DEV : loss 0.3564930558204651 - f1-score (micro avg) 0.0 |
|
2023-10-18 23:12:10,439 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:12:12,188 epoch 2 - iter 72/723 - loss 0.30663308 - time (sec): 1.75 - samples/sec: 9893.05 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 23:12:13,958 epoch 2 - iter 144/723 - loss 0.28240631 - time (sec): 3.52 - samples/sec: 10127.51 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 23:12:15,633 epoch 2 - iter 216/723 - loss 0.27645540 - time (sec): 5.19 - samples/sec: 9965.04 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 23:12:17,358 epoch 2 - iter 288/723 - loss 0.25965837 - time (sec): 6.92 - samples/sec: 10137.11 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 23:12:19,033 epoch 2 - iter 360/723 - loss 0.25438816 - time (sec): 8.59 - samples/sec: 10112.60 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 23:12:20,694 epoch 2 - iter 432/723 - loss 0.24919098 - time (sec): 10.26 - samples/sec: 10093.78 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 23:12:22,452 epoch 2 - iter 504/723 - loss 0.24538984 - time (sec): 12.01 - samples/sec: 10092.18 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 23:12:24,250 epoch 2 - iter 576/723 - loss 0.24291302 - time (sec): 13.81 - samples/sec: 10182.61 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 23:12:26,001 epoch 2 - iter 648/723 - loss 0.23897319 - time (sec): 15.56 - samples/sec: 10175.96 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 23:12:27,797 epoch 2 - iter 720/723 - loss 0.23500077 - time (sec): 17.36 - samples/sec: 10117.24 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 23:12:27,861 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:12:27,861 EPOCH 2 done: loss 0.2347 - lr: 0.000027 |
|
2023-10-18 23:12:29,600 DEV : loss 0.25880512595176697 - f1-score (micro avg) 0.0802 |
|
2023-10-18 23:12:29,616 saving best model |
|
2023-10-18 23:12:29,648 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:12:31,747 epoch 3 - iter 72/723 - loss 0.19921052 - time (sec): 2.10 - samples/sec: 8841.74 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 23:12:33,512 epoch 3 - iter 144/723 - loss 0.20066322 - time (sec): 3.86 - samples/sec: 9294.18 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 23:12:35,351 epoch 3 - iter 216/723 - loss 0.19999346 - time (sec): 5.70 - samples/sec: 9508.30 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 23:12:37,134 epoch 3 - iter 288/723 - loss 0.19507221 - time (sec): 7.49 - samples/sec: 9594.94 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 23:12:38,901 epoch 3 - iter 360/723 - loss 0.19575824 - time (sec): 9.25 - samples/sec: 9552.28 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 23:12:40,666 epoch 3 - iter 432/723 - loss 0.19871723 - time (sec): 11.02 - samples/sec: 9595.97 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 23:12:42,391 epoch 3 - iter 504/723 - loss 0.19742757 - time (sec): 12.74 - samples/sec: 9601.55 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 23:12:44,209 epoch 3 - iter 576/723 - loss 0.19808476 - time (sec): 14.56 - samples/sec: 9670.87 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 23:12:45,925 epoch 3 - iter 648/723 - loss 0.19493970 - time (sec): 16.28 - samples/sec: 9717.32 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 23:12:47,666 epoch 3 - iter 720/723 - loss 0.19510831 - time (sec): 18.02 - samples/sec: 9751.67 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 23:12:47,734 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:12:47,734 EPOCH 3 done: loss 0.1950 - lr: 0.000023 |
|
2023-10-18 23:12:49,485 DEV : loss 0.21803173422813416 - f1-score (micro avg) 0.3538 |
|
2023-10-18 23:12:49,499 saving best model |
|
2023-10-18 23:12:49,535 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:12:51,301 epoch 4 - iter 72/723 - loss 0.22975421 - time (sec): 1.77 - samples/sec: 9675.63 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 23:12:53,064 epoch 4 - iter 144/723 - loss 0.18735987 - time (sec): 3.53 - samples/sec: 9833.21 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 23:12:54,777 epoch 4 - iter 216/723 - loss 0.19159436 - time (sec): 5.24 - samples/sec: 9779.61 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 23:12:56,585 epoch 4 - iter 288/723 - loss 0.18563505 - time (sec): 7.05 - samples/sec: 9911.36 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 23:12:58,399 epoch 4 - iter 360/723 - loss 0.18791900 - time (sec): 8.86 - samples/sec: 10005.62 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 23:13:00,168 epoch 4 - iter 432/723 - loss 0.18324138 - time (sec): 10.63 - samples/sec: 10088.63 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 23:13:01,926 epoch 4 - iter 504/723 - loss 0.18133811 - time (sec): 12.39 - samples/sec: 9995.66 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 23:13:03,688 epoch 4 - iter 576/723 - loss 0.18072363 - time (sec): 14.15 - samples/sec: 9985.13 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 23:13:05,512 epoch 4 - iter 648/723 - loss 0.18091767 - time (sec): 15.98 - samples/sec: 9975.68 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 23:13:07,229 epoch 4 - iter 720/723 - loss 0.17928566 - time (sec): 17.69 - samples/sec: 9935.66 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 23:13:07,292 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:13:07,292 EPOCH 4 done: loss 0.1792 - lr: 0.000020 |
|
2023-10-18 23:13:09,386 DEV : loss 0.20422407984733582 - f1-score (micro avg) 0.4014 |
|
2023-10-18 23:13:09,401 saving best model |
|
2023-10-18 23:13:09,437 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:13:11,075 epoch 5 - iter 72/723 - loss 0.19248413 - time (sec): 1.64 - samples/sec: 10269.81 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 23:13:12,842 epoch 5 - iter 144/723 - loss 0.18350762 - time (sec): 3.41 - samples/sec: 9895.90 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 23:13:14,578 epoch 5 - iter 216/723 - loss 0.18053100 - time (sec): 5.14 - samples/sec: 9758.42 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 23:13:16,356 epoch 5 - iter 288/723 - loss 0.17488340 - time (sec): 6.92 - samples/sec: 9775.57 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 23:13:18,203 epoch 5 - iter 360/723 - loss 0.17643351 - time (sec): 8.77 - samples/sec: 9912.19 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 23:13:19,986 epoch 5 - iter 432/723 - loss 0.17288927 - time (sec): 10.55 - samples/sec: 9961.54 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 23:13:21,746 epoch 5 - iter 504/723 - loss 0.16875420 - time (sec): 12.31 - samples/sec: 9934.45 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 23:13:23,522 epoch 5 - iter 576/723 - loss 0.16965003 - time (sec): 14.08 - samples/sec: 9897.96 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 23:13:25,296 epoch 5 - iter 648/723 - loss 0.17161084 - time (sec): 15.86 - samples/sec: 9880.14 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 23:13:27,162 epoch 5 - iter 720/723 - loss 0.16754034 - time (sec): 17.73 - samples/sec: 9912.87 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 23:13:27,227 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:13:27,227 EPOCH 5 done: loss 0.1679 - lr: 0.000017 |
|
2023-10-18 23:13:29,008 DEV : loss 0.20114754140377045 - f1-score (micro avg) 0.4134 |
|
2023-10-18 23:13:29,022 saving best model |
|
2023-10-18 23:13:29,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:13:30,756 epoch 6 - iter 72/723 - loss 0.14807821 - time (sec): 1.70 - samples/sec: 10519.59 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 23:13:32,245 epoch 6 - iter 144/723 - loss 0.15415731 - time (sec): 3.19 - samples/sec: 11334.87 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 23:13:33,795 epoch 6 - iter 216/723 - loss 0.15522735 - time (sec): 4.74 - samples/sec: 11531.86 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 23:13:35,377 epoch 6 - iter 288/723 - loss 0.15473174 - time (sec): 6.32 - samples/sec: 11383.57 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 23:13:37,089 epoch 6 - iter 360/723 - loss 0.15550787 - time (sec): 8.03 - samples/sec: 11119.06 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 23:13:38,921 epoch 6 - iter 432/723 - loss 0.15664718 - time (sec): 9.86 - samples/sec: 10919.44 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 23:13:40,663 epoch 6 - iter 504/723 - loss 0.16128146 - time (sec): 11.60 - samples/sec: 10784.98 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 23:13:42,110 epoch 6 - iter 576/723 - loss 0.16186203 - time (sec): 13.05 - samples/sec: 10836.50 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 23:13:44,253 epoch 6 - iter 648/723 - loss 0.15979228 - time (sec): 15.19 - samples/sec: 10419.81 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 23:13:45,929 epoch 6 - iter 720/723 - loss 0.15793335 - time (sec): 16.87 - samples/sec: 10410.89 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 23:13:45,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:13:45,995 EPOCH 6 done: loss 0.1579 - lr: 0.000013 |
|
2023-10-18 23:13:47,779 DEV : loss 0.2004448026418686 - f1-score (micro avg) 0.416 |
|
2023-10-18 23:13:47,795 saving best model |
|
2023-10-18 23:13:47,834 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:13:49,636 epoch 7 - iter 72/723 - loss 0.13990666 - time (sec): 1.80 - samples/sec: 9771.45 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 23:13:51,369 epoch 7 - iter 144/723 - loss 0.14735534 - time (sec): 3.53 - samples/sec: 9523.67 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 23:13:53,172 epoch 7 - iter 216/723 - loss 0.14661026 - time (sec): 5.34 - samples/sec: 9781.19 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 23:13:54,922 epoch 7 - iter 288/723 - loss 0.14674013 - time (sec): 7.09 - samples/sec: 9725.99 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 23:13:56,722 epoch 7 - iter 360/723 - loss 0.14930807 - time (sec): 8.89 - samples/sec: 9778.01 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 23:13:58,314 epoch 7 - iter 432/723 - loss 0.15173579 - time (sec): 10.48 - samples/sec: 10017.22 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 23:13:59,917 epoch 7 - iter 504/723 - loss 0.15401105 - time (sec): 12.08 - samples/sec: 10112.71 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 23:14:01,793 epoch 7 - iter 576/723 - loss 0.15380009 - time (sec): 13.96 - samples/sec: 10127.35 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 23:14:03,567 epoch 7 - iter 648/723 - loss 0.15502582 - time (sec): 15.73 - samples/sec: 10055.51 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 23:14:05,325 epoch 7 - iter 720/723 - loss 0.15355652 - time (sec): 17.49 - samples/sec: 10050.03 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 23:14:05,386 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:14:05,387 EPOCH 7 done: loss 0.1535 - lr: 0.000010 |
|
2023-10-18 23:14:07,198 DEV : loss 0.1892377883195877 - f1-score (micro avg) 0.4715 |
|
2023-10-18 23:14:07,213 saving best model |
|
2023-10-18 23:14:07,246 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:14:09,021 epoch 8 - iter 72/723 - loss 0.14918626 - time (sec): 1.78 - samples/sec: 10048.93 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 23:14:10,805 epoch 8 - iter 144/723 - loss 0.14341743 - time (sec): 3.56 - samples/sec: 9753.10 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 23:14:12,659 epoch 8 - iter 216/723 - loss 0.14638127 - time (sec): 5.41 - samples/sec: 9806.40 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 23:14:14,509 epoch 8 - iter 288/723 - loss 0.14570047 - time (sec): 7.26 - samples/sec: 9830.45 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 23:14:16,626 epoch 8 - iter 360/723 - loss 0.14588077 - time (sec): 9.38 - samples/sec: 9468.27 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 23:14:18,438 epoch 8 - iter 432/723 - loss 0.14940172 - time (sec): 11.19 - samples/sec: 9520.91 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 23:14:20,233 epoch 8 - iter 504/723 - loss 0.14995954 - time (sec): 12.99 - samples/sec: 9625.72 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 23:14:21,993 epoch 8 - iter 576/723 - loss 0.14830811 - time (sec): 14.75 - samples/sec: 9612.22 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 23:14:23,804 epoch 8 - iter 648/723 - loss 0.14882827 - time (sec): 16.56 - samples/sec: 9598.29 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 23:14:25,585 epoch 8 - iter 720/723 - loss 0.14797825 - time (sec): 18.34 - samples/sec: 9578.73 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 23:14:25,657 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:14:25,658 EPOCH 8 done: loss 0.1479 - lr: 0.000007 |
|
2023-10-18 23:14:27,424 DEV : loss 0.19318030774593353 - f1-score (micro avg) 0.4586 |
|
2023-10-18 23:14:27,439 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:14:29,358 epoch 9 - iter 72/723 - loss 0.13183658 - time (sec): 1.92 - samples/sec: 9729.82 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 23:14:31,178 epoch 9 - iter 144/723 - loss 0.14774974 - time (sec): 3.74 - samples/sec: 9894.94 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 23:14:32,918 epoch 9 - iter 216/723 - loss 0.14096225 - time (sec): 5.48 - samples/sec: 10029.57 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 23:14:34,783 epoch 9 - iter 288/723 - loss 0.14464135 - time (sec): 7.34 - samples/sec: 9979.83 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 23:14:36,533 epoch 9 - iter 360/723 - loss 0.14935950 - time (sec): 9.09 - samples/sec: 9842.57 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 23:14:38,298 epoch 9 - iter 432/723 - loss 0.14704814 - time (sec): 10.86 - samples/sec: 9811.71 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 23:14:40,074 epoch 9 - iter 504/723 - loss 0.14643788 - time (sec): 12.63 - samples/sec: 9786.04 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 23:14:41,871 epoch 9 - iter 576/723 - loss 0.14449966 - time (sec): 14.43 - samples/sec: 9853.99 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 23:14:43,694 epoch 9 - iter 648/723 - loss 0.14535218 - time (sec): 16.26 - samples/sec: 9804.97 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 23:14:45,473 epoch 9 - iter 720/723 - loss 0.14713910 - time (sec): 18.03 - samples/sec: 9748.64 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 23:14:45,537 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:14:45,537 EPOCH 9 done: loss 0.1474 - lr: 0.000003 |
|
2023-10-18 23:14:47,299 DEV : loss 0.18770474195480347 - f1-score (micro avg) 0.471 |
|
2023-10-18 23:14:47,313 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:14:49,152 epoch 10 - iter 72/723 - loss 0.15104826 - time (sec): 1.84 - samples/sec: 9691.05 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 23:14:50,921 epoch 10 - iter 144/723 - loss 0.14308456 - time (sec): 3.61 - samples/sec: 9833.81 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 23:14:52,704 epoch 10 - iter 216/723 - loss 0.14243977 - time (sec): 5.39 - samples/sec: 9795.49 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 23:14:54,769 epoch 10 - iter 288/723 - loss 0.14355974 - time (sec): 7.45 - samples/sec: 9299.45 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 23:14:56,644 epoch 10 - iter 360/723 - loss 0.15340840 - time (sec): 9.33 - samples/sec: 9434.86 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 23:14:58,530 epoch 10 - iter 432/723 - loss 0.14997278 - time (sec): 11.22 - samples/sec: 9508.69 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 23:15:00,395 epoch 10 - iter 504/723 - loss 0.15043498 - time (sec): 13.08 - samples/sec: 9565.34 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 23:15:02,186 epoch 10 - iter 576/723 - loss 0.15019473 - time (sec): 14.87 - samples/sec: 9470.10 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 23:15:03,925 epoch 10 - iter 648/723 - loss 0.14807576 - time (sec): 16.61 - samples/sec: 9514.24 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 23:15:05,690 epoch 10 - iter 720/723 - loss 0.14659442 - time (sec): 18.38 - samples/sec: 9565.63 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 23:15:05,744 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:15:05,744 EPOCH 10 done: loss 0.1466 - lr: 0.000000 |
|
2023-10-18 23:15:07,520 DEV : loss 0.1885942816734314 - f1-score (micro avg) 0.4686 |
|
2023-10-18 23:15:07,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 23:15:07,565 Loading model from best epoch ... |
|
2023-10-18 23:15:07,651 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-18 23:15:08,989 |
|
Results: |
|
- F-score (micro) 0.4786 |
|
- F-score (macro) 0.3307 |
|
- Accuracy 0.3244 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.5887    0.5218    0.5532       458 |
|
         PER     0.5595    0.3610    0.4388       482 |
|
         ORG     0.0000    0.0000    0.0000        69 |
|
|
|
   micro avg     0.5760    0.4093    0.4786      1009 |
|
   macro avg     0.3827    0.2943    0.3307      1009 |
|
weighted avg     0.5345    0.4093    0.4608      1009 |
|
|
|
2023-10-18 23:15:08,989 ---------------------------------------------------------------------------------------------------- |
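Note on the final results: the averaged scores can be cross-checked from the per-class rows. The sketch below assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, weighted average as the support-weighted mean). All numbers are copied from the report above; the helper name `f1` is illustrative.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, f1-score, support) per class, copied from the report.
by_class = {
    "LOC": (0.5887, 0.5218, 0.5532, 458),
    "PER": (0.5595, 0.3610, 0.4388, 482),
    "ORG": (0.0000, 0.0000, 0.0000, 69),
}

# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.5760, 0.4093)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

if __name__ == "__main__":
    print(f"micro F1    {micro_f1:.4f}")
    print(f"macro F1    {macro_f1:.4f}")
    print(f"weighted F1 {weighted_f1:.4f}")
    print(f"support     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can differ from the reported ones in the last digit. The ORG row (0.0 on 69 entities) is what pulls the macro average well below the micro average.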
|
|