2023-10-18 22:51:24,698 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,698 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:51:24,698 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
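The training data is the Dutch split of the ICDAR Europeana newspaper NER corpus, cached under /root/.flair/datasets/ner_icdar_europeana/nl. A minimal loading sketch, assuming Flair's bundled NER_ICDAR_EUROPEANA dataset class (the class name and language argument are inferred from the cache path above):

from flair.datasets import NER_ICDAR_EUROPEANA

# Dutch portion of the ICDAR Europeana corpus:
# 5777 train / 722 dev / 723 test sentences, as reported above.
corpus = NER_ICDAR_EUROPEANA(language="nl")
print(corpus)

# The 13-tag dictionary used by the tagger can be derived from the corpus:
label_dict = corpus.make_label_dictionary(label_type="ner")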
2023-10-18 22:51:24,699 Train: 5777 sentences
2023-10-18 22:51:24,699 (train_with_dev=False, train_with_test=False)
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 Training Params:
2023-10-18 22:51:24,699 - learning_rate: "5e-05"
2023-10-18 22:51:24,699 - mini_batch_size: "4"
2023-10-18 22:51:24,699 - max_epochs: "10"
2023-10-18 22:51:24,699 - shuffle: "True"
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 Plugins:
2023-10-18 22:51:24,699 - TensorboardLogger
2023-10-18 22:51:24,699 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:51:24,699 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 Computation:
2023-10-18 22:51:24,699 - compute on device: cuda:0
2023-10-18 22:51:24,699 - embedding storage: none
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
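The parameters above (learning rate 5e-05, mini-batch size 4, 10 epochs, linear schedule with 10% warmup, TensorBoard logging, final evaluation of the best checkpoint by micro F1) correspond to a standard Flair fine-tuning run. A hedged sketch of the call, assuming ModelTrainer.fine_tune's defaults cover the AdamW optimizer and the linear warmup/decay schedule; the TensorBoard plugin shown above would be configured separately, and the exact launch script is not part of this log:

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses AdamW with a linear warmup/decay schedule by default,
# which is what the LinearScheduler plugin (warmup_fraction 0.1) reflects.
trainer.fine_tune(
    "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)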
2023-10-18 22:51:24,699 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:24,699 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:51:27,059 epoch 1 - iter 144/1445 - loss 3.29539325 - time (sec): 2.36 - samples/sec: 6871.85 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:51:29,599 epoch 1 - iter 288/1445 - loss 2.80667047 - time (sec): 4.90 - samples/sec: 6851.93 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:51:32,037 epoch 1 - iter 432/1445 - loss 2.17298777 - time (sec): 7.34 - samples/sec: 7006.80 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:51:34,467 epoch 1 - iter 576/1445 - loss 1.69993643 - time (sec): 9.77 - samples/sec: 7136.03 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:51:36,827 epoch 1 - iter 720/1445 - loss 1.42956310 - time (sec): 12.13 - samples/sec: 7137.99 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:51:39,163 epoch 1 - iter 864/1445 - loss 1.24708491 - time (sec): 14.46 - samples/sec: 7154.15 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:51:41,607 epoch 1 - iter 1008/1445 - loss 1.11449457 - time (sec): 16.91 - samples/sec: 7121.10 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:51:44,080 epoch 1 - iter 1152/1445 - loss 1.00767761 - time (sec): 19.38 - samples/sec: 7180.56 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:51:46,593 epoch 1 - iter 1296/1445 - loss 0.91487367 - time (sec): 21.89 - samples/sec: 7218.53 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:51:49,002 epoch 1 - iter 1440/1445 - loss 0.84936645 - time (sec): 24.30 - samples/sec: 7229.17 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:51:49,078 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:49,079 EPOCH 1 done: loss 0.8477 - lr: 0.000050
2023-10-18 22:51:50,368 DEV : loss 0.28886058926582336 - f1-score (micro avg) 0.0363
2023-10-18 22:51:50,382 saving best model
2023-10-18 22:51:50,413 ----------------------------------------------------------------------------------------------------
2023-10-18 22:51:52,534 epoch 2 - iter 144/1445 - loss 0.20771123 - time (sec): 2.12 - samples/sec: 8378.68 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:51:54,705 epoch 2 - iter 288/1445 - loss 0.21302193 - time (sec): 4.29 - samples/sec: 8059.18 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:51:57,097 epoch 2 - iter 432/1445 - loss 0.21308278 - time (sec): 6.68 - samples/sec: 7668.37 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:51:59,685 epoch 2 - iter 576/1445 - loss 0.20687075 - time (sec): 9.27 - samples/sec: 7600.92 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:52:02,218 epoch 2 - iter 720/1445 - loss 0.20244105 - time (sec): 11.80 - samples/sec: 7393.23 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:52:04,652 epoch 2 - iter 864/1445 - loss 0.19886336 - time (sec): 14.24 - samples/sec: 7394.02 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:52:06,878 epoch 2 - iter 1008/1445 - loss 0.19645336 - time (sec): 16.46 - samples/sec: 7481.26 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:52:09,110 epoch 2 - iter 1152/1445 - loss 0.20134357 - time (sec): 18.70 - samples/sec: 7488.90 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:52:11,497 epoch 2 - iter 1296/1445 - loss 0.19916074 - time (sec): 21.08 - samples/sec: 7494.59 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:52:13,935 epoch 2 - iter 1440/1445 - loss 0.19916757 - time (sec): 23.52 - samples/sec: 7459.46 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:52:14,018 ----------------------------------------------------------------------------------------------------
2023-10-18 22:52:14,019 EPOCH 2 done: loss 0.1991 - lr: 0.000044
2023-10-18 22:52:15,801 DEV : loss 0.2434382140636444 - f1-score (micro avg) 0.3074
2023-10-18 22:52:15,815 saving best model
2023-10-18 22:52:15,849 ----------------------------------------------------------------------------------------------------
2023-10-18 22:52:18,303 epoch 3 - iter 144/1445 - loss 0.16824685 - time (sec): 2.45 - samples/sec: 7641.85 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:52:20,724 epoch 3 - iter 288/1445 - loss 0.17832135 - time (sec): 4.87 - samples/sec: 7722.80 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:52:23,482 epoch 3 - iter 432/1445 - loss 0.17388072 - time (sec): 7.63 - samples/sec: 7228.05 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:52:25,913 epoch 3 - iter 576/1445 - loss 0.16949600 - time (sec): 10.06 - samples/sec: 7262.71 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:52:28,376 epoch 3 - iter 720/1445 - loss 0.17010711 - time (sec): 12.53 - samples/sec: 7214.37 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:52:30,724 epoch 3 - iter 864/1445 - loss 0.16944462 - time (sec): 14.87 - samples/sec: 7161.25 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:52:33,038 epoch 3 - iter 1008/1445 - loss 0.17031912 - time (sec): 17.19 - samples/sec: 7241.37 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:52:35,207 epoch 3 - iter 1152/1445 - loss 0.16944422 - time (sec): 19.36 - samples/sec: 7297.31 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:52:37,602 epoch 3 - iter 1296/1445 - loss 0.16674107 - time (sec): 21.75 - samples/sec: 7316.34 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:52:39,958 epoch 3 - iter 1440/1445 - loss 0.16647381 - time (sec): 24.11 - samples/sec: 7278.75 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:52:40,039 ----------------------------------------------------------------------------------------------------
2023-10-18 22:52:40,039 EPOCH 3 done: loss 0.1663 - lr: 0.000039
2023-10-18 22:52:41,821 DEV : loss 0.21055997908115387 - f1-score (micro avg) 0.4219
2023-10-18 22:52:41,835 saving best model
2023-10-18 22:52:41,870 ----------------------------------------------------------------------------------------------------
2023-10-18 22:52:44,200 epoch 4 - iter 144/1445 - loss 0.15283862 - time (sec): 2.33 - samples/sec: 7307.47 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:52:46,578 epoch 4 - iter 288/1445 - loss 0.14799565 - time (sec): 4.71 - samples/sec: 7015.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:52:48,944 epoch 4 - iter 432/1445 - loss 0.15105169 - time (sec): 7.07 - samples/sec: 7106.38 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:52:51,342 epoch 4 - iter 576/1445 - loss 0.15217093 - time (sec): 9.47 - samples/sec: 7174.33 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:52:53,783 epoch 4 - iter 720/1445 - loss 0.15166536 - time (sec): 11.91 - samples/sec: 7162.93 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:52:56,210 epoch 4 - iter 864/1445 - loss 0.15237011 - time (sec): 14.34 - samples/sec: 7158.14 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:52:58,867 epoch 4 - iter 1008/1445 - loss 0.15079524 - time (sec): 17.00 - samples/sec: 7087.84 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:53:01,592 epoch 4 - iter 1152/1445 - loss 0.15204078 - time (sec): 19.72 - samples/sec: 7121.06 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:53:03,966 epoch 4 - iter 1296/1445 - loss 0.15103163 - time (sec): 22.10 - samples/sec: 7125.45 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:53:06,380 epoch 4 - iter 1440/1445 - loss 0.15143506 - time (sec): 24.51 - samples/sec: 7168.62 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:53:06,461 ----------------------------------------------------------------------------------------------------
2023-10-18 22:53:06,461 EPOCH 4 done: loss 0.1515 - lr: 0.000033
2023-10-18 22:53:08,244 DEV : loss 0.19626548886299133 - f1-score (micro avg) 0.4869
2023-10-18 22:53:08,258 saving best model
2023-10-18 22:53:08,293 ----------------------------------------------------------------------------------------------------
2023-10-18 22:53:10,756 epoch 5 - iter 144/1445 - loss 0.14791715 - time (sec): 2.46 - samples/sec: 7246.53 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:53:13,168 epoch 5 - iter 288/1445 - loss 0.14595192 - time (sec): 4.87 - samples/sec: 7399.69 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:53:15,540 epoch 5 - iter 432/1445 - loss 0.14209987 - time (sec): 7.25 - samples/sec: 7414.07 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:53:17,946 epoch 5 - iter 576/1445 - loss 0.14142116 - time (sec): 9.65 - samples/sec: 7301.39 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:53:20,291 epoch 5 - iter 720/1445 - loss 0.13858369 - time (sec): 12.00 - samples/sec: 7226.89 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:53:22,707 epoch 5 - iter 864/1445 - loss 0.13944449 - time (sec): 14.41 - samples/sec: 7219.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:53:25,161 epoch 5 - iter 1008/1445 - loss 0.13904087 - time (sec): 16.87 - samples/sec: 7204.05 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:53:27,753 epoch 5 - iter 1152/1445 - loss 0.13936345 - time (sec): 19.46 - samples/sec: 7198.86 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:53:30,106 epoch 5 - iter 1296/1445 - loss 0.13877879 - time (sec): 21.81 - samples/sec: 7220.31 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:53:32,573 epoch 5 - iter 1440/1445 - loss 0.13778394 - time (sec): 24.28 - samples/sec: 7245.29 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:53:32,646 ----------------------------------------------------------------------------------------------------
2023-10-18 22:53:32,646 EPOCH 5 done: loss 0.1378 - lr: 0.000028
2023-10-18 22:53:34,776 DEV : loss 0.186900332570076 - f1-score (micro avg) 0.5261
2023-10-18 22:53:34,792 saving best model
2023-10-18 22:53:34,828 ----------------------------------------------------------------------------------------------------
2023-10-18 22:53:37,198 epoch 6 - iter 144/1445 - loss 0.12941650 - time (sec): 2.37 - samples/sec: 6979.43 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:53:39,571 epoch 6 - iter 288/1445 - loss 0.12211079 - time (sec): 4.74 - samples/sec: 7141.82 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:53:41,959 epoch 6 - iter 432/1445 - loss 0.12637372 - time (sec): 7.13 - samples/sec: 7336.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:53:44,418 epoch 6 - iter 576/1445 - loss 0.12653560 - time (sec): 9.59 - samples/sec: 7368.21 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:53:46,769 epoch 6 - iter 720/1445 - loss 0.12839698 - time (sec): 11.94 - samples/sec: 7326.25 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:53:49,154 epoch 6 - iter 864/1445 - loss 0.12535563 - time (sec): 14.32 - samples/sec: 7325.47 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:53:51,561 epoch 6 - iter 1008/1445 - loss 0.12897091 - time (sec): 16.73 - samples/sec: 7261.16 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:53:53,981 epoch 6 - iter 1152/1445 - loss 0.12610695 - time (sec): 19.15 - samples/sec: 7266.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:53:56,366 epoch 6 - iter 1296/1445 - loss 0.12806517 - time (sec): 21.54 - samples/sec: 7280.58 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:53:58,957 epoch 6 - iter 1440/1445 - loss 0.12929665 - time (sec): 24.13 - samples/sec: 7283.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:53:59,044 ----------------------------------------------------------------------------------------------------
2023-10-18 22:53:59,044 EPOCH 6 done: loss 0.1292 - lr: 0.000022
2023-10-18 22:54:00,808 DEV : loss 0.1867137998342514 - f1-score (micro avg) 0.5326
2023-10-18 22:54:00,822 saving best model
2023-10-18 22:54:00,858 ----------------------------------------------------------------------------------------------------
2023-10-18 22:54:03,301 epoch 7 - iter 144/1445 - loss 0.11727356 - time (sec): 2.44 - samples/sec: 6746.22 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:54:05,687 epoch 7 - iter 288/1445 - loss 0.11920637 - time (sec): 4.83 - samples/sec: 7110.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:54:08,078 epoch 7 - iter 432/1445 - loss 0.12142107 - time (sec): 7.22 - samples/sec: 7086.33 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:54:10,549 epoch 7 - iter 576/1445 - loss 0.12054887 - time (sec): 9.69 - samples/sec: 7176.58 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:54:12,850 epoch 7 - iter 720/1445 - loss 0.12121286 - time (sec): 11.99 - samples/sec: 7219.02 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:54:15,406 epoch 7 - iter 864/1445 - loss 0.12151057 - time (sec): 14.55 - samples/sec: 7155.32 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:54:17,870 epoch 7 - iter 1008/1445 - loss 0.12126598 - time (sec): 17.01 - samples/sec: 7193.07 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:54:20,281 epoch 7 - iter 1152/1445 - loss 0.12219795 - time (sec): 19.42 - samples/sec: 7208.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:54:22,749 epoch 7 - iter 1296/1445 - loss 0.12311442 - time (sec): 21.89 - samples/sec: 7217.81 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:54:25,139 epoch 7 - iter 1440/1445 - loss 0.12171338 - time (sec): 24.28 - samples/sec: 7233.22 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:54:25,219 ----------------------------------------------------------------------------------------------------
2023-10-18 22:54:25,220 EPOCH 7 done: loss 0.1216 - lr: 0.000017
2023-10-18 22:54:26,988 DEV : loss 0.19099119305610657 - f1-score (micro avg) 0.5487
2023-10-18 22:54:27,003 saving best model
2023-10-18 22:54:27,040 ----------------------------------------------------------------------------------------------------
2023-10-18 22:54:29,347 epoch 8 - iter 144/1445 - loss 0.13909632 - time (sec): 2.31 - samples/sec: 7149.79 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:54:31,777 epoch 8 - iter 288/1445 - loss 0.12875856 - time (sec): 4.74 - samples/sec: 7333.26 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:54:34,165 epoch 8 - iter 432/1445 - loss 0.12665044 - time (sec): 7.13 - samples/sec: 7443.17 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:54:36,574 epoch 8 - iter 576/1445 - loss 0.12517964 - time (sec): 9.53 - samples/sec: 7321.07 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:54:38,887 epoch 8 - iter 720/1445 - loss 0.12049045 - time (sec): 11.85 - samples/sec: 7417.55 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:54:41,339 epoch 8 - iter 864/1445 - loss 0.11769624 - time (sec): 14.30 - samples/sec: 7392.12 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:54:43,795 epoch 8 - iter 1008/1445 - loss 0.11773878 - time (sec): 16.76 - samples/sec: 7377.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:54:46,163 epoch 8 - iter 1152/1445 - loss 0.11579635 - time (sec): 19.12 - samples/sec: 7331.13 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:54:48,494 epoch 8 - iter 1296/1445 - loss 0.11660615 - time (sec): 21.45 - samples/sec: 7351.02 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:54:50,905 epoch 8 - iter 1440/1445 - loss 0.11664721 - time (sec): 23.86 - samples/sec: 7365.31 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:54:50,980 ----------------------------------------------------------------------------------------------------
2023-10-18 22:54:50,980 EPOCH 8 done: loss 0.1166 - lr: 0.000011
2023-10-18 22:54:53,066 DEV : loss 0.19956204295158386 - f1-score (micro avg) 0.5434
2023-10-18 22:54:53,080 ----------------------------------------------------------------------------------------------------
2023-10-18 22:54:55,471 epoch 9 - iter 144/1445 - loss 0.11520321 - time (sec): 2.39 - samples/sec: 7521.67 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:54:57,898 epoch 9 - iter 288/1445 - loss 0.12001827 - time (sec): 4.82 - samples/sec: 7468.77 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:55:00,238 epoch 9 - iter 432/1445 - loss 0.11044915 - time (sec): 7.16 - samples/sec: 7365.68 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:55:02,679 epoch 9 - iter 576/1445 - loss 0.10805370 - time (sec): 9.60 - samples/sec: 7325.82 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:55:05,091 epoch 9 - iter 720/1445 - loss 0.11126004 - time (sec): 12.01 - samples/sec: 7351.91 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:55:07,481 epoch 9 - iter 864/1445 - loss 0.11107034 - time (sec): 14.40 - samples/sec: 7406.58 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:55:09,864 epoch 9 - iter 1008/1445 - loss 0.11224647 - time (sec): 16.78 - samples/sec: 7416.01 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:55:12,182 epoch 9 - iter 1152/1445 - loss 0.11351795 - time (sec): 19.10 - samples/sec: 7395.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:55:14,679 epoch 9 - iter 1296/1445 - loss 0.11382066 - time (sec): 21.60 - samples/sec: 7353.71 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:55:17,000 epoch 9 - iter 1440/1445 - loss 0.11288922 - time (sec): 23.92 - samples/sec: 7345.54 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:55:17,074 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:17,075 EPOCH 9 done: loss 0.1130 - lr: 0.000006
2023-10-18 22:55:18,856 DEV : loss 0.19409048557281494 - f1-score (micro avg) 0.5636
2023-10-18 22:55:18,870 saving best model
2023-10-18 22:55:18,907 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:21,246 epoch 10 - iter 144/1445 - loss 0.11919407 - time (sec): 2.34 - samples/sec: 7369.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:55:23,624 epoch 10 - iter 288/1445 - loss 0.12311079 - time (sec): 4.72 - samples/sec: 7239.49 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:55:26,056 epoch 10 - iter 432/1445 - loss 0.12140389 - time (sec): 7.15 - samples/sec: 7266.09 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:55:28,507 epoch 10 - iter 576/1445 - loss 0.11657262 - time (sec): 9.60 - samples/sec: 7407.10 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:55:30,961 epoch 10 - iter 720/1445 - loss 0.11243952 - time (sec): 12.05 - samples/sec: 7385.04 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:55:33,205 epoch 10 - iter 864/1445 - loss 0.11354347 - time (sec): 14.30 - samples/sec: 7387.47 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:55:35,316 epoch 10 - iter 1008/1445 - loss 0.11085507 - time (sec): 16.41 - samples/sec: 7513.69 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:55:37,805 epoch 10 - iter 1152/1445 - loss 0.10959603 - time (sec): 18.90 - samples/sec: 7503.00 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:55:40,166 epoch 10 - iter 1296/1445 - loss 0.11097144 - time (sec): 21.26 - samples/sec: 7447.61 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:55:42,632 epoch 10 - iter 1440/1445 - loss 0.11183179 - time (sec): 23.72 - samples/sec: 7402.41 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:55:42,711 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:42,712 EPOCH 10 done: loss 0.1119 - lr: 0.000000
2023-10-18 22:55:44,486 DEV : loss 0.19939179718494415 - f1-score (micro avg) 0.5604
2023-10-18 22:55:44,530 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:44,530 Loading model from best epoch ...
2023-10-18 22:55:44,613 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:55:45,906
Results:
- F-score (micro) 0.5542
- F-score (macro) 0.392
- Accuracy 0.393

By class:
              precision    recall  f1-score   support

         LOC     0.6227    0.6594    0.6405       458
         PER     0.5196    0.4959    0.5074       482
         ORG     0.5000    0.0145    0.0282        69

   micro avg     0.5723    0.5372    0.5542      1009
   macro avg     0.5474    0.3899    0.3920      1009
weighted avg     0.5650    0.5372    0.5351      1009

2023-10-18 22:55:45,907 ----------------------------------------------------------------------------------------------------
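The checkpoint evaluated above is the best epoch on the dev set (epoch 9, micro F1 0.5636), stored as best-model.pt under the training base path. A small usage sketch for loading it and tagging text follows; the file path is derived from the base path logged above, and the example sentence is invented for illustration:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the best checkpoint saved during this run.
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

# Hypothetical historic Dutch newspaper sentence.
sentence = Sentence("De heer Van Houten sprak gisteren te Amsterdam.")
tagger.predict(sentence)

# Print each predicted entity span with its NER label and confidence.
for entity in sentence.get_spans("ner"):
    print(entity.text, entity.get_label("ner").value, round(entity.get_label("ner").score, 3))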