2023-10-16 23:07:55,433 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,434 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 23:07:55,434 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
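The final `linear` layer's 13 outputs correspond to a BIOES tag set over the three entity types this run evaluates (LOC, BUILDING, STREET): one `O` tag plus S/B/E/I variants per type. A minimal sketch of that tag dictionary (the function name is illustrative, not Flair's API):

```python
def bioes_tag_dictionary(entity_types):
    """Build a BIOES tag list: 'O' plus S-/B-/E-/I- variants per entity type."""
    tags = ["O"]
    for etype in entity_types:
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

# 1 + 3 * 4 = 13 tags, matching out_features=13 in the linear layer above.
print(len(bioes_tag_dictionary(["LOC", "BUILDING", "STREET"])))  # 13
```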
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 Train: 6183 sentences
2023-10-16 23:07:55,435 (train_with_dev=False, train_with_test=False)
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 Training Params:
2023-10-16 23:07:55,435 - learning_rate: "5e-05"
2023-10-16 23:07:55,435 - mini_batch_size: "4"
2023-10-16 23:07:55,435 - max_epochs: "10"
2023-10-16 23:07:55,435 - shuffle: "True"
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 Plugins:
2023-10-16 23:07:55,435 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 23:07:55,435 - metric: "('micro avg', 'f1-score')"
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 Computation:
2023-10-16 23:07:55,435 - compute on device: cuda:0
2023-10-16 23:07:55,435 - embedding storage: none
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:07:55,435 ----------------------------------------------------------------------------------------------------
2023-10-16 23:08:02,174 epoch 1 - iter 154/1546 - loss 1.64150864 - time (sec): 6.74 - samples/sec: 1772.73 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:08:08,962 epoch 1 - iter 308/1546 - loss 0.90057927 - time (sec): 13.53 - samples/sec: 1792.33 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:08:15,838 epoch 1 - iter 462/1546 - loss 0.64892933 - time (sec): 20.40 - samples/sec: 1787.05 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:08:22,933 epoch 1 - iter 616/1546 - loss 0.51849295 - time (sec): 27.50 - samples/sec: 1764.11 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:08:29,729 epoch 1 - iter 770/1546 - loss 0.44051424 - time (sec): 34.29 - samples/sec: 1767.49 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:08:36,578 epoch 1 - iter 924/1546 - loss 0.38682117 - time (sec): 41.14 - samples/sec: 1781.70 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:08:43,397 epoch 1 - iter 1078/1546 - loss 0.34961641 - time (sec): 47.96 - samples/sec: 1780.99 - lr: 0.000035 - momentum: 0.000000
2023-10-16 23:08:50,060 epoch 1 - iter 1232/1546 - loss 0.31440195 - time (sec): 54.62 - samples/sec: 1820.88 - lr: 0.000040 - momentum: 0.000000
2023-10-16 23:08:56,626 epoch 1 - iter 1386/1546 - loss 0.29206692 - time (sec): 61.19 - samples/sec: 1820.06 - lr: 0.000045 - momentum: 0.000000
2023-10-16 23:09:03,404 epoch 1 - iter 1540/1546 - loss 0.27442585 - time (sec): 67.97 - samples/sec: 1822.62 - lr: 0.000050 - momentum: 0.000000
2023-10-16 23:09:03,661 ----------------------------------------------------------------------------------------------------
2023-10-16 23:09:03,662 EPOCH 1 done: loss 0.2736 - lr: 0.000050
2023-10-16 23:09:05,483 DEV : loss 0.07161298394203186 - f1-score (micro avg) 0.7166
2023-10-16 23:09:05,497 saving best model
2023-10-16 23:09:05,846 ----------------------------------------------------------------------------------------------------
2023-10-16 23:09:12,769 epoch 2 - iter 154/1546 - loss 0.09385419 - time (sec): 6.92 - samples/sec: 1881.23 - lr: 0.000049 - momentum: 0.000000
2023-10-16 23:09:19,829 epoch 2 - iter 308/1546 - loss 0.10234709 - time (sec): 13.98 - samples/sec: 1874.39 - lr: 0.000049 - momentum: 0.000000
2023-10-16 23:09:26,787 epoch 2 - iter 462/1546 - loss 0.09615312 - time (sec): 20.94 - samples/sec: 1893.50 - lr: 0.000048 - momentum: 0.000000
2023-10-16 23:09:33,735 epoch 2 - iter 616/1546 - loss 0.09531468 - time (sec): 27.89 - samples/sec: 1884.97 - lr: 0.000048 - momentum: 0.000000
2023-10-16 23:09:40,620 epoch 2 - iter 770/1546 - loss 0.09328380 - time (sec): 34.77 - samples/sec: 1860.58 - lr: 0.000047 - momentum: 0.000000
2023-10-16 23:09:47,526 epoch 2 - iter 924/1546 - loss 0.09214802 - time (sec): 41.68 - samples/sec: 1827.45 - lr: 0.000047 - momentum: 0.000000
2023-10-16 23:09:54,387 epoch 2 - iter 1078/1546 - loss 0.09111570 - time (sec): 48.54 - samples/sec: 1803.47 - lr: 0.000046 - momentum: 0.000000
2023-10-16 23:10:01,304 epoch 2 - iter 1232/1546 - loss 0.09089521 - time (sec): 55.46 - samples/sec: 1802.48 - lr: 0.000046 - momentum: 0.000000
2023-10-16 23:10:08,109 epoch 2 - iter 1386/1546 - loss 0.09100394 - time (sec): 62.26 - samples/sec: 1798.03 - lr: 0.000045 - momentum: 0.000000
2023-10-16 23:10:14,863 epoch 2 - iter 1540/1546 - loss 0.09208714 - time (sec): 69.02 - samples/sec: 1795.63 - lr: 0.000044 - momentum: 0.000000
2023-10-16 23:10:15,134 ----------------------------------------------------------------------------------------------------
2023-10-16 23:10:15,134 EPOCH 2 done: loss 0.0921 - lr: 0.000044
2023-10-16 23:10:17,179 DEV : loss 0.08183355629444122 - f1-score (micro avg) 0.7184
2023-10-16 23:10:17,193 saving best model
2023-10-16 23:10:17,958 ----------------------------------------------------------------------------------------------------
2023-10-16 23:10:25,004 epoch 3 - iter 154/1546 - loss 0.07344422 - time (sec): 7.04 - samples/sec: 1874.17 - lr: 0.000044 - momentum: 0.000000
2023-10-16 23:10:31,995 epoch 3 - iter 308/1546 - loss 0.06636053 - time (sec): 14.03 - samples/sec: 1878.75 - lr: 0.000043 - momentum: 0.000000
2023-10-16 23:10:38,806 epoch 3 - iter 462/1546 - loss 0.06488901 - time (sec): 20.84 - samples/sec: 1841.40 - lr: 0.000043 - momentum: 0.000000
2023-10-16 23:10:45,667 epoch 3 - iter 616/1546 - loss 0.06350504 - time (sec): 27.71 - samples/sec: 1860.13 - lr: 0.000042 - momentum: 0.000000
2023-10-16 23:10:52,586 epoch 3 - iter 770/1546 - loss 0.06531454 - time (sec): 34.62 - samples/sec: 1844.65 - lr: 0.000042 - momentum: 0.000000
2023-10-16 23:10:59,469 epoch 3 - iter 924/1546 - loss 0.06187748 - time (sec): 41.51 - samples/sec: 1823.94 - lr: 0.000041 - momentum: 0.000000
2023-10-16 23:11:06,332 epoch 3 - iter 1078/1546 - loss 0.06010974 - time (sec): 48.37 - samples/sec: 1818.87 - lr: 0.000041 - momentum: 0.000000
2023-10-16 23:11:13,146 epoch 3 - iter 1232/1546 - loss 0.06062269 - time (sec): 55.18 - samples/sec: 1805.81 - lr: 0.000040 - momentum: 0.000000
2023-10-16 23:11:19,976 epoch 3 - iter 1386/1546 - loss 0.06301640 - time (sec): 62.01 - samples/sec: 1804.91 - lr: 0.000039 - momentum: 0.000000
2023-10-16 23:11:26,819 epoch 3 - iter 1540/1546 - loss 0.06084167 - time (sec): 68.86 - samples/sec: 1798.81 - lr: 0.000039 - momentum: 0.000000
2023-10-16 23:11:27,075 ----------------------------------------------------------------------------------------------------
2023-10-16 23:11:27,076 EPOCH 3 done: loss 0.0607 - lr: 0.000039
2023-10-16 23:11:29,133 DEV : loss 0.09026358276605606 - f1-score (micro avg) 0.7236
2023-10-16 23:11:29,146 saving best model
2023-10-16 23:11:29,603 ----------------------------------------------------------------------------------------------------
2023-10-16 23:11:36,652 epoch 4 - iter 154/1546 - loss 0.03262220 - time (sec): 7.05 - samples/sec: 1885.61 - lr: 0.000038 - momentum: 0.000000
2023-10-16 23:11:43,529 epoch 4 - iter 308/1546 - loss 0.03991408 - time (sec): 13.92 - samples/sec: 1813.40 - lr: 0.000038 - momentum: 0.000000
2023-10-16 23:11:50,498 epoch 4 - iter 462/1546 - loss 0.03847972 - time (sec): 20.89 - samples/sec: 1833.98 - lr: 0.000037 - momentum: 0.000000
2023-10-16 23:11:57,236 epoch 4 - iter 616/1546 - loss 0.04167851 - time (sec): 27.63 - samples/sec: 1817.61 - lr: 0.000037 - momentum: 0.000000
2023-10-16 23:12:04,059 epoch 4 - iter 770/1546 - loss 0.03931290 - time (sec): 34.45 - samples/sec: 1802.28 - lr: 0.000036 - momentum: 0.000000
2023-10-16 23:12:11,043 epoch 4 - iter 924/1546 - loss 0.03979994 - time (sec): 41.44 - samples/sec: 1818.98 - lr: 0.000036 - momentum: 0.000000
2023-10-16 23:12:17,803 epoch 4 - iter 1078/1546 - loss 0.04015082 - time (sec): 48.20 - samples/sec: 1814.82 - lr: 0.000035 - momentum: 0.000000
2023-10-16 23:12:24,796 epoch 4 - iter 1232/1546 - loss 0.04202577 - time (sec): 55.19 - samples/sec: 1816.10 - lr: 0.000034 - momentum: 0.000000
2023-10-16 23:12:31,592 epoch 4 - iter 1386/1546 - loss 0.04156639 - time (sec): 61.99 - samples/sec: 1811.53 - lr: 0.000034 - momentum: 0.000000
2023-10-16 23:12:38,362 epoch 4 - iter 1540/1546 - loss 0.04231192 - time (sec): 68.76 - samples/sec: 1803.10 - lr: 0.000033 - momentum: 0.000000
2023-10-16 23:12:38,617 ----------------------------------------------------------------------------------------------------
2023-10-16 23:12:38,617 EPOCH 4 done: loss 0.0423 - lr: 0.000033
2023-10-16 23:12:40,670 DEV : loss 0.08584349602460861 - f1-score (micro avg) 0.7352
2023-10-16 23:12:40,684 saving best model
2023-10-16 23:12:41,139 ----------------------------------------------------------------------------------------------------
2023-10-16 23:12:47,939 epoch 5 - iter 154/1546 - loss 0.02506310 - time (sec): 6.79 - samples/sec: 1743.27 - lr: 0.000033 - momentum: 0.000000
2023-10-16 23:12:54,788 epoch 5 - iter 308/1546 - loss 0.03027325 - time (sec): 13.64 - samples/sec: 1780.40 - lr: 0.000032 - momentum: 0.000000
2023-10-16 23:13:01,743 epoch 5 - iter 462/1546 - loss 0.02741686 - time (sec): 20.60 - samples/sec: 1780.58 - lr: 0.000032 - momentum: 0.000000
2023-10-16 23:13:08,396 epoch 5 - iter 616/1546 - loss 0.02674692 - time (sec): 27.25 - samples/sec: 1813.51 - lr: 0.000031 - momentum: 0.000000
2023-10-16 23:13:14,999 epoch 5 - iter 770/1546 - loss 0.02613219 - time (sec): 33.85 - samples/sec: 1818.93 - lr: 0.000031 - momentum: 0.000000
2023-10-16 23:13:21,668 epoch 5 - iter 924/1546 - loss 0.02705008 - time (sec): 40.52 - samples/sec: 1832.59 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:13:28,327 epoch 5 - iter 1078/1546 - loss 0.02734689 - time (sec): 47.18 - samples/sec: 1839.91 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:13:34,922 epoch 5 - iter 1232/1546 - loss 0.02818418 - time (sec): 53.78 - samples/sec: 1838.19 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:13:41,592 epoch 5 - iter 1386/1546 - loss 0.02874268 - time (sec): 60.45 - samples/sec: 1837.92 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:13:48,194 epoch 5 - iter 1540/1546 - loss 0.02830181 - time (sec): 67.05 - samples/sec: 1849.50 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:13:48,442 ----------------------------------------------------------------------------------------------------
2023-10-16 23:13:48,442 EPOCH 5 done: loss 0.0282 - lr: 0.000028
2023-10-16 23:13:50,469 DEV : loss 0.10642234981060028 - f1-score (micro avg) 0.7889
2023-10-16 23:13:50,482 saving best model
2023-10-16 23:13:50,956 ----------------------------------------------------------------------------------------------------
2023-10-16 23:13:57,883 epoch 6 - iter 154/1546 - loss 0.02830544 - time (sec): 6.92 - samples/sec: 1727.70 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:14:04,666 epoch 6 - iter 308/1546 - loss 0.02327229 - time (sec): 13.71 - samples/sec: 1777.58 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:14:11,575 epoch 6 - iter 462/1546 - loss 0.02380759 - time (sec): 20.62 - samples/sec: 1795.53 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:14:18,376 epoch 6 - iter 616/1546 - loss 0.02408850 - time (sec): 27.42 - samples/sec: 1815.29 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:14:25,311 epoch 6 - iter 770/1546 - loss 0.02387517 - time (sec): 34.35 - samples/sec: 1794.52 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:14:32,195 epoch 6 - iter 924/1546 - loss 0.02375313 - time (sec): 41.24 - samples/sec: 1792.76 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:14:39,200 epoch 6 - iter 1078/1546 - loss 0.02353132 - time (sec): 48.24 - samples/sec: 1814.92 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:14:46,108 epoch 6 - iter 1232/1546 - loss 0.02368503 - time (sec): 55.15 - samples/sec: 1814.78 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:14:52,887 epoch 6 - iter 1386/1546 - loss 0.02396904 - time (sec): 61.93 - samples/sec: 1807.11 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:14:59,631 epoch 6 - iter 1540/1546 - loss 0.02459035 - time (sec): 68.67 - samples/sec: 1802.15 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:14:59,903 ----------------------------------------------------------------------------------------------------
2023-10-16 23:14:59,904 EPOCH 6 done: loss 0.0245 - lr: 0.000022
2023-10-16 23:15:01,961 DEV : loss 0.09406588971614838 - f1-score (micro avg) 0.7415
2023-10-16 23:15:01,973 ----------------------------------------------------------------------------------------------------
2023-10-16 23:15:08,768 epoch 7 - iter 154/1546 - loss 0.01652669 - time (sec): 6.79 - samples/sec: 1834.76 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:15:15,697 epoch 7 - iter 308/1546 - loss 0.01523840 - time (sec): 13.72 - samples/sec: 1852.35 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:15:22,492 epoch 7 - iter 462/1546 - loss 0.01526124 - time (sec): 20.52 - samples/sec: 1832.89 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:15:29,345 epoch 7 - iter 616/1546 - loss 0.01673807 - time (sec): 27.37 - samples/sec: 1822.70 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:15:36,104 epoch 7 - iter 770/1546 - loss 0.01671240 - time (sec): 34.13 - samples/sec: 1798.36 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:15:42,928 epoch 7 - iter 924/1546 - loss 0.01503868 - time (sec): 40.95 - samples/sec: 1790.08 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:15:49,717 epoch 7 - iter 1078/1546 - loss 0.01488267 - time (sec): 47.74 - samples/sec: 1789.17 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:15:56,564 epoch 7 - iter 1232/1546 - loss 0.01428441 - time (sec): 54.59 - samples/sec: 1781.82 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:16:03,366 epoch 7 - iter 1386/1546 - loss 0.01474284 - time (sec): 61.39 - samples/sec: 1780.42 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:16:10,683 epoch 7 - iter 1540/1546 - loss 0.01579712 - time (sec): 68.71 - samples/sec: 1801.22 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:16:10,958 ----------------------------------------------------------------------------------------------------
2023-10-16 23:16:10,958 EPOCH 7 done: loss 0.0157 - lr: 0.000017
2023-10-16 23:16:12,997 DEV : loss 0.10287176817655563 - f1-score (micro avg) 0.7425
2023-10-16 23:16:13,010 ----------------------------------------------------------------------------------------------------
2023-10-16 23:16:19,880 epoch 8 - iter 154/1546 - loss 0.00666539 - time (sec): 6.87 - samples/sec: 1748.32 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:16:26,714 epoch 8 - iter 308/1546 - loss 0.01243512 - time (sec): 13.70 - samples/sec: 1773.90 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:16:33,531 epoch 8 - iter 462/1546 - loss 0.01341473 - time (sec): 20.52 - samples/sec: 1790.98 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:16:40,463 epoch 8 - iter 616/1546 - loss 0.01120251 - time (sec): 27.45 - samples/sec: 1799.60 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:16:47,295 epoch 8 - iter 770/1546 - loss 0.01063264 - time (sec): 34.28 - samples/sec: 1794.91 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:16:54,199 epoch 8 - iter 924/1546 - loss 0.01077928 - time (sec): 41.19 - samples/sec: 1793.03 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:17:01,068 epoch 8 - iter 1078/1546 - loss 0.01093684 - time (sec): 48.06 - samples/sec: 1803.03 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:17:07,890 epoch 8 - iter 1232/1546 - loss 0.01075192 - time (sec): 54.88 - samples/sec: 1810.94 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:17:14,686 epoch 8 - iter 1386/1546 - loss 0.01147320 - time (sec): 61.68 - samples/sec: 1809.98 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:17:21,438 epoch 8 - iter 1540/1546 - loss 0.01186634 - time (sec): 68.43 - samples/sec: 1809.47 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:17:21,706 ----------------------------------------------------------------------------------------------------
2023-10-16 23:17:21,706 EPOCH 8 done: loss 0.0118 - lr: 0.000011
2023-10-16 23:17:23,781 DEV : loss 0.12073371559381485 - f1-score (micro avg) 0.7485
2023-10-16 23:17:23,794 ----------------------------------------------------------------------------------------------------
2023-10-16 23:17:30,662 epoch 9 - iter 154/1546 - loss 0.00467496 - time (sec): 6.87 - samples/sec: 1820.27 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:17:37,398 epoch 9 - iter 308/1546 - loss 0.00407859 - time (sec): 13.60 - samples/sec: 1783.70 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:17:44,228 epoch 9 - iter 462/1546 - loss 0.00571073 - time (sec): 20.43 - samples/sec: 1835.50 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:17:50,984 epoch 9 - iter 616/1546 - loss 0.00474124 - time (sec): 27.19 - samples/sec: 1814.76 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:17:57,787 epoch 9 - iter 770/1546 - loss 0.00528951 - time (sec): 33.99 - samples/sec: 1801.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:18:04,711 epoch 9 - iter 924/1546 - loss 0.00728720 - time (sec): 40.92 - samples/sec: 1821.27 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:18:11,495 epoch 9 - iter 1078/1546 - loss 0.00663366 - time (sec): 47.70 - samples/sec: 1823.39 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:18:18,390 epoch 9 - iter 1232/1546 - loss 0.00665947 - time (sec): 54.59 - samples/sec: 1820.85 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:18:25,269 epoch 9 - iter 1386/1546 - loss 0.00627546 - time (sec): 61.47 - samples/sec: 1823.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:18:32,073 epoch 9 - iter 1540/1546 - loss 0.00613186 - time (sec): 68.28 - samples/sec: 1809.42 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:18:32,352 ----------------------------------------------------------------------------------------------------
2023-10-16 23:18:32,352 EPOCH 9 done: loss 0.0061 - lr: 0.000006
2023-10-16 23:18:34,443 DEV : loss 0.11544046550989151 - f1-score (micro avg) 0.7828
2023-10-16 23:18:34,456 ----------------------------------------------------------------------------------------------------
2023-10-16 23:18:41,279 epoch 10 - iter 154/1546 - loss 0.00224091 - time (sec): 6.82 - samples/sec: 1888.03 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:18:48,190 epoch 10 - iter 308/1546 - loss 0.00314672 - time (sec): 13.73 - samples/sec: 1859.51 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:18:55,114 epoch 10 - iter 462/1546 - loss 0.00255716 - time (sec): 20.66 - samples/sec: 1857.71 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:19:01,871 epoch 10 - iter 616/1546 - loss 0.00315360 - time (sec): 27.41 - samples/sec: 1843.80 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:19:08,781 epoch 10 - iter 770/1546 - loss 0.00288760 - time (sec): 34.32 - samples/sec: 1844.68 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:19:15,513 epoch 10 - iter 924/1546 - loss 0.00321502 - time (sec): 41.06 - samples/sec: 1824.82 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:19:22,279 epoch 10 - iter 1078/1546 - loss 0.00364917 - time (sec): 47.82 - samples/sec: 1822.30 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:19:29,150 epoch 10 - iter 1232/1546 - loss 0.00345427 - time (sec): 54.69 - samples/sec: 1824.82 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:19:35,901 epoch 10 - iter 1386/1546 - loss 0.00341484 - time (sec): 61.44 - samples/sec: 1812.47 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:19:42,701 epoch 10 - iter 1540/1546 - loss 0.00354867 - time (sec): 68.24 - samples/sec: 1813.03 - lr: 0.000000 - momentum: 0.000000
2023-10-16 23:19:42,964 ----------------------------------------------------------------------------------------------------
2023-10-16 23:19:42,964 EPOCH 10 done: loss 0.0035 - lr: 0.000000
2023-10-16 23:19:45,035 DEV : loss 0.12026786804199219 - f1-score (micro avg) 0.7762
2023-10-16 23:19:45,445 ----------------------------------------------------------------------------------------------------
2023-10-16 23:19:45,446 Loading model from best epoch ...
2023-10-16 23:19:47,046 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 23:19:52,967
Results:
- F-score (micro) 0.7899
- F-score (macro) 0.6726
- Accuracy 0.6762

By class:
              precision    recall  f1-score   support

         LOC     0.8624    0.8214    0.8414       946
    BUILDING     0.5947    0.6108    0.6027       185
      STREET     0.5593    0.5893    0.5739        56

   micro avg     0.8026    0.7776    0.7899      1187
   macro avg     0.6721    0.6738    0.6726      1187
weighted avg     0.8064    0.7776    0.7915      1187

2023-10-16 23:19:52,967 ----------------------------------------------------------------------------------------------------
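The lr column in the iteration logs traces the LinearScheduler plugin with warmup_fraction 0.1: the rate climbs linearly to the base 5e-05 over the first 10% of the 15,460 total steps (exactly epoch 1 here: 1546 batches x 10 epochs), then decays linearly to zero, matching the logged values (0.000005 at epoch 1 iter 154, 0.000050 at its end, 0.000000 at the end of training). A small sketch of that schedule, as a plain function rather than Flair's internal API:

```python
def linear_warmup_lr(step, total_steps, base_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to base_lr over the first warmup_fraction of steps,
    then linear decay to zero over the remaining steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1546 * 10  # 1546 mini-batches per epoch, 10 epochs

# End of warmup (end of epoch 1): the base learning rate is reached.
print(linear_warmup_lr(1546, total))
# Final step: the rate has decayed to zero.
print(linear_warmup_lr(total, total))
```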
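The macro and weighted rows in the final table follow directly from the per-class scores and supports; a quick check (helper names are my own, and small deviations from the printed averages come from the per-class scores already being rounded to four decimals):

```python
def macro_avg(scores):
    """Unweighted mean over classes (every class counts equally)."""
    return sum(scores) / len(scores)

def weighted_avg(scores, supports):
    """Support-weighted mean over classes (large classes dominate)."""
    return sum(s * n for s, n in zip(scores, supports)) / sum(supports)

# Per-class f1-scores and supports for LOC, BUILDING, STREET from the table.
f1 = [0.8414, 0.6027, 0.5739]
support = [946, 185, 56]

print(macro_avg(f1))               # close to the reported macro avg 0.6726
print(weighted_avg(f1, support))   # close to the reported weighted avg 0.7915
```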