2023-10-16 22:13:33,990 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,991 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 22:13:33,991 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,991 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 22:13:33,991 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,991 Train: 6183 sentences
2023-10-16 22:13:33,991 (train_with_dev=False, train_with_test=False)
2023-10-16 22:13:33,991 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,991 Training Params:
2023-10-16 22:13:33,992 - learning_rate: "3e-05"
2023-10-16 22:13:33,992 - mini_batch_size: "4"
2023-10-16 22:13:33,992 - max_epochs: "10"
2023-10-16 22:13:33,992 - shuffle: "True"
2023-10-16 22:13:33,992 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,992 Plugins:
2023-10-16 22:13:33,992 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 22:13:33,992 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,992 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 22:13:33,992 - metric: "('micro avg', 'f1-score')"
2023-10-16 22:13:33,992 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,992 Computation:
2023-10-16 22:13:33,992 - compute on device: cuda:0
2023-10-16 22:13:33,992 - embedding storage: none
2023-10-16 22:13:33,992 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,992 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 22:13:33,992 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:33,992 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:41,036 epoch 1 - iter 154/1546 - loss 1.89169363 - time (sec): 7.04 - samples/sec: 1856.25 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:13:47,976 epoch 1 - iter 308/1546 - loss 1.11226889 - time (sec): 13.98 - samples/sec: 1798.15 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:13:54,826 epoch 1 - iter 462/1546 - loss 0.79922596 - time (sec): 20.83 - samples/sec: 1834.89 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:14:01,596 epoch 1 - iter 616/1546 - loss 0.63837166 - time (sec): 27.60 - samples/sec: 1822.95 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:14:08,419 epoch 1 - iter 770/1546 - loss 0.53584832 - time (sec): 34.43 - samples/sec: 1820.11 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:14:15,218 epoch 1 - iter 924/1546 - loss 0.47208580 - time (sec): 41.22 - samples/sec: 1803.72 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:14:21,994 epoch 1 - iter 1078/1546 - loss 0.42137026 - time (sec): 48.00 - samples/sec: 1805.09 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:14:28,765 epoch 1 - iter 1232/1546 - loss 0.38283230 - time (sec): 54.77 - samples/sec: 1806.79 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:14:35,670 epoch 1 - iter 1386/1546 - loss 0.35327797 - time (sec): 61.68 - samples/sec: 1808.79 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:14:42,498 epoch 1 - iter 1540/1546 - loss 0.32990339 - time (sec): 68.51 - samples/sec: 1808.50 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:14:42,763 ----------------------------------------------------------------------------------------------------
2023-10-16 22:14:42,763 EPOCH 1 done: loss 0.3292 - lr: 0.000030
2023-10-16 22:14:44,800 DEV : loss 0.06861560046672821 - f1-score (micro avg) 0.7008
2023-10-16 22:14:44,813 saving best model
2023-10-16 22:14:45,144 ----------------------------------------------------------------------------------------------------
2023-10-16 22:14:51,840 epoch 2 - iter 154/1546 - loss 0.10594349 - time (sec): 6.70 - samples/sec: 1833.60 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:14:58,684 epoch 2 - iter 308/1546 - loss 0.09339103 - time (sec): 13.54 - samples/sec: 1878.86 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:15:05,450 epoch 2 - iter 462/1546 - loss 0.09323722 - time (sec): 20.30 - samples/sec: 1823.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:15:12,348 epoch 2 - iter 616/1546 - loss 0.08968150 - time (sec): 27.20 - samples/sec: 1814.58 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:15:19,198 epoch 2 - iter 770/1546 - loss 0.09342737 - time (sec): 34.05 - samples/sec: 1798.90 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:15:26,224 epoch 2 - iter 924/1546 - loss 0.08829761 - time (sec): 41.08 - samples/sec: 1820.13 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:15:33,082 epoch 2 - iter 1078/1546 - loss 0.08746364 - time (sec): 47.94 - samples/sec: 1810.86 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:15:39,985 epoch 2 - iter 1232/1546 - loss 0.08799371 - time (sec): 54.84 - samples/sec: 1803.28 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:15:46,769 epoch 2 - iter 1386/1546 - loss 0.08684286 - time (sec): 61.62 - samples/sec: 1799.34 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:15:53,640 epoch 2 - iter 1540/1546 - loss 0.08551341 - time (sec): 68.49 - samples/sec: 1808.43 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:15:53,897 ----------------------------------------------------------------------------------------------------
2023-10-16 22:15:53,897 EPOCH 2 done: loss 0.0854 - lr: 0.000027
2023-10-16 22:15:55,946 DEV : loss 0.061695486307144165 - f1-score (micro avg) 0.7438
2023-10-16 22:15:55,958 saving best model
2023-10-16 22:15:56,336 ----------------------------------------------------------------------------------------------------
2023-10-16 22:16:03,116 epoch 3 - iter 154/1546 - loss 0.05352142 - time (sec): 6.78 - samples/sec: 1856.48 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:16:10,018 epoch 3 - iter 308/1546 - loss 0.05381165 - time (sec): 13.68 - samples/sec: 1831.30 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:16:16,836 epoch 3 - iter 462/1546 - loss 0.05138306 - time (sec): 20.50 - samples/sec: 1818.79 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:16:23,700 epoch 3 - iter 616/1546 - loss 0.05202487 - time (sec): 27.36 - samples/sec: 1799.35 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:16:30,519 epoch 3 - iter 770/1546 - loss 0.05270760 - time (sec): 34.18 - samples/sec: 1781.49 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:16:37,305 epoch 3 - iter 924/1546 - loss 0.05313926 - time (sec): 40.97 - samples/sec: 1769.51 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:16:44,171 epoch 3 - iter 1078/1546 - loss 0.05425075 - time (sec): 47.83 - samples/sec: 1774.25 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:16:51,082 epoch 3 - iter 1232/1546 - loss 0.05410382 - time (sec): 54.75 - samples/sec: 1782.41 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:16:57,962 epoch 3 - iter 1386/1546 - loss 0.05476332 - time (sec): 61.62 - samples/sec: 1789.99 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:17:04,918 epoch 3 - iter 1540/1546 - loss 0.05489666 - time (sec): 68.58 - samples/sec: 1805.90 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:17:05,181 ----------------------------------------------------------------------------------------------------
2023-10-16 22:17:05,181 EPOCH 3 done: loss 0.0547 - lr: 0.000023
2023-10-16 22:17:07,529 DEV : loss 0.061977606266736984 - f1-score (micro avg) 0.7663
2023-10-16 22:17:07,541 saving best model
2023-10-16 22:17:08,003 ----------------------------------------------------------------------------------------------------
2023-10-16 22:17:14,831 epoch 4 - iter 154/1546 - loss 0.03348088 - time (sec): 6.82 - samples/sec: 1727.26 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:17:21,744 epoch 4 - iter 308/1546 - loss 0.03828523 - time (sec): 13.73 - samples/sec: 1842.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:17:28,540 epoch 4 - iter 462/1546 - loss 0.03893862 - time (sec): 20.53 - samples/sec: 1804.31 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:17:35,383 epoch 4 - iter 616/1546 - loss 0.03748261 - time (sec): 27.37 - samples/sec: 1791.42 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:17:42,259 epoch 4 - iter 770/1546 - loss 0.03477433 - time (sec): 34.25 - samples/sec: 1810.90 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:17:49,167 epoch 4 - iter 924/1546 - loss 0.03501152 - time (sec): 41.15 - samples/sec: 1825.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:17:55,980 epoch 4 - iter 1078/1546 - loss 0.03703932 - time (sec): 47.97 - samples/sec: 1801.41 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:18:02,902 epoch 4 - iter 1232/1546 - loss 0.03858821 - time (sec): 54.89 - samples/sec: 1807.98 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:18:09,684 epoch 4 - iter 1386/1546 - loss 0.03914559 - time (sec): 61.67 - samples/sec: 1802.93 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:18:17,009 epoch 4 - iter 1540/1546 - loss 0.03933452 - time (sec): 69.00 - samples/sec: 1796.19 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:18:17,268 ----------------------------------------------------------------------------------------------------
2023-10-16 22:18:17,269 EPOCH 4 done: loss 0.0394 - lr: 0.000020
2023-10-16 22:18:19,309 DEV : loss 0.0907452255487442 - f1-score (micro avg) 0.7805
2023-10-16 22:18:19,323 saving best model
2023-10-16 22:18:19,768 ----------------------------------------------------------------------------------------------------
2023-10-16 22:18:26,862 epoch 5 - iter 154/1546 - loss 0.02856650 - time (sec): 7.09 - samples/sec: 1732.93 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:18:33,828 epoch 5 - iter 308/1546 - loss 0.02438427 - time (sec): 14.06 - samples/sec: 1786.41 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:18:40,685 epoch 5 - iter 462/1546 - loss 0.02487299 - time (sec): 20.91 - samples/sec: 1793.05 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:18:47,656 epoch 5 - iter 616/1546 - loss 0.02566201 - time (sec): 27.88 - samples/sec: 1825.66 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:18:54,562 epoch 5 - iter 770/1546 - loss 0.02715511 - time (sec): 34.79 - samples/sec: 1825.46 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:19:01,378 epoch 5 - iter 924/1546 - loss 0.02767064 - time (sec): 41.61 - samples/sec: 1828.00 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:19:08,338 epoch 5 - iter 1078/1546 - loss 0.02706633 - time (sec): 48.57 - samples/sec: 1834.40 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:19:15,159 epoch 5 - iter 1232/1546 - loss 0.02726636 - time (sec): 55.39 - samples/sec: 1823.57 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:19:21,988 epoch 5 - iter 1386/1546 - loss 0.02736502 - time (sec): 62.22 - samples/sec: 1808.57 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:19:28,800 epoch 5 - iter 1540/1546 - loss 0.02672501 - time (sec): 69.03 - samples/sec: 1795.96 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:19:29,066 ----------------------------------------------------------------------------------------------------
2023-10-16 22:19:29,067 EPOCH 5 done: loss 0.0267 - lr: 0.000017
2023-10-16 22:19:31,125 DEV : loss 0.10103413462638855 - f1-score (micro avg) 0.7935
2023-10-16 22:19:31,138 saving best model
2023-10-16 22:19:31,609 ----------------------------------------------------------------------------------------------------
2023-10-16 22:19:38,520 epoch 6 - iter 154/1546 - loss 0.01115327 - time (sec): 6.91 - samples/sec: 1773.33 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:19:45,472 epoch 6 - iter 308/1546 - loss 0.01261767 - time (sec): 13.86 - samples/sec: 1775.01 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:19:52,353 epoch 6 - iter 462/1546 - loss 0.01720971 - time (sec): 20.74 - samples/sec: 1760.86 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:19:59,182 epoch 6 - iter 616/1546 - loss 0.01700125 - time (sec): 27.57 - samples/sec: 1789.53 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:20:05,960 epoch 6 - iter 770/1546 - loss 0.01632348 - time (sec): 34.35 - samples/sec: 1797.98 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:20:12,846 epoch 6 - iter 924/1546 - loss 0.01648542 - time (sec): 41.23 - samples/sec: 1822.12 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:20:19,764 epoch 6 - iter 1078/1546 - loss 0.01624090 - time (sec): 48.15 - samples/sec: 1807.88 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:20:26,576 epoch 6 - iter 1232/1546 - loss 0.01732653 - time (sec): 54.96 - samples/sec: 1798.95 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:20:33,435 epoch 6 - iter 1386/1546 - loss 0.01698745 - time (sec): 61.82 - samples/sec: 1796.22 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:20:40,310 epoch 6 - iter 1540/1546 - loss 0.01680358 - time (sec): 68.70 - samples/sec: 1804.24 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:20:40,579 ----------------------------------------------------------------------------------------------------
2023-10-16 22:20:40,579 EPOCH 6 done: loss 0.0168 - lr: 0.000013
2023-10-16 22:20:42,593 DEV : loss 0.10263525694608688 - f1-score (micro avg) 0.7835
2023-10-16 22:20:42,605 ----------------------------------------------------------------------------------------------------
2023-10-16 22:20:49,419 epoch 7 - iter 154/1546 - loss 0.01073330 - time (sec): 6.81 - samples/sec: 1767.10 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:20:56,281 epoch 7 - iter 308/1546 - loss 0.01268942 - time (sec): 13.67 - samples/sec: 1748.02 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:21:03,138 epoch 7 - iter 462/1546 - loss 0.01130454 - time (sec): 20.53 - samples/sec: 1771.55 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:21:09,977 epoch 7 - iter 616/1546 - loss 0.01193437 - time (sec): 27.37 - samples/sec: 1775.93 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:21:16,911 epoch 7 - iter 770/1546 - loss 0.01224745 - time (sec): 34.30 - samples/sec: 1789.42 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:21:23,792 epoch 7 - iter 924/1546 - loss 0.01174877 - time (sec): 41.19 - samples/sec: 1805.47 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:21:30,715 epoch 7 - iter 1078/1546 - loss 0.01116054 - time (sec): 48.11 - samples/sec: 1807.28 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:21:37,649 epoch 7 - iter 1232/1546 - loss 0.01077128 - time (sec): 55.04 - samples/sec: 1799.13 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:21:44,464 epoch 7 - iter 1386/1546 - loss 0.01068182 - time (sec): 61.86 - samples/sec: 1799.59 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:21:51,345 epoch 7 - iter 1540/1546 - loss 0.01058741 - time (sec): 68.74 - samples/sec: 1803.61 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:21:51,605 ----------------------------------------------------------------------------------------------------
2023-10-16 22:21:51,605 EPOCH 7 done: loss 0.0106 - lr: 0.000010
2023-10-16 22:21:53,621 DEV : loss 0.11100788414478302 - f1-score (micro avg) 0.7739
2023-10-16 22:21:53,634 ----------------------------------------------------------------------------------------------------
2023-10-16 22:22:00,462 epoch 8 - iter 154/1546 - loss 0.00712344 - time (sec): 6.83 - samples/sec: 1667.10 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:22:07,489 epoch 8 - iter 308/1546 - loss 0.00737014 - time (sec): 13.85 - samples/sec: 1786.24 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:22:14,432 epoch 8 - iter 462/1546 - loss 0.00844824 - time (sec): 20.80 - samples/sec: 1819.72 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:22:21,368 epoch 8 - iter 616/1546 - loss 0.00756277 - time (sec): 27.73 - samples/sec: 1839.47 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:22:28,177 epoch 8 - iter 770/1546 - loss 0.00738767 - time (sec): 34.54 - samples/sec: 1825.85 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:22:34,927 epoch 8 - iter 924/1546 - loss 0.00797400 - time (sec): 41.29 - samples/sec: 1809.26 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:22:41,703 epoch 8 - iter 1078/1546 - loss 0.00749557 - time (sec): 48.07 - samples/sec: 1812.26 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:22:48,592 epoch 8 - iter 1232/1546 - loss 0.00775779 - time (sec): 54.96 - samples/sec: 1814.63 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:22:55,442 epoch 8 - iter 1386/1546 - loss 0.00804980 - time (sec): 61.81 - samples/sec: 1814.91 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:23:02,230 epoch 8 - iter 1540/1546 - loss 0.00758625 - time (sec): 68.60 - samples/sec: 1804.27 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:23:02,492 ----------------------------------------------------------------------------------------------------
2023-10-16 22:23:02,492 EPOCH 8 done: loss 0.0076 - lr: 0.000007
2023-10-16 22:23:04,849 DEV : loss 0.11394007503986359 - f1-score (micro avg) 0.7967
2023-10-16 22:23:04,862 saving best model
2023-10-16 22:23:05,318 ----------------------------------------------------------------------------------------------------
2023-10-16 22:23:12,200 epoch 9 - iter 154/1546 - loss 0.00573957 - time (sec): 6.88 - samples/sec: 1709.00 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:23:19,147 epoch 9 - iter 308/1546 - loss 0.00372078 - time (sec): 13.83 - samples/sec: 1719.76 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:23:26,016 epoch 9 - iter 462/1546 - loss 0.00469241 - time (sec): 20.70 - samples/sec: 1796.59 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:23:32,827 epoch 9 - iter 616/1546 - loss 0.00728260 - time (sec): 27.51 - samples/sec: 1786.73 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:23:39,731 epoch 9 - iter 770/1546 - loss 0.00701758 - time (sec): 34.41 - samples/sec: 1807.80 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:23:46,646 epoch 9 - iter 924/1546 - loss 0.00660586 - time (sec): 41.33 - samples/sec: 1797.68 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:23:53,527 epoch 9 - iter 1078/1546 - loss 0.00602231 - time (sec): 48.21 - samples/sec: 1805.05 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:24:00,289 epoch 9 - iter 1232/1546 - loss 0.00642192 - time (sec): 54.97 - samples/sec: 1801.13 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:24:07,070 epoch 9 - iter 1386/1546 - loss 0.00626127 - time (sec): 61.75 - samples/sec: 1800.99 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:24:13,888 epoch 9 - iter 1540/1546 - loss 0.00615296 - time (sec): 68.57 - samples/sec: 1807.95 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:24:14,136 ----------------------------------------------------------------------------------------------------
2023-10-16 22:24:14,136 EPOCH 9 done: loss 0.0062 - lr: 0.000003
2023-10-16 22:24:16,156 DEV : loss 0.11375954002141953 - f1-score (micro avg) 0.7926
2023-10-16 22:24:16,169 ----------------------------------------------------------------------------------------------------
2023-10-16 22:24:22,994 epoch 10 - iter 154/1546 - loss 0.00252186 - time (sec): 6.82 - samples/sec: 1789.34 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:24:29,820 epoch 10 - iter 308/1546 - loss 0.00144410 - time (sec): 13.65 - samples/sec: 1820.64 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:24:36,641 epoch 10 - iter 462/1546 - loss 0.00151338 - time (sec): 20.47 - samples/sec: 1816.93 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:24:43,491 epoch 10 - iter 616/1546 - loss 0.00162804 - time (sec): 27.32 - samples/sec: 1819.88 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:24:50,236 epoch 10 - iter 770/1546 - loss 0.00277701 - time (sec): 34.07 - samples/sec: 1826.95 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:24:57,155 epoch 10 - iter 924/1546 - loss 0.00280504 - time (sec): 40.98 - samples/sec: 1826.24 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:25:04,089 epoch 10 - iter 1078/1546 - loss 0.00303272 - time (sec): 47.92 - samples/sec: 1814.17 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:25:10,898 epoch 10 - iter 1232/1546 - loss 0.00317340 - time (sec): 54.73 - samples/sec: 1815.67 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:25:17,708 epoch 10 - iter 1386/1546 - loss 0.00315536 - time (sec): 61.54 - samples/sec: 1813.27 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:25:24,582 epoch 10 - iter 1540/1546 - loss 0.00355373 - time (sec): 68.41 - samples/sec: 1811.86 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:25:24,842 ----------------------------------------------------------------------------------------------------
2023-10-16 22:25:24,842 EPOCH 10 done: loss 0.0035 - lr: 0.000000
2023-10-16 22:25:26,849 DEV : loss 0.11292611062526703 - f1-score (micro avg) 0.8017
2023-10-16 22:25:26,861 saving best model
2023-10-16 22:25:27,661 ----------------------------------------------------------------------------------------------------
2023-10-16 22:25:27,663 Loading model from best epoch ...
2023-10-16 22:25:29,149 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 22:25:35,105
Results:
- F-score (micro) 0.8044
- F-score (macro) 0.7149
- Accuracy 0.6943

By class:
              precision    recall  f1-score   support

         LOC     0.8392    0.8605    0.8497       946
    BUILDING     0.6329    0.5405    0.5831       185
      STREET     0.6774    0.7500    0.7119        56

   micro avg     0.8034    0.8054    0.8044      1187
   macro avg     0.7165    0.7170    0.7149      1187
weighted avg     0.7994    0.8054    0.8016      1187

2023-10-16 22:25:35,105 ----------------------------------------------------------------------------------------------------