2023-10-16 20:49:01,964 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 20:49:01,965 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 20:49:01,965 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 Train: 6183 sentences
2023-10-16 20:49:01,965 (train_with_dev=False, train_with_test=False)
2023-10-16 20:49:01,965 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 Training Params:
2023-10-16 20:49:01,965 - learning_rate: "3e-05"
2023-10-16 20:49:01,965 - mini_batch_size: "4"
2023-10-16 20:49:01,965 - max_epochs: "10"
2023-10-16 20:49:01,965 - shuffle: "True"
2023-10-16 20:49:01,965 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 Plugins:
2023-10-16 20:49:01,965 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 20:49:01,965 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 20:49:01,965 - metric: "('micro avg', 'f1-score')"
2023-10-16 20:49:01,965 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,965 Computation:
2023-10-16 20:49:01,965 - compute on device: cuda:0
2023-10-16 20:49:01,966 - embedding storage: none
2023-10-16 20:49:01,966 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,966 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 20:49:01,966 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:01,966 ----------------------------------------------------------------------------------------------------
2023-10-16 20:49:10,232 epoch 1 - iter 154/1546 - loss 2.04050695 - time (sec): 8.27 - samples/sec: 1543.19 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:49:17,106 epoch 1 - iter 308/1546 - loss 1.18613454 - time (sec): 15.14 - samples/sec: 1607.15 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:49:23,884 epoch 1 - iter 462/1546 - loss 0.84944225 - time (sec): 21.92 - samples/sec: 1650.35 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:49:30,718 epoch 1 - iter 616/1546 - loss 0.67863507 - time (sec): 28.75 - samples/sec: 1680.50 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:49:37,588 epoch 1 - iter 770/1546 - loss 0.57353723 - time (sec): 35.62 - samples/sec: 1685.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:49:44,597 epoch 1 - iter 924/1546 - loss 0.49451384 - time (sec): 42.63 - samples/sec: 1706.02 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:49:51,543 epoch 1 - iter 1078/1546 - loss 0.43902727 - time (sec): 49.58 - samples/sec: 1721.20 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:49:58,427 epoch 1 - iter 1232/1546 - loss 0.39809745 - time (sec): 56.46 - samples/sec: 1739.61 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:50:05,581 epoch 1 - iter 1386/1546 - loss 0.36547430 - time (sec): 63.61 - samples/sec: 1747.62 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:50:12,568 epoch 1 - iter 1540/1546 - loss 0.33904665 - time (sec): 70.60 - samples/sec: 1755.91 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:50:12,830 ----------------------------------------------------------------------------------------------------
2023-10-16 20:50:12,830 EPOCH 1 done: loss 0.3384 - lr: 0.000030
2023-10-16 20:50:14,597 DEV : loss 0.06913874298334122 - f1-score (micro avg) 0.7313
2023-10-16 20:50:14,612 saving best model
2023-10-16 20:50:15,166 ----------------------------------------------------------------------------------------------------
2023-10-16 20:50:23,126 epoch 2 - iter 154/1546 - loss 0.08438780 - time (sec): 7.96 - samples/sec: 1597.76 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:50:31,248 epoch 2 - iter 308/1546 - loss 0.09627467 - time (sec): 16.08 - samples/sec: 1582.08 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:50:38,715 epoch 2 - iter 462/1546 - loss 0.09832161 - time (sec): 23.55 - samples/sec: 1600.82 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:50:46,164 epoch 2 - iter 616/1546 - loss 0.09084457 - time (sec): 31.00 - samples/sec: 1635.83 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:50:53,052 epoch 2 - iter 770/1546 - loss 0.09363305 - time (sec): 37.88 - samples/sec: 1655.86 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:50:59,879 epoch 2 - iter 924/1546 - loss 0.09333953 - time (sec): 44.71 - samples/sec: 1682.51 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:51:06,756 epoch 2 - iter 1078/1546 - loss 0.09219892 - time (sec): 51.59 - samples/sec: 1692.81 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:51:13,668 epoch 2 - iter 1232/1546 - loss 0.09015760 - time (sec): 58.50 - samples/sec: 1693.79 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:51:20,778 epoch 2 - iter 1386/1546 - loss 0.08869022 - time (sec): 65.61 - samples/sec: 1697.34 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:51:27,731 epoch 2 - iter 1540/1546 - loss 0.08975901 - time (sec): 72.56 - samples/sec: 1704.23 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:51:28,036 ----------------------------------------------------------------------------------------------------
2023-10-16 20:51:28,036 EPOCH 2 done: loss 0.0897 - lr: 0.000027
2023-10-16 20:51:30,497 DEV : loss 0.062400542199611664 - f1-score (micro avg) 0.7432
2023-10-16 20:51:30,511 saving best model
2023-10-16 20:51:31,004 ----------------------------------------------------------------------------------------------------
2023-10-16 20:51:37,901 epoch 3 - iter 154/1546 - loss 0.06053105 - time (sec): 6.89 - samples/sec: 1630.47 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:51:44,765 epoch 3 - iter 308/1546 - loss 0.05492019 - time (sec): 13.76 - samples/sec: 1745.54 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:51:51,617 epoch 3 - iter 462/1546 - loss 0.06003718 - time (sec): 20.61 - samples/sec: 1764.49 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:51:58,419 epoch 3 - iter 616/1546 - loss 0.05518600 - time (sec): 27.41 - samples/sec: 1771.94 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:52:05,480 epoch 3 - iter 770/1546 - loss 0.05822749 - time (sec): 34.47 - samples/sec: 1779.59 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:52:12,376 epoch 3 - iter 924/1546 - loss 0.05981956 - time (sec): 41.37 - samples/sec: 1781.70 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:52:19,309 epoch 3 - iter 1078/1546 - loss 0.05732479 - time (sec): 48.30 - samples/sec: 1784.82 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:52:26,317 epoch 3 - iter 1232/1546 - loss 0.06037730 - time (sec): 55.31 - samples/sec: 1784.84 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:52:33,347 epoch 3 - iter 1386/1546 - loss 0.05953915 - time (sec): 62.34 - samples/sec: 1779.99 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:52:40,504 epoch 3 - iter 1540/1546 - loss 0.06029278 - time (sec): 69.50 - samples/sec: 1778.49 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:52:40,824 ----------------------------------------------------------------------------------------------------
2023-10-16 20:52:40,824 EPOCH 3 done: loss 0.0600 - lr: 0.000023
2023-10-16 20:52:42,988 DEV : loss 0.07797159254550934 - f1-score (micro avg) 0.7601
2023-10-16 20:52:43,006 saving best model
2023-10-16 20:52:43,593 ----------------------------------------------------------------------------------------------------
2023-10-16 20:52:50,785 epoch 4 - iter 154/1546 - loss 0.04625746 - time (sec): 7.19 - samples/sec: 1675.40 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:52:57,832 epoch 4 - iter 308/1546 - loss 0.04071889 - time (sec): 14.24 - samples/sec: 1805.67 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:53:04,736 epoch 4 - iter 462/1546 - loss 0.04042915 - time (sec): 21.14 - samples/sec: 1771.77 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:53:11,614 epoch 4 - iter 616/1546 - loss 0.03895070 - time (sec): 28.02 - samples/sec: 1788.97 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:53:18,562 epoch 4 - iter 770/1546 - loss 0.03791720 - time (sec): 34.97 - samples/sec: 1806.84 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:53:25,352 epoch 4 - iter 924/1546 - loss 0.03865879 - time (sec): 41.76 - samples/sec: 1805.06 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:53:32,304 epoch 4 - iter 1078/1546 - loss 0.03898360 - time (sec): 48.71 - samples/sec: 1803.59 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:53:39,033 epoch 4 - iter 1232/1546 - loss 0.03898183 - time (sec): 55.44 - samples/sec: 1798.60 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:53:46,010 epoch 4 - iter 1386/1546 - loss 0.03954707 - time (sec): 62.41 - samples/sec: 1803.81 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:53:52,758 epoch 4 - iter 1540/1546 - loss 0.03922934 - time (sec): 69.16 - samples/sec: 1791.34 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:53:53,022 ----------------------------------------------------------------------------------------------------
2023-10-16 20:53:53,022 EPOCH 4 done: loss 0.0392 - lr: 0.000020
2023-10-16 20:53:55,308 DEV : loss 0.1010911613702774 - f1-score (micro avg) 0.7709
2023-10-16 20:53:55,321 saving best model
2023-10-16 20:53:55,833 ----------------------------------------------------------------------------------------------------
2023-10-16 20:54:02,895 epoch 5 - iter 154/1546 - loss 0.03491660 - time (sec): 7.06 - samples/sec: 1891.66 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:54:09,782 epoch 5 - iter 308/1546 - loss 0.03112251 - time (sec): 13.94 - samples/sec: 1894.77 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:54:16,689 epoch 5 - iter 462/1546 - loss 0.02985586 - time (sec): 20.85 - samples/sec: 1863.01 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:54:23,397 epoch 5 - iter 616/1546 - loss 0.02761481 - time (sec): 27.56 - samples/sec: 1839.34 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:54:30,218 epoch 5 - iter 770/1546 - loss 0.02541894 - time (sec): 34.38 - samples/sec: 1800.90 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:54:37,025 epoch 5 - iter 924/1546 - loss 0.02635009 - time (sec): 41.19 - samples/sec: 1789.32 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:54:43,921 epoch 5 - iter 1078/1546 - loss 0.02918319 - time (sec): 48.08 - samples/sec: 1769.74 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:54:51,045 epoch 5 - iter 1232/1546 - loss 0.02832300 - time (sec): 55.21 - samples/sec: 1778.57 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:54:58,031 epoch 5 - iter 1386/1546 - loss 0.02761248 - time (sec): 62.19 - samples/sec: 1782.93 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:55:05,177 epoch 5 - iter 1540/1546 - loss 0.02672369 - time (sec): 69.34 - samples/sec: 1785.53 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:55:05,463 ----------------------------------------------------------------------------------------------------
2023-10-16 20:55:05,463 EPOCH 5 done: loss 0.0271 - lr: 0.000017
2023-10-16 20:55:07,686 DEV : loss 0.10391230136156082 - f1-score (micro avg) 0.7862
2023-10-16 20:55:07,703 saving best model
2023-10-16 20:55:08,185 ----------------------------------------------------------------------------------------------------
2023-10-16 20:55:15,057 epoch 6 - iter 154/1546 - loss 0.02493865 - time (sec): 6.87 - samples/sec: 1834.39 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:55:21,919 epoch 6 - iter 308/1546 - loss 0.02188742 - time (sec): 13.73 - samples/sec: 1789.72 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:55:28,980 epoch 6 - iter 462/1546 - loss 0.02105559 - time (sec): 20.79 - samples/sec: 1779.32 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:55:35,949 epoch 6 - iter 616/1546 - loss 0.01961893 - time (sec): 27.76 - samples/sec: 1787.26 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:55:42,862 epoch 6 - iter 770/1546 - loss 0.02113852 - time (sec): 34.68 - samples/sec: 1793.89 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:55:49,800 epoch 6 - iter 924/1546 - loss 0.02137338 - time (sec): 41.61 - samples/sec: 1790.78 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:55:56,804 epoch 6 - iter 1078/1546 - loss 0.02060005 - time (sec): 48.62 - samples/sec: 1792.52 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:56:03,662 epoch 6 - iter 1232/1546 - loss 0.02034437 - time (sec): 55.48 - samples/sec: 1798.00 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:56:10,662 epoch 6 - iter 1386/1546 - loss 0.02021798 - time (sec): 62.48 - samples/sec: 1792.01 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:56:17,575 epoch 6 - iter 1540/1546 - loss 0.02057104 - time (sec): 69.39 - samples/sec: 1783.53 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:56:17,837 ----------------------------------------------------------------------------------------------------
2023-10-16 20:56:17,837 EPOCH 6 done: loss 0.0205 - lr: 0.000013
2023-10-16 20:56:20,024 DEV : loss 0.10331042110919952 - f1-score (micro avg) 0.7887
2023-10-16 20:56:20,038 saving best model
2023-10-16 20:56:20,460 ----------------------------------------------------------------------------------------------------
2023-10-16 20:56:27,345 epoch 7 - iter 154/1546 - loss 0.01205076 - time (sec): 6.88 - samples/sec: 1702.65 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:56:34,241 epoch 7 - iter 308/1546 - loss 0.01298242 - time (sec): 13.78 - samples/sec: 1755.29 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:56:41,193 epoch 7 - iter 462/1546 - loss 0.01346244 - time (sec): 20.73 - samples/sec: 1757.85 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:56:48,314 epoch 7 - iter 616/1546 - loss 0.01319763 - time (sec): 27.85 - samples/sec: 1746.00 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:56:55,311 epoch 7 - iter 770/1546 - loss 0.01263468 - time (sec): 34.85 - samples/sec: 1744.52 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:57:02,573 epoch 7 - iter 924/1546 - loss 0.01299134 - time (sec): 42.11 - samples/sec: 1749.31 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:57:09,583 epoch 7 - iter 1078/1546 - loss 0.01334880 - time (sec): 49.12 - samples/sec: 1760.41 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:57:16,841 epoch 7 - iter 1232/1546 - loss 0.01406829 - time (sec): 56.38 - samples/sec: 1749.55 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:57:23,787 epoch 7 - iter 1386/1546 - loss 0.01364924 - time (sec): 63.33 - samples/sec: 1740.60 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:57:31,145 epoch 7 - iter 1540/1546 - loss 0.01416500 - time (sec): 70.68 - samples/sec: 1752.96 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:57:31,419 ----------------------------------------------------------------------------------------------------
2023-10-16 20:57:31,419 EPOCH 7 done: loss 0.0141 - lr: 0.000010
2023-10-16 20:57:33,633 DEV : loss 0.10377668589353561 - f1-score (micro avg) 0.7869
2023-10-16 20:57:33,646 ----------------------------------------------------------------------------------------------------
2023-10-16 20:57:40,710 epoch 8 - iter 154/1546 - loss 0.00994108 - time (sec): 7.06 - samples/sec: 1755.04 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:57:47,679 epoch 8 - iter 308/1546 - loss 0.01034994 - time (sec): 14.03 - samples/sec: 1737.29 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:57:54,447 epoch 8 - iter 462/1546 - loss 0.01317594 - time (sec): 20.80 - samples/sec: 1732.51 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:58:01,521 epoch 8 - iter 616/1546 - loss 0.01148351 - time (sec): 27.87 - samples/sec: 1744.13 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:58:08,660 epoch 8 - iter 770/1546 - loss 0.01104651 - time (sec): 35.01 - samples/sec: 1756.59 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:58:15,694 epoch 8 - iter 924/1546 - loss 0.00978163 - time (sec): 42.05 - samples/sec: 1733.72 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:58:22,674 epoch 8 - iter 1078/1546 - loss 0.00959353 - time (sec): 49.03 - samples/sec: 1742.16 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:58:29,695 epoch 8 - iter 1232/1546 - loss 0.00941851 - time (sec): 56.05 - samples/sec: 1745.65 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:58:36,930 epoch 8 - iter 1386/1546 - loss 0.00865968 - time (sec): 63.28 - samples/sec: 1755.25 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:58:44,146 epoch 8 - iter 1540/1546 - loss 0.00911298 - time (sec): 70.50 - samples/sec: 1756.63 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:58:44,415 ----------------------------------------------------------------------------------------------------
2023-10-16 20:58:44,415 EPOCH 8 done: loss 0.0091 - lr: 0.000007
2023-10-16 20:58:46,570 DEV : loss 0.11925285309553146 - f1-score (micro avg) 0.7832
2023-10-16 20:58:46,582 ----------------------------------------------------------------------------------------------------
2023-10-16 20:58:53,588 epoch 9 - iter 154/1546 - loss 0.01184549 - time (sec): 7.00 - samples/sec: 1841.40 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:59:00,766 epoch 9 - iter 308/1546 - loss 0.00842207 - time (sec): 14.18 - samples/sec: 1827.96 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:59:07,864 epoch 9 - iter 462/1546 - loss 0.00902081 - time (sec): 21.28 - samples/sec: 1811.55 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:59:15,008 epoch 9 - iter 616/1546 - loss 0.00794030 - time (sec): 28.42 - samples/sec: 1782.48 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:59:22,257 epoch 9 - iter 770/1546 - loss 0.00768491 - time (sec): 35.67 - samples/sec: 1766.33 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:59:29,350 epoch 9 - iter 924/1546 - loss 0.00720504 - time (sec): 42.77 - samples/sec: 1766.77 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:59:36,291 epoch 9 - iter 1078/1546 - loss 0.00677038 - time (sec): 49.71 - samples/sec: 1760.48 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:59:43,351 epoch 9 - iter 1232/1546 - loss 0.00643558 - time (sec): 56.77 - samples/sec: 1761.58 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:59:50,357 epoch 9 - iter 1386/1546 - loss 0.00636983 - time (sec): 63.77 - samples/sec: 1756.55 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:59:57,350 epoch 9 - iter 1540/1546 - loss 0.00634724 - time (sec): 70.77 - samples/sec: 1751.55 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:59:57,618 ----------------------------------------------------------------------------------------------------
2023-10-16 20:59:57,618 EPOCH 9 done: loss 0.0064 - lr: 0.000003
2023-10-16 20:59:59,773 DEV : loss 0.1136489063501358 - f1-score (micro avg) 0.7821
2023-10-16 20:59:59,786 ----------------------------------------------------------------------------------------------------
2023-10-16 21:00:06,666 epoch 10 - iter 154/1546 - loss 0.00181108 - time (sec): 6.88 - samples/sec: 1703.72 - lr: 0.000003 - momentum: 0.000000
2023-10-16 21:00:13,607 epoch 10 - iter 308/1546 - loss 0.00269566 - time (sec): 13.82 - samples/sec: 1733.95 - lr: 0.000003 - momentum: 0.000000
2023-10-16 21:00:20,897 epoch 10 - iter 462/1546 - loss 0.00441727 - time (sec): 21.11 - samples/sec: 1737.06 - lr: 0.000002 - momentum: 0.000000
2023-10-16 21:00:28,009 epoch 10 - iter 616/1546 - loss 0.00384133 - time (sec): 28.22 - samples/sec: 1775.35 - lr: 0.000002 - momentum: 0.000000
2023-10-16 21:00:35,093 epoch 10 - iter 770/1546 - loss 0.00365736 - time (sec): 35.31 - samples/sec: 1772.83 - lr: 0.000002 - momentum: 0.000000
2023-10-16 21:00:42,021 epoch 10 - iter 924/1546 - loss 0.00403936 - time (sec): 42.23 - samples/sec: 1765.43 - lr: 0.000001 - momentum: 0.000000
2023-10-16 21:00:49,101 epoch 10 - iter 1078/1546 - loss 0.00464171 - time (sec): 49.31 - samples/sec: 1781.32 - lr: 0.000001 - momentum: 0.000000
2023-10-16 21:00:56,091 epoch 10 - iter 1232/1546 - loss 0.00435505 - time (sec): 56.30 - samples/sec: 1762.10 - lr: 0.000001 - momentum: 0.000000
2023-10-16 21:01:03,204 epoch 10 - iter 1386/1546 - loss 0.00412570 - time (sec): 63.42 - samples/sec: 1764.89 - lr: 0.000000 - momentum: 0.000000
2023-10-16 21:01:10,180 epoch 10 - iter 1540/1546 - loss 0.00413572 - time (sec): 70.39 - samples/sec: 1761.54 - lr: 0.000000 - momentum: 0.000000
2023-10-16 21:01:10,429 ----------------------------------------------------------------------------------------------------
2023-10-16 21:01:10,429 EPOCH 10 done: loss 0.0041 - lr: 0.000000
2023-10-16 21:01:12,588 DEV : loss 0.11747898161411285 - f1-score (micro avg) 0.791
2023-10-16 21:01:12,604 saving best model
2023-10-16 21:01:13,588 ----------------------------------------------------------------------------------------------------
2023-10-16 21:01:13,590 Loading model from best epoch ...
2023-10-16 21:01:15,618 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 21:01:22,263 Results:
- F-score (micro) 0.8175
- F-score (macro) 0.7428
- Accuracy 0.7133

By class:
              precision    recall  f1-score   support

         LOC     0.8397    0.8636    0.8515       946
    BUILDING     0.6584    0.7189    0.6873       185
      STREET     0.6667    0.7143    0.6897        56

   micro avg     0.8016    0.8340    0.8175      1187
   macro avg     0.7216    0.7656    0.7428      1187
weighted avg     0.8033    0.8340    0.8183      1187

2023-10-16 21:01:22,264 ----------------------------------------------------------------------------------------------------