2023-10-18 22:08:20,553 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,553 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:08:20,553 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Train: 5777 sentences
2023-10-18 22:08:20,554 (train_with_dev=False, train_with_test=False)
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Training Params:
2023-10-18 22:08:20,554 - learning_rate: "3e-05"
2023-10-18 22:08:20,554 - mini_batch_size: "8"
2023-10-18 22:08:20,554 - max_epochs: "10"
2023-10-18 22:08:20,554 - shuffle: "True"
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Plugins:
2023-10-18 22:08:20,554 - TensorboardLogger
2023-10-18 22:08:20,554 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:08:20,554 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Computation:
2023-10-18 22:08:20,554 - compute on device: cuda:0
2023-10-18 22:08:20,554 - embedding storage: none
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Logging anything other than scalars to TensorBoard is currently
not supported. 2023-10-18 22:08:22,339 epoch 1 - iter 72/723 - loss 3.19628147 - time (sec): 1.78 - samples/sec: 9675.62 - lr: 0.000003 - momentum: 0.000000 2023-10-18 22:08:24,206 epoch 1 - iter 144/723 - loss 2.98886129 - time (sec): 3.65 - samples/sec: 9785.27 - lr: 0.000006 - momentum: 0.000000 2023-10-18 22:08:26,021 epoch 1 - iter 216/723 - loss 2.66211908 - time (sec): 5.47 - samples/sec: 9773.41 - lr: 0.000009 - momentum: 0.000000 2023-10-18 22:08:27,869 epoch 1 - iter 288/723 - loss 2.30691520 - time (sec): 7.31 - samples/sec: 9694.54 - lr: 0.000012 - momentum: 0.000000 2023-10-18 22:08:29,724 epoch 1 - iter 360/723 - loss 1.94819460 - time (sec): 9.17 - samples/sec: 9712.82 - lr: 0.000015 - momentum: 0.000000 2023-10-18 22:08:31,493 epoch 1 - iter 432/723 - loss 1.67747636 - time (sec): 10.94 - samples/sec: 9797.48 - lr: 0.000018 - momentum: 0.000000 2023-10-18 22:08:33,327 epoch 1 - iter 504/723 - loss 1.48735944 - time (sec): 12.77 - samples/sec: 9773.89 - lr: 0.000021 - momentum: 0.000000 2023-10-18 22:08:35,091 epoch 1 - iter 576/723 - loss 1.34523933 - time (sec): 14.54 - samples/sec: 9791.10 - lr: 0.000024 - momentum: 0.000000 2023-10-18 22:08:36,824 epoch 1 - iter 648/723 - loss 1.23775011 - time (sec): 16.27 - samples/sec: 9748.59 - lr: 0.000027 - momentum: 0.000000 2023-10-18 22:08:38,548 epoch 1 - iter 720/723 - loss 1.14588291 - time (sec): 17.99 - samples/sec: 9754.09 - lr: 0.000030 - momentum: 0.000000 2023-10-18 22:08:38,637 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:08:38,637 EPOCH 1 done: loss 1.1433 - lr: 0.000030 2023-10-18 22:08:39,957 DEV : loss 0.36958980560302734 - f1-score (micro avg) 0.0 2023-10-18 22:08:39,971 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:08:41,819 epoch 2 - iter 72/723 - loss 0.27018856 - time (sec): 1.85 - samples/sec: 10070.33 - lr: 0.000030 - momentum: 0.000000 2023-10-18 
22:08:43,579 epoch 2 - iter 144/723 - loss 0.28145989 - time (sec): 3.61 - samples/sec: 9918.45 - lr: 0.000029 - momentum: 0.000000 2023-10-18 22:08:45,396 epoch 2 - iter 216/723 - loss 0.28554002 - time (sec): 5.42 - samples/sec: 9927.00 - lr: 0.000029 - momentum: 0.000000 2023-10-18 22:08:47,278 epoch 2 - iter 288/723 - loss 0.27514712 - time (sec): 7.31 - samples/sec: 9805.31 - lr: 0.000029 - momentum: 0.000000 2023-10-18 22:08:49,104 epoch 2 - iter 360/723 - loss 0.26170040 - time (sec): 9.13 - samples/sec: 9725.21 - lr: 0.000028 - momentum: 0.000000 2023-10-18 22:08:50,949 epoch 2 - iter 432/723 - loss 0.25484094 - time (sec): 10.98 - samples/sec: 9792.59 - lr: 0.000028 - momentum: 0.000000 2023-10-18 22:08:52,671 epoch 2 - iter 504/723 - loss 0.25302933 - time (sec): 12.70 - samples/sec: 9736.45 - lr: 0.000028 - momentum: 0.000000 2023-10-18 22:08:54,386 epoch 2 - iter 576/723 - loss 0.24981194 - time (sec): 14.42 - samples/sec: 9726.09 - lr: 0.000027 - momentum: 0.000000 2023-10-18 22:08:56,178 epoch 2 - iter 648/723 - loss 0.24711782 - time (sec): 16.21 - samples/sec: 9733.59 - lr: 0.000027 - momentum: 0.000000 2023-10-18 22:08:57,963 epoch 2 - iter 720/723 - loss 0.24099947 - time (sec): 17.99 - samples/sec: 9770.09 - lr: 0.000027 - momentum: 0.000000 2023-10-18 22:08:58,019 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:08:58,020 EPOCH 2 done: loss 0.2412 - lr: 0.000027 2023-10-18 22:09:00,129 DEV : loss 0.24794618785381317 - f1-score (micro avg) 0.1606 2023-10-18 22:09:00,145 saving best model 2023-10-18 22:09:00,182 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:09:02,154 epoch 3 - iter 72/723 - loss 0.21367677 - time (sec): 1.97 - samples/sec: 9104.26 - lr: 0.000026 - momentum: 0.000000 2023-10-18 22:09:03,878 epoch 3 - iter 144/723 - loss 0.21394777 - time (sec): 3.70 - samples/sec: 9404.23 - lr: 0.000026 - momentum: 
0.000000 2023-10-18 22:09:05,631 epoch 3 - iter 216/723 - loss 0.20709447 - time (sec): 5.45 - samples/sec: 9586.09 - lr: 0.000026 - momentum: 0.000000 2023-10-18 22:09:07,509 epoch 3 - iter 288/723 - loss 0.19573277 - time (sec): 7.33 - samples/sec: 9663.93 - lr: 0.000025 - momentum: 0.000000 2023-10-18 22:09:09,309 epoch 3 - iter 360/723 - loss 0.19576608 - time (sec): 9.13 - samples/sec: 9638.03 - lr: 0.000025 - momentum: 0.000000 2023-10-18 22:09:11,120 epoch 3 - iter 432/723 - loss 0.19557293 - time (sec): 10.94 - samples/sec: 9644.08 - lr: 0.000025 - momentum: 0.000000 2023-10-18 22:09:12,825 epoch 3 - iter 504/723 - loss 0.19629812 - time (sec): 12.64 - samples/sec: 9660.73 - lr: 0.000024 - momentum: 0.000000 2023-10-18 22:09:14,649 epoch 3 - iter 576/723 - loss 0.19836773 - time (sec): 14.47 - samples/sec: 9671.57 - lr: 0.000024 - momentum: 0.000000 2023-10-18 22:09:16,513 epoch 3 - iter 648/723 - loss 0.19587120 - time (sec): 16.33 - samples/sec: 9673.08 - lr: 0.000024 - momentum: 0.000000 2023-10-18 22:09:18,375 epoch 3 - iter 720/723 - loss 0.19640335 - time (sec): 18.19 - samples/sec: 9661.70 - lr: 0.000023 - momentum: 0.000000 2023-10-18 22:09:18,427 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:09:18,427 EPOCH 3 done: loss 0.1963 - lr: 0.000023 2023-10-18 22:09:20,180 DEV : loss 0.23398783802986145 - f1-score (micro avg) 0.2566 2023-10-18 22:09:20,195 saving best model 2023-10-18 22:09:20,231 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:09:22,057 epoch 4 - iter 72/723 - loss 0.17513429 - time (sec): 1.82 - samples/sec: 9662.51 - lr: 0.000023 - momentum: 0.000000 2023-10-18 22:09:23,842 epoch 4 - iter 144/723 - loss 0.17318933 - time (sec): 3.61 - samples/sec: 9551.33 - lr: 0.000023 - momentum: 0.000000 2023-10-18 22:09:25,544 epoch 4 - iter 216/723 - loss 0.18058374 - time (sec): 5.31 - samples/sec: 9862.22 - lr: 
0.000022 - momentum: 0.000000 2023-10-18 22:09:27,220 epoch 4 - iter 288/723 - loss 0.17700677 - time (sec): 6.99 - samples/sec: 10223.72 - lr: 0.000022 - momentum: 0.000000 2023-10-18 22:09:28,987 epoch 4 - iter 360/723 - loss 0.17605842 - time (sec): 8.75 - samples/sec: 10119.42 - lr: 0.000022 - momentum: 0.000000 2023-10-18 22:09:30,862 epoch 4 - iter 432/723 - loss 0.18006832 - time (sec): 10.63 - samples/sec: 10053.19 - lr: 0.000021 - momentum: 0.000000 2023-10-18 22:09:32,700 epoch 4 - iter 504/723 - loss 0.17871246 - time (sec): 12.47 - samples/sec: 10026.68 - lr: 0.000021 - momentum: 0.000000 2023-10-18 22:09:34,452 epoch 4 - iter 576/723 - loss 0.17713673 - time (sec): 14.22 - samples/sec: 10011.82 - lr: 0.000021 - momentum: 0.000000 2023-10-18 22:09:36,217 epoch 4 - iter 648/723 - loss 0.17726623 - time (sec): 15.98 - samples/sec: 9931.35 - lr: 0.000020 - momentum: 0.000000 2023-10-18 22:09:37,995 epoch 4 - iter 720/723 - loss 0.18135238 - time (sec): 17.76 - samples/sec: 9890.23 - lr: 0.000020 - momentum: 0.000000 2023-10-18 22:09:38,053 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:09:38,053 EPOCH 4 done: loss 0.1813 - lr: 0.000020 2023-10-18 22:09:40,150 DEV : loss 0.21179497241973877 - f1-score (micro avg) 0.4558 2023-10-18 22:09:40,164 saving best model 2023-10-18 22:09:40,198 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:09:42,041 epoch 5 - iter 72/723 - loss 0.19369549 - time (sec): 1.84 - samples/sec: 9869.29 - lr: 0.000020 - momentum: 0.000000 2023-10-18 22:09:43,801 epoch 5 - iter 144/723 - loss 0.18409304 - time (sec): 3.60 - samples/sec: 9974.14 - lr: 0.000019 - momentum: 0.000000 2023-10-18 22:09:45,514 epoch 5 - iter 216/723 - loss 0.18004779 - time (sec): 5.32 - samples/sec: 9724.56 - lr: 0.000019 - momentum: 0.000000 2023-10-18 22:09:47,330 epoch 5 - iter 288/723 - loss 0.17894839 - time (sec): 7.13 - 
samples/sec: 9574.37 - lr: 0.000019 - momentum: 0.000000 2023-10-18 22:09:49,078 epoch 5 - iter 360/723 - loss 0.17522117 - time (sec): 8.88 - samples/sec: 9572.50 - lr: 0.000018 - momentum: 0.000000 2023-10-18 22:09:50,902 epoch 5 - iter 432/723 - loss 0.17234762 - time (sec): 10.70 - samples/sec: 9692.67 - lr: 0.000018 - momentum: 0.000000 2023-10-18 22:09:52,702 epoch 5 - iter 504/723 - loss 0.17119392 - time (sec): 12.50 - samples/sec: 9721.98 - lr: 0.000018 - momentum: 0.000000 2023-10-18 22:09:54,427 epoch 5 - iter 576/723 - loss 0.17011314 - time (sec): 14.23 - samples/sec: 9782.07 - lr: 0.000017 - momentum: 0.000000 2023-10-18 22:09:56,281 epoch 5 - iter 648/723 - loss 0.17258346 - time (sec): 16.08 - samples/sec: 9755.92 - lr: 0.000017 - momentum: 0.000000 2023-10-18 22:09:58,160 epoch 5 - iter 720/723 - loss 0.17140290 - time (sec): 17.96 - samples/sec: 9775.89 - lr: 0.000017 - momentum: 0.000000 2023-10-18 22:09:58,222 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:09:58,223 EPOCH 5 done: loss 0.1715 - lr: 0.000017 2023-10-18 22:09:59,986 DEV : loss 0.20946592092514038 - f1-score (micro avg) 0.4311 2023-10-18 22:10:00,001 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:10:01,737 epoch 6 - iter 72/723 - loss 0.15932427 - time (sec): 1.74 - samples/sec: 9819.08 - lr: 0.000016 - momentum: 0.000000 2023-10-18 22:10:03,525 epoch 6 - iter 144/723 - loss 0.16318147 - time (sec): 3.52 - samples/sec: 9748.74 - lr: 0.000016 - momentum: 0.000000 2023-10-18 22:10:05,353 epoch 6 - iter 216/723 - loss 0.17376298 - time (sec): 5.35 - samples/sec: 9741.66 - lr: 0.000016 - momentum: 0.000000 2023-10-18 22:10:07,055 epoch 6 - iter 288/723 - loss 0.17504093 - time (sec): 7.05 - samples/sec: 9699.27 - lr: 0.000015 - momentum: 0.000000 2023-10-18 22:10:08,797 epoch 6 - iter 360/723 - loss 0.16835316 - time (sec): 8.79 - samples/sec: 9839.54 - 
lr: 0.000015 - momentum: 0.000000 2023-10-18 22:10:10,614 epoch 6 - iter 432/723 - loss 0.16479689 - time (sec): 10.61 - samples/sec: 9759.78 - lr: 0.000015 - momentum: 0.000000 2023-10-18 22:10:12,459 epoch 6 - iter 504/723 - loss 0.16660146 - time (sec): 12.46 - samples/sec: 9848.56 - lr: 0.000014 - momentum: 0.000000 2023-10-18 22:10:14,270 epoch 6 - iter 576/723 - loss 0.16492361 - time (sec): 14.27 - samples/sec: 9864.81 - lr: 0.000014 - momentum: 0.000000 2023-10-18 22:10:16,409 epoch 6 - iter 648/723 - loss 0.16587764 - time (sec): 16.41 - samples/sec: 9696.86 - lr: 0.000014 - momentum: 0.000000 2023-10-18 22:10:18,149 epoch 6 - iter 720/723 - loss 0.16381985 - time (sec): 18.15 - samples/sec: 9670.91 - lr: 0.000013 - momentum: 0.000000 2023-10-18 22:10:18,222 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:10:18,222 EPOCH 6 done: loss 0.1633 - lr: 0.000013 2023-10-18 22:10:19,990 DEV : loss 0.20532798767089844 - f1-score (micro avg) 0.437 2023-10-18 22:10:20,004 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:10:21,747 epoch 7 - iter 72/723 - loss 0.15945249 - time (sec): 1.74 - samples/sec: 9697.69 - lr: 0.000013 - momentum: 0.000000 2023-10-18 22:10:23,545 epoch 7 - iter 144/723 - loss 0.15900563 - time (sec): 3.54 - samples/sec: 9956.07 - lr: 0.000013 - momentum: 0.000000 2023-10-18 22:10:25,277 epoch 7 - iter 216/723 - loss 0.15756915 - time (sec): 5.27 - samples/sec: 9983.42 - lr: 0.000012 - momentum: 0.000000 2023-10-18 22:10:27,035 epoch 7 - iter 288/723 - loss 0.16114798 - time (sec): 7.03 - samples/sec: 9914.44 - lr: 0.000012 - momentum: 0.000000 2023-10-18 22:10:28,803 epoch 7 - iter 360/723 - loss 0.15938604 - time (sec): 8.80 - samples/sec: 9841.78 - lr: 0.000012 - momentum: 0.000000 2023-10-18 22:10:30,596 epoch 7 - iter 432/723 - loss 0.15944278 - time (sec): 10.59 - samples/sec: 9940.46 - lr: 0.000011 - 
momentum: 0.000000 2023-10-18 22:10:32,312 epoch 7 - iter 504/723 - loss 0.15758147 - time (sec): 12.31 - samples/sec: 9961.91 - lr: 0.000011 - momentum: 0.000000 2023-10-18 22:10:34,029 epoch 7 - iter 576/723 - loss 0.15825257 - time (sec): 14.02 - samples/sec: 9904.92 - lr: 0.000011 - momentum: 0.000000 2023-10-18 22:10:35,849 epoch 7 - iter 648/723 - loss 0.15932809 - time (sec): 15.84 - samples/sec: 9917.65 - lr: 0.000010 - momentum: 0.000000 2023-10-18 22:10:37,706 epoch 7 - iter 720/723 - loss 0.15736768 - time (sec): 17.70 - samples/sec: 9919.43 - lr: 0.000010 - momentum: 0.000000 2023-10-18 22:10:37,771 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:10:37,772 EPOCH 7 done: loss 0.1571 - lr: 0.000010 2023-10-18 22:10:39,536 DEV : loss 0.20203644037246704 - f1-score (micro avg) 0.435 2023-10-18 22:10:39,551 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:10:41,250 epoch 8 - iter 72/723 - loss 0.14938503 - time (sec): 1.70 - samples/sec: 9489.78 - lr: 0.000010 - momentum: 0.000000 2023-10-18 22:10:43,030 epoch 8 - iter 144/723 - loss 0.17061855 - time (sec): 3.48 - samples/sec: 9818.98 - lr: 0.000009 - momentum: 0.000000 2023-10-18 22:10:44,814 epoch 8 - iter 216/723 - loss 0.15925070 - time (sec): 5.26 - samples/sec: 10070.94 - lr: 0.000009 - momentum: 0.000000 2023-10-18 22:10:46,559 epoch 8 - iter 288/723 - loss 0.15425474 - time (sec): 7.01 - samples/sec: 10052.28 - lr: 0.000009 - momentum: 0.000000 2023-10-18 22:10:48,290 epoch 8 - iter 360/723 - loss 0.15406306 - time (sec): 8.74 - samples/sec: 10093.66 - lr: 0.000008 - momentum: 0.000000 2023-10-18 22:10:50,448 epoch 8 - iter 432/723 - loss 0.14990555 - time (sec): 10.90 - samples/sec: 9787.11 - lr: 0.000008 - momentum: 0.000000 2023-10-18 22:10:52,136 epoch 8 - iter 504/723 - loss 0.14942113 - time (sec): 12.59 - samples/sec: 9789.64 - lr: 0.000008 - momentum: 0.000000 
2023-10-18 22:10:53,938 epoch 8 - iter 576/723 - loss 0.14926547 - time (sec): 14.39 - samples/sec: 9816.21 - lr: 0.000007 - momentum: 0.000000 2023-10-18 22:10:55,695 epoch 8 - iter 648/723 - loss 0.15071738 - time (sec): 16.14 - samples/sec: 9794.35 - lr: 0.000007 - momentum: 0.000000 2023-10-18 22:10:57,488 epoch 8 - iter 720/723 - loss 0.15325254 - time (sec): 17.94 - samples/sec: 9801.83 - lr: 0.000007 - momentum: 0.000000 2023-10-18 22:10:57,544 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:10:57,544 EPOCH 8 done: loss 0.1530 - lr: 0.000007 2023-10-18 22:10:59,325 DEV : loss 0.1937786489725113 - f1-score (micro avg) 0.4817 2023-10-18 22:10:59,340 saving best model 2023-10-18 22:10:59,377 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:11:01,186 epoch 9 - iter 72/723 - loss 0.13945487 - time (sec): 1.81 - samples/sec: 10785.70 - lr: 0.000006 - momentum: 0.000000 2023-10-18 22:11:02,934 epoch 9 - iter 144/723 - loss 0.13316947 - time (sec): 3.56 - samples/sec: 10339.76 - lr: 0.000006 - momentum: 0.000000 2023-10-18 22:11:04,648 epoch 9 - iter 216/723 - loss 0.13722754 - time (sec): 5.27 - samples/sec: 10152.37 - lr: 0.000006 - momentum: 0.000000 2023-10-18 22:11:06,400 epoch 9 - iter 288/723 - loss 0.14328557 - time (sec): 7.02 - samples/sec: 10075.44 - lr: 0.000005 - momentum: 0.000000 2023-10-18 22:11:08,177 epoch 9 - iter 360/723 - loss 0.14615266 - time (sec): 8.80 - samples/sec: 10049.16 - lr: 0.000005 - momentum: 0.000000 2023-10-18 22:11:09,894 epoch 9 - iter 432/723 - loss 0.14984252 - time (sec): 10.52 - samples/sec: 9960.70 - lr: 0.000005 - momentum: 0.000000 2023-10-18 22:11:11,597 epoch 9 - iter 504/723 - loss 0.15143593 - time (sec): 12.22 - samples/sec: 9905.91 - lr: 0.000004 - momentum: 0.000000 2023-10-18 22:11:13,468 epoch 9 - iter 576/723 - loss 0.15073005 - time (sec): 14.09 - samples/sec: 9992.68 - lr: 
0.000004 - momentum: 0.000000 2023-10-18 22:11:15,257 epoch 9 - iter 648/723 - loss 0.15148546 - time (sec): 15.88 - samples/sec: 9978.29 - lr: 0.000004 - momentum: 0.000000 2023-10-18 22:11:16,977 epoch 9 - iter 720/723 - loss 0.15040735 - time (sec): 17.60 - samples/sec: 9978.12 - lr: 0.000003 - momentum: 0.000000 2023-10-18 22:11:17,043 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:11:17,043 EPOCH 9 done: loss 0.1503 - lr: 0.000003 2023-10-18 22:11:18,810 DEV : loss 0.1968904435634613 - f1-score (micro avg) 0.4678 2023-10-18 22:11:18,825 ---------------------------------------------------------------------------------------------------- 2023-10-18 22:11:20,567 epoch 10 - iter 72/723 - loss 0.13221108 - time (sec): 1.74 - samples/sec: 9813.73 - lr: 0.000003 - momentum: 0.000000 2023-10-18 22:11:22,334 epoch 10 - iter 144/723 - loss 0.15218809 - time (sec): 3.51 - samples/sec: 9666.30 - lr: 0.000003 - momentum: 0.000000 2023-10-18 22:11:24,466 epoch 10 - iter 216/723 - loss 0.14769919 - time (sec): 5.64 - samples/sec: 9253.33 - lr: 0.000002 - momentum: 0.000000 2023-10-18 22:11:26,295 epoch 10 - iter 288/723 - loss 0.15037349 - time (sec): 7.47 - samples/sec: 9296.52 - lr: 0.000002 - momentum: 0.000000 2023-10-18 22:11:28,116 epoch 10 - iter 360/723 - loss 0.15714138 - time (sec): 9.29 - samples/sec: 9508.97 - lr: 0.000002 - momentum: 0.000000 2023-10-18 22:11:29,886 epoch 10 - iter 432/723 - loss 0.15652764 - time (sec): 11.06 - samples/sec: 9531.39 - lr: 0.000001 - momentum: 0.000000 2023-10-18 22:11:31,644 epoch 10 - iter 504/723 - loss 0.15303564 - time (sec): 12.82 - samples/sec: 9644.01 - lr: 0.000001 - momentum: 0.000000 2023-10-18 22:11:33,414 epoch 10 - iter 576/723 - loss 0.15136324 - time (sec): 14.59 - samples/sec: 9663.06 - lr: 0.000001 - momentum: 0.000000 2023-10-18 22:11:35,141 epoch 10 - iter 648/723 - loss 0.14914005 - time (sec): 16.32 - samples/sec: 9668.13 - lr: 0.000000 - 
momentum: 0.000000
2023-10-18 22:11:36,887 epoch 10 - iter 720/723 - loss 0.15021989 - time (sec): 18.06 - samples/sec: 9722.40 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:11:36,951 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:36,951 EPOCH 10 done: loss 0.1504 - lr: 0.000000
2023-10-18 22:11:38,720 DEV : loss 0.19765476882457733 - f1-score (micro avg) 0.4656
2023-10-18 22:11:38,766 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:38,766 Loading model from best epoch ...
2023-10-18 22:11:38,851 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:11:40,203
Results:
- F-score (micro) 0.4758
- F-score (macro) 0.3261
- Accuracy 0.3258

By class:
              precision    recall  f1-score   support

         LOC     0.5020    0.5611    0.5299       458
         PER     0.6822    0.3340    0.4485       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.5588    0.4143    0.4758      1009
   macro avg     0.3947    0.2984    0.3261      1009
weighted avg     0.5537    0.4143    0.4548      1009

2023-10-18 22:11:40,203 ----------------------------------------------------------------------------------------------------
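Note on the final results: the averaged F-scores can be cross-checked from the per-class rows of the table. The following is a minimal sketch, not part of the training run. The variable and function names are hypothetical, and the input numbers are the rounded values copied from the log, so agreement is only expected to four decimals.

```python
# Rounded per-class values copied from the "By class" table in the log.
per_class = {
    "LOC": {"f1": 0.5299, "support": 458},
    "PER": {"f1": 0.4485, "support": 482},
    "ORG": {"f1": 0.0000, "support": 69},
}

# Micro-averaged precision and recall, also copied from the log.
micro_precision = 0.5588
micro_recall = 0.4143


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = f1_score(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 scores weighted by class support.
total_support = sum(c["support"] for c in per_class.values())
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```

Because ORG scores 0.0 on all three metrics, it pulls the macro average (0.3261) well below the micro average (0.4758), while its small support (69 of 1009 spans) has much less effect on the weighted average.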