2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,183 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,183 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,183 Train:  14465 sentences
2023-10-19 00:34:30,183         (train_with_dev=False, train_with_test=False)
2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,183 Training Params:
2023-10-19 00:34:30,183  - learning_rate: "3e-05"
2023-10-19 00:34:30,183  - mini_batch_size: "4"
2023-10-19 00:34:30,183  - max_epochs: "10"
2023-10-19 00:34:30,184  - shuffle: "True"
2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,184 Plugins:
2023-10-19 00:34:30,184  - TensorboardLogger
2023-10-19 00:34:30,184  - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,184 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 00:34:30,184  - metric: "('micro avg', 'f1-score')"
2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,184 Computation:
2023-10-19 00:34:30,184  - compute on device: cuda:0
2023-10-19 00:34:30,184  - embedding storage: none
2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,184 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
2023-10-19 00:34:30,184 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 00:34:35,918 epoch 1 - iter 361/3617 - loss 2.91805740 - time (sec): 5.73 - samples/sec: 6815.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:34:41,682 epoch 1 - iter 722/3617 - loss 2.25303714 - time (sec): 11.50 - samples/sec: 6622.87 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:34:47,406 epoch 1 - iter 1083/3617 - loss 1.66388576 - time (sec): 17.22 - samples/sec: 6640.71 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:34:53,049 epoch 1 - iter 1444/3617 - loss 1.33761956 - time (sec): 22.86 - samples/sec: 6653.83 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:34:58,249 epoch 1 - iter 1805/3617 - loss 1.12626075 - time (sec): 28.06 - samples/sec: 6840.05 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:35:03,842 epoch 1 - iter 2166/3617 - loss 0.98677877 - time (sec): 33.66 - samples/sec: 6847.55 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:35:09,452 epoch 1 - iter 2527/3617 - loss 0.89109170 - time (sec): 39.27 - samples/sec: 6783.21 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:35:14,571 epoch 1 - iter 2888/3617 - loss 0.81168511 - time (sec): 44.39 - samples/sec: 6859.56 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:35:19,754 epoch 1 - iter 3249/3617 - loss 0.74884197 - time (sec): 49.57 - samples/sec: 6883.09 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:35:25,488 epoch 1 - iter 3610/3617 - loss 0.69704141 - time (sec): 55.30 - samples/sec: 6861.62 - lr: 0.000030 - momentum: 0.000000
2023-10-19 00:35:25,588 ----------------------------------------------------------------------------------------------------
2023-10-19 00:35:25,588 EPOCH 1 done: loss 0.6964 - lr: 0.000030
2023-10-19 00:35:27,894 DEV : loss 0.18414448201656342 - f1-score (micro avg)  0.1723
2023-10-19 00:35:27,923 saving best model
2023-10-19 00:35:27,957 ----------------------------------------------------------------------------------------------------
2023-10-19 00:35:33,423 epoch 2 - iter 361/3617 - loss 0.20767271 - time (sec): 5.46 - samples/sec: 6898.30 - lr: 0.000030 - momentum: 0.000000
2023-10-19 00:35:39,165 epoch 2 - iter 722/3617 - loss 0.20857724 - time (sec): 11.21 - samples/sec: 6785.54 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:35:44,839 epoch 2 - iter 1083/3617 - loss 0.20034477 - time (sec): 16.88 - samples/sec: 6781.60 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:35:50,478 epoch 2 - iter 1444/3617 - loss 0.19662744 - time (sec): 22.52 - samples/sec: 6670.61 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:35:56,146 epoch 2 - iter 1805/3617 - loss 0.19587540 - time (sec): 28.19 - samples/sec: 6615.87 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:36:01,731 epoch 2 - iter 2166/3617 - loss 0.19490222 - time (sec): 33.77 - samples/sec: 6691.46 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:36:07,501 epoch 2 - iter 2527/3617 - loss 0.19331286 - time (sec): 39.54 - samples/sec: 6672.70 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:36:13,181 epoch 2 - iter 2888/3617 - loss 0.19208740 - time (sec): 45.22 - samples/sec: 6649.22 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:36:18,875 epoch 2 - iter 3249/3617 - loss 0.18952749 - time (sec): 50.92 - samples/sec: 6673.34 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:36:24,598 epoch 2 - iter 3610/3617 - loss 0.18852611 - time (sec): 56.64 - samples/sec: 6696.51 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:36:24,705 ----------------------------------------------------------------------------------------------------
2023-10-19 00:36:24,706 EPOCH 2 done: loss 0.1886 - lr: 0.000027
2023-10-19 00:36:28,635 DEV : loss 0.1664983630180359 - f1-score (micro avg)  0.381
2023-10-19 00:36:28,663 saving best model
2023-10-19 00:36:28,696 ----------------------------------------------------------------------------------------------------
2023-10-19 00:36:34,434 epoch 3 - iter 361/3617 - loss 0.15091050 - time (sec): 5.74 - samples/sec: 6577.68 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:36:40,116 epoch 3 - iter 722/3617 - loss 0.15277438 - time (sec): 11.42 - samples/sec: 6633.98 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:36:45,517 epoch 3 - iter 1083/3617 - loss 0.15890046 - time (sec): 16.82 - samples/sec: 6773.25 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:36:51,398 epoch 3 - iter 1444/3617 - loss 0.16323482 - time (sec): 22.70 - samples/sec: 6688.62 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:36:57,123 epoch 3 - iter 1805/3617 - loss 0.15977729 - time (sec): 28.43 - samples/sec: 6707.84 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:37:02,800 epoch 3 - iter 2166/3617 - loss 0.16133560 - time (sec): 34.10 - samples/sec: 6683.91 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:37:08,622 epoch 3 - iter 2527/3617 - loss 0.16171229 - time (sec): 39.92 - samples/sec: 6680.19 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:37:14,112 epoch 3 - iter 2888/3617 - loss 0.16077563 - time (sec): 45.41 - samples/sec: 6699.91 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:37:19,821 epoch 3 - iter 3249/3617 - loss 0.15940779 - time (sec): 51.12 - samples/sec: 6685.32 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:37:25,531 epoch 3 - iter 3610/3617 - loss 0.15948787 - time (sec): 56.83 - samples/sec: 6671.03 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:37:25,641 ----------------------------------------------------------------------------------------------------
2023-10-19 00:37:25,641 EPOCH 3 done: loss 0.1594 - lr: 0.000023
2023-10-19 00:37:28,812 DEV : loss 0.16962358355522156 - f1-score (micro avg)  0.3721
2023-10-19 00:37:28,839 ----------------------------------------------------------------------------------------------------
2023-10-19 00:37:34,732 epoch 4 - iter 361/3617 - loss 0.14232155 - time (sec): 5.89 - samples/sec: 6302.73 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:37:40,531 epoch 4 - iter 722/3617 - loss 0.14496426 - time (sec): 11.69 - samples/sec: 6505.66 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:37:46,264 epoch 4 - iter 1083/3617 - loss 0.15214303 - time (sec): 17.42 - samples/sec: 6523.89 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:37:52,081 epoch 4 - iter 1444/3617 - loss 0.14974326 - time (sec): 23.24 - samples/sec: 6539.88 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:37:57,503 epoch 4 - iter 1805/3617 - loss 0.15035754 - time (sec): 28.66 - samples/sec: 6641.02 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:38:02,896 epoch 4 - iter 2166/3617 - loss 0.15039641 - time (sec): 34.06 - samples/sec: 6679.55 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:38:08,575 epoch 4 - iter 2527/3617 - loss 0.14933721 - time (sec): 39.73 - samples/sec: 6636.79 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:38:14,295 epoch 4 - iter 2888/3617 - loss 0.14723742 - time (sec): 45.46 - samples/sec: 6659.06 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:38:19,742 epoch 4 - iter 3249/3617 - loss 0.14686544 - time (sec): 50.90 - samples/sec: 6721.83 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:38:25,385 epoch 4 - iter 3610/3617 - loss 0.14710156 - time (sec): 56.55 - samples/sec: 6702.85 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:38:25,499 ----------------------------------------------------------------------------------------------------
2023-10-19 00:38:25,499 EPOCH 4 done: loss 0.1470 - lr: 0.000020
2023-10-19 00:38:29,396 DEV : loss 0.16811420023441315 - f1-score (micro avg)  0.4632
2023-10-19 00:38:29,424 saving best model
2023-10-19 00:38:29,457 ----------------------------------------------------------------------------------------------------
2023-10-19 00:38:35,207 epoch 5 - iter 361/3617 - loss 0.14721800 - time (sec): 5.75 - samples/sec: 6158.21 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:38:41,041 epoch 5 - iter 722/3617 - loss 0.14279723 - time (sec): 11.58 - samples/sec: 6442.74 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:38:46,517 epoch 5 - iter 1083/3617 - loss 0.13300252 - time (sec): 17.06 - samples/sec: 6532.03 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:38:52,413 epoch 5 - iter 1444/3617 - loss 0.13201950 - time (sec): 22.95 - samples/sec: 6528.86 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:38:58,140 epoch 5 - iter 1805/3617 - loss 0.13224927 - time (sec): 28.68 - samples/sec: 6504.15 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:39:03,858 epoch 5 - iter 2166/3617 - loss 0.13286286 - time (sec): 34.40 - samples/sec: 6560.32 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:39:09,676 epoch 5 - iter 2527/3617 - loss 0.13266831 - time (sec): 40.22 - samples/sec: 6601.60 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:39:15,284 epoch 5 - iter 2888/3617 - loss 0.13352438 - time (sec): 45.83 - samples/sec: 6614.43 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:39:20,653 epoch 5 - iter 3249/3617 - loss 0.13325418 - time (sec): 51.20 - samples/sec: 6681.12 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:39:26,465 epoch 5 - iter 3610/3617 - loss 0.13392825 - time (sec): 57.01 - samples/sec: 6652.32 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:39:26,599 ----------------------------------------------------------------------------------------------------
2023-10-19 00:39:26,600 EPOCH 5 done: loss 0.1339 - lr: 0.000017
2023-10-19 00:39:29,796 DEV : loss 0.17465609312057495 - f1-score (micro avg)  0.4657
2023-10-19 00:39:29,825 saving best model
2023-10-19 00:39:29,859 ----------------------------------------------------------------------------------------------------
2023-10-19 00:39:35,592 epoch 6 - iter 361/3617 - loss 0.12127706 - time (sec): 5.73 - samples/sec: 6721.20 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:39:41,260 epoch 6 - iter 722/3617 - loss 0.11778469 - time (sec): 11.40 - samples/sec: 6597.92 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:39:47,020 epoch 6 - iter 1083/3617 - loss 0.11656336 - time (sec): 17.16 - samples/sec: 6622.72 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:39:52,551 epoch 6 - iter 1444/3617 - loss 0.12094593 - time (sec): 22.69 - samples/sec: 6627.71 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:39:58,056 epoch 6 - iter 1805/3617 - loss 0.12103503 - time (sec): 28.20 - samples/sec: 6695.63 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:40:03,764 epoch 6 - iter 2166/3617 - loss 0.12382929 - time (sec): 33.90 - samples/sec: 6676.97 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:40:09,505 epoch 6 - iter 2527/3617 - loss 0.12571836 - time (sec): 39.65 - samples/sec: 6688.00 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:40:15,239 epoch 6 - iter 2888/3617 - loss 0.12660519 - time (sec): 45.38 - samples/sec: 6656.01 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:40:21,099 epoch 6 - iter 3249/3617 - loss 0.12660343 - time (sec): 51.24 - samples/sec: 6643.48 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:40:26,893 epoch 6 - iter 3610/3617 - loss 0.12570085 - time (sec): 57.03 - samples/sec: 6643.19 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:40:27,007 ----------------------------------------------------------------------------------------------------
2023-10-19 00:40:27,008 EPOCH 6 done: loss 0.1257 - lr: 0.000013
2023-10-19 00:40:30,241 DEV : loss 0.18021412193775177 - f1-score (micro avg)  0.4823
2023-10-19 00:40:30,269 saving best model
2023-10-19 00:40:30,301 ----------------------------------------------------------------------------------------------------
2023-10-19 00:40:36,131 epoch 7 - iter 361/3617 - loss 0.12149979 - time (sec): 5.83 - samples/sec: 6622.41 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:40:41,749 epoch 7 - iter 722/3617 - loss 0.11531578 - time (sec): 11.45 - samples/sec: 6667.10 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:40:47,463 epoch 7 - iter 1083/3617 - loss 0.11569164 - time (sec): 17.16 - samples/sec: 6666.03 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:40:52,785 epoch 7 - iter 1444/3617 - loss 0.11579758 - time (sec): 22.48 - samples/sec: 6774.95 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:40:58,121 epoch 7 - iter 1805/3617 - loss 0.11667595 - time (sec): 27.82 - samples/sec: 6819.17 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:41:03,894 epoch 7 - iter 2166/3617 - loss 0.12102352 - time (sec): 33.59 - samples/sec: 6764.86 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:41:09,701 epoch 7 - iter 2527/3617 - loss 0.12119387 - time (sec): 39.40 - samples/sec: 6738.92 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:41:15,459 epoch 7 - iter 2888/3617 - loss 0.12026071 - time (sec): 45.16 - samples/sec: 6692.51 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:41:21,204 epoch 7 - iter 3249/3617 - loss 0.11944450 - time (sec): 50.90 - samples/sec: 6680.71 - lr: 0.000010 - momentum: 0.000000
2023-10-19 00:41:27,019 epoch 7 - iter 3610/3617 - loss 0.11841485 - time (sec): 56.72 - samples/sec: 6679.74 - lr: 0.000010 - momentum: 0.000000
2023-10-19 00:41:27,134 ----------------------------------------------------------------------------------------------------
2023-10-19 00:41:27,135 EPOCH 7 done: loss 0.1186 - lr: 0.000010
2023-10-19 00:41:31,039 DEV : loss 0.1851833611726761 - f1-score (micro avg)  0.4913
2023-10-19 00:41:31,067 saving best model
2023-10-19 00:41:31,106 ----------------------------------------------------------------------------------------------------
2023-10-19 00:41:36,914 epoch 8 - iter 361/3617 - loss 0.11416187 - time (sec): 5.81 - samples/sec: 6707.95 - lr: 0.000010 - momentum: 0.000000
2023-10-19 00:41:42,734 epoch 8 - iter 722/3617 - loss 0.10800957 - time (sec): 11.63 - samples/sec: 6724.86 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:41:48,476 epoch 8 - iter 1083/3617 - loss 0.11095705 - time (sec): 17.37 - samples/sec: 6711.31 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:41:54,248 epoch 8 - iter 1444/3617 - loss 0.10817209 - time (sec): 23.14 - samples/sec: 6666.37 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:42:00,062 epoch 8 - iter 1805/3617 - loss 0.11279132 - time (sec): 28.96 - samples/sec: 6669.85 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:42:05,767 epoch 8 - iter 2166/3617 - loss 0.11285781 - time (sec): 34.66 - samples/sec: 6670.09 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:42:11,517 epoch 8 - iter 2527/3617 - loss 0.11301960 - time (sec): 40.41 - samples/sec: 6634.57 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:42:17,117 epoch 8 - iter 2888/3617 - loss 0.11249704 - time (sec): 46.01 - samples/sec: 6632.10 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:42:22,519 epoch 8 - iter 3249/3617 - loss 0.11359347 - time (sec): 51.41 - samples/sec: 6671.77 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:42:28,116 epoch 8 - iter 3610/3617 - loss 0.11389018 - time (sec): 57.01 - samples/sec: 6649.67 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:42:28,224 ----------------------------------------------------------------------------------------------------
2023-10-19 00:42:28,224 EPOCH 8 done: loss 0.1137 - lr: 0.000007
2023-10-19 00:42:31,463 DEV : loss 0.19062528014183044 - f1-score (micro avg)  0.4952
2023-10-19 00:42:31,491 saving best model
2023-10-19 00:42:31,528 ----------------------------------------------------------------------------------------------------
2023-10-19 00:42:37,260 epoch 9 - iter 361/3617 - loss 0.10534011 - time (sec): 5.73 - samples/sec: 6685.34 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:42:42,983 epoch 9 - iter 722/3617 - loss 0.11077406 - time (sec): 11.45 - samples/sec: 6598.82 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:42:48,253 epoch 9 - iter 1083/3617 - loss 0.10676885 - time (sec): 16.72 - samples/sec: 6791.81 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:42:54,088 epoch 9 - iter 1444/3617 - loss 0.10622678 - time (sec): 22.56 - samples/sec: 6690.49 - lr: 0.000005 - momentum: 0.000000
2023-10-19 00:42:59,808 epoch 9 - iter 1805/3617 - loss 0.10776910 - time (sec): 28.28 - samples/sec: 6707.15 - lr: 0.000005 - momentum: 0.000000
2023-10-19 00:43:05,595 epoch 9 - iter 2166/3617 - loss 0.11026578 - time (sec): 34.07 - samples/sec: 6654.23 - lr: 0.000005 - momentum: 0.000000
2023-10-19 00:43:11,285 epoch 9 - iter 2527/3617 - loss 0.11034727 - time (sec): 39.76 - samples/sec: 6657.01 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:43:17,141 epoch 9 - iter 2888/3617 - loss 0.11070044 - time (sec): 45.61 - samples/sec: 6632.25 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:43:22,779 epoch 9 - iter 3249/3617 - loss 0.10960559 - time (sec): 51.25 - samples/sec: 6625.34 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:43:28,625 epoch 9 - iter 3610/3617 - loss 0.11083827 - time (sec): 57.10 - samples/sec: 6642.00 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:43:28,729 ----------------------------------------------------------------------------------------------------
2023-10-19 00:43:28,729 EPOCH 9 done: loss 0.1109 - lr: 0.000003
2023-10-19 00:43:31,979 DEV : loss 0.19269128143787384 - f1-score (micro avg)  0.5016
2023-10-19 00:43:32,008 saving best model
2023-10-19 00:43:32,041 ----------------------------------------------------------------------------------------------------
2023-10-19 00:43:38,495 epoch 10 - iter 361/3617 - loss 0.10547873 - time (sec): 6.45 - samples/sec: 5966.48 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:43:44,265 epoch 10 - iter 722/3617 - loss 0.10323462 - time (sec): 12.22 - samples/sec: 6292.25 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:43:49,972 epoch 10 - iter 1083/3617 - loss 0.11136052 - time (sec): 17.93 - samples/sec: 6278.56 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:43:55,711 epoch 10 - iter 1444/3617 - loss 0.10802696 - time (sec): 23.67 - samples/sec: 6385.53 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:44:01,531 epoch 10 - iter 1805/3617 - loss 0.10816177 - time (sec): 29.49 - samples/sec: 6442.13 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:44:07,303 epoch 10 - iter 2166/3617 - loss 0.10603407 - time (sec): 35.26 - samples/sec: 6485.02 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:44:13,016 epoch 10 - iter 2527/3617 - loss 0.10623744 - time (sec): 40.97 - samples/sec: 6481.74 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:44:18,445 epoch 10 - iter 2888/3617 - loss 0.10604407 - time (sec): 46.40 - samples/sec: 6574.08 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:44:24,218 epoch 10 - iter 3249/3617 - loss 0.10773950 - time (sec): 52.18 - samples/sec: 6581.15 - lr: 0.000000 - momentum: 0.000000
2023-10-19 00:44:29,920 epoch 10 - iter 3610/3617 - loss 0.10913884 - time (sec): 57.88 - samples/sec: 6551.05 - lr: 0.000000 - momentum: 0.000000
2023-10-19 00:44:30,022 ----------------------------------------------------------------------------------------------------
2023-10-19 00:44:30,023 EPOCH 10 done: loss 0.1090 - lr: 0.000000
2023-10-19 00:44:33,292 DEV : loss 0.1960730254650116 - f1-score (micro avg)  0.5019
2023-10-19 00:44:33,321 saving best model
2023-10-19 00:44:33,388 ----------------------------------------------------------------------------------------------------
2023-10-19 00:44:33,389 Loading model from best epoch ...
2023-10-19 00:44:33,469 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-19 00:44:37,684 Results:
- F-score (micro) 0.5164
- F-score (macro) 0.3449
- Accuracy 0.36

By class:
              precision    recall  f1-score   support

         loc     0.5194    0.6785    0.5884       591
        pers     0.3952    0.5126    0.4463       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.4729    0.5686    0.5164      1027
   macro avg     0.3049    0.3970    0.3449      1027
weighted avg     0.4363    0.5686    0.4938      1027

2023-10-19 00:44:37,685 ----------------------------------------------------------------------------------------------------
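The module tree printed at the top of the log fully determines the model's size. As a sanity check, the parameter count can be re-derived from the printed layer shapes (hidden size 128, 2 layers, intermediate size 512, vocab 32001, 13 tags). The breakdown below is my own reading of the tree, not a figure reported in the log:

```python
# Re-derive the parameter count of the tagger from the printed module tree.

def linear(n_in, n_out):
    """Linear(in_features=n_in, out_features=n_out, bias=True): weight + bias."""
    return n_in * n_out + n_out

def layer_norm(n):
    """LayerNorm over n features: gamma + beta."""
    return 2 * n

embeddings = (
    32001 * 128          # word_embeddings
    + 512 * 128          # position_embeddings
    + 2 * 128            # token_type_embeddings
    + layer_norm(128)
)

per_bert_layer = (
    3 * linear(128, 128)   # query / key / value
    + linear(128, 128)     # attention output dense
    + layer_norm(128)      # attention output LayerNorm
    + linear(128, 512)     # intermediate dense
    + linear(512, 128)     # output dense
    + layer_norm(128)      # output LayerNorm
)

pooler = linear(128, 128)
tag_head = linear(128, 13)   # the tagger's final (linear) projection

total = embeddings + 2 * per_bert_layer + pooler + tag_head
print(total)  # 4576909 — roughly 4.6M parameters, a genuinely tiny BERT
```

Dropout, LockedDropout, and the activations contribute no parameters, so this count covers everything in the printed tree.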
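The lr column in the log traces the LinearScheduler plugin with warmup_fraction '0.1': the rate climbs linearly to the peak of 3e-05 over the first 10% of steps (exactly epoch 1 of 10) and then decays linearly to zero by the last step. A minimal sketch of that schedule (my own reimplementation of the shape visible in the log, not Flair's code):

```python
def linear_warmup_decay(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to 0 by the final step (shape seen in the lr column)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 3617 * 10   # 3617 mini-batches per epoch x 10 epochs, as in this run

# Warmup ends after epoch 1 (10% of training), where the log shows lr: 0.000030;
# afterwards the rate drops by roughly 3e-06 per epoch until it reaches 0.
```

With mini_batch_size 4 this gives one scheduler step per batch, which matches the roughly constant lr deltas between the logged iteration checkpoints.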
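The micro and macro averages in the final table can be reproduced from per-class true-positive/false-positive/false-negative counts. The counts below are back-solved from the printed precision/recall/support figures, so they are an approximation for illustration rather than numbers taken from the actual predictions:

```python
# (tp, fp, fn) per class, back-solved from the "By class" table:
# tp ~ recall * support, fp ~ tp / precision - tp, fn = support - tp.
counts = {
    "loc":  (401, 371, 190),   # support 591, precision 0.5194, recall 0.6785
    "pers": (183, 280, 174),   # support 357, precision 0.3952, recall 0.5126
    "org":  (0,     0,  79),   # support 79; the model never predicts org
}

def f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Micro average: pool all counts into one confusion, then compute a single F1.
tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())
micro_f1 = f1(tp, fp, fn)                                      # ~0.5164

# Macro average: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)  # ~0.3449
```

This also makes the gap between the two averages easy to read: the org class, with zero F1, pulls the macro score far below the micro score, which is dominated by the larger loc and pers classes.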