2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 Train: 966 sentences
2023-10-18 16:48:35,521 (train_with_dev=False, train_with_test=False)
2023-10-18 16:48:35,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,521 Training Params:
2023-10-18 16:48:35,521 - learning_rate: "3e-05"
2023-10-18 16:48:35,522 - mini_batch_size: "8"
2023-10-18 16:48:35,522 - max_epochs: "10"
2023-10-18 16:48:35,522 - shuffle: "True"
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Plugins:
2023-10-18 16:48:35,522 - TensorboardLogger
2023-10-18 16:48:35,522 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:48:35,522 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Computation:
2023-10-18 16:48:35,522 - compute on device: cuda:0
2023-10-18 16:48:35,522 - embedding storage: none
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:35,522 Logging anything other than scalars to TensorBoard is
currently not supported.
2023-10-18 16:48:35,781 epoch 1 - iter 12/121 - loss 3.73135666 - time (sec): 0.26 - samples/sec: 8989.30 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:48:36,037 epoch 1 - iter 24/121 - loss 3.70230323 - time (sec): 0.51 - samples/sec: 8365.49 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:48:36,312 epoch 1 - iter 36/121 - loss 3.64697934 - time (sec): 0.79 - samples/sec: 9056.15 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:36,585 epoch 1 - iter 48/121 - loss 3.64897593 - time (sec): 1.06 - samples/sec: 8930.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:36,853 epoch 1 - iter 60/121 - loss 3.61059523 - time (sec): 1.33 - samples/sec: 8967.33 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:37,110 epoch 1 - iter 72/121 - loss 3.52905406 - time (sec): 1.59 - samples/sec: 8849.71 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:37,383 epoch 1 - iter 84/121 - loss 3.40711096 - time (sec): 1.86 - samples/sec: 9028.03 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:37,654 epoch 1 - iter 96/121 - loss 3.27300372 - time (sec): 2.13 - samples/sec: 9245.22 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:37,931 epoch 1 - iter 108/121 - loss 3.11948802 - time (sec): 2.41 - samples/sec: 9258.34 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:38,186 epoch 1 - iter 120/121 - loss 2.98945042 - time (sec): 2.66 - samples/sec: 9256.13 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:48:38,202 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:38,202 EPOCH 1 done: loss 2.9857 - lr: 0.000030
2023-10-18 16:48:38,475 DEV : loss 0.888546884059906 - f1-score (micro avg) 0.0
2023-10-18 16:48:38,480 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:38,743 epoch 2 - iter 12/121 - loss 1.34267957 - time (sec): 0.26 - samples/sec: 8726.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:48:39,014 epoch 2 - iter 24/121 - loss 1.20006107 - time (sec): 0.53 - samples/sec: 8960.14 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:39,281 epoch 2 - iter 36/121 - loss 1.07692122 - time (sec): 0.80 - samples/sec: 9001.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:39,559 epoch 2 - iter 48/121 - loss 1.02313150 - time (sec): 1.08 - samples/sec: 9151.99 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:48:39,830 epoch 2 - iter 60/121 - loss 0.98698774 - time (sec): 1.35 - samples/sec: 8953.45 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:40,110 epoch 2 - iter 72/121 - loss 0.95003231 - time (sec): 1.63 - samples/sec: 9022.75 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:40,342 epoch 2 - iter 84/121 - loss 0.89889328 - time (sec): 1.86 - samples/sec: 9294.91 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:48:40,744 epoch 2 - iter 96/121 - loss 0.86386461 - time (sec): 2.26 - samples/sec: 8747.32 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:41,018 epoch 2 - iter 108/121 - loss 0.86218533 - time (sec): 2.54 - samples/sec: 8706.77 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:41,283 epoch 2 - iter 120/121 - loss 0.84499695 - time (sec): 2.80 - samples/sec: 8744.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:48:41,305 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:41,305 EPOCH 2 done: loss 0.8462 - lr: 0.000027
2023-10-18 16:48:41,719 DEV : loss 0.6577260494232178 - f1-score (micro avg) 0.0
2023-10-18 16:48:41,724 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:42,009 epoch 3 - iter 12/121 - loss 0.72147088 - time (sec): 0.28 - samples/sec: 8779.43 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:42,279 epoch 3 - iter 24/121 - loss 0.76590949 - time (sec): 0.56 - samples/sec: 8647.80 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:42,457 epoch 3 - iter 36/121 - loss 0.74445104 - time (sec): 0.73 - samples/sec: 9633.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:48:42,640 epoch 3 - iter 48/121 - loss 0.73264098 - time (sec): 0.92 - samples/sec: 10434.44 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:42,827 epoch 3 - iter 60/121 - loss 0.71998033 - time (sec): 1.10 - samples/sec: 10910.00 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:43,005 epoch 3 - iter 72/121 - loss 0.70956971 - time (sec): 1.28 - samples/sec: 11157.50 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:48:43,192 epoch 3 - iter 84/121 - loss 0.69546580 - time (sec): 1.47 - samples/sec: 11494.33 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:43,392 epoch 3 - iter 96/121 - loss 0.68506052 - time (sec): 1.67 - samples/sec: 11815.01 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:43,616 epoch 3 - iter 108/121 - loss 0.67453678 - time (sec): 1.89 - samples/sec: 11736.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:48:43,840 epoch 3 - iter 120/121 - loss 0.67556474 - time (sec): 2.12 - samples/sec: 11614.32 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:43,855 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:43,855 EPOCH 3 done: loss 0.6758 - lr: 0.000023
2023-10-18 16:48:44,269 DEV : loss 0.5368312001228333 - f1-score (micro avg) 0.0
2023-10-18 16:48:44,273 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:44,540 epoch 4 - iter 12/121 - loss 0.67001378 - time (sec): 0.27 - samples/sec: 7861.72 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:44,819 epoch 4 - iter 24/121 - loss 0.64732484 - time (sec): 0.55 - samples/sec: 8204.05 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:48:45,098 epoch 4 - iter 36/121 - loss 0.61911750 - time (sec): 0.82 - samples/sec: 8823.53 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:45,356 epoch 4 - iter 48/121 - loss 0.61389990 - time (sec): 1.08 - samples/sec: 8897.35 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:45,623 epoch 4 - iter 60/121 - loss 0.61282173 - time (sec): 1.35 - samples/sec: 8954.13 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:48:45,889 epoch 4 - iter 72/121 - loss 0.59776583 - time (sec): 1.61 - samples/sec: 9038.88 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:46,159 epoch 4 - iter 84/121 - loss 0.59023191 - time (sec): 1.89 - samples/sec: 9178.13 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:46,434 epoch 4 - iter 96/121 - loss 0.58243842 - time (sec): 2.16 - samples/sec: 9158.22 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:48:46,721 epoch 4 - iter 108/121 - loss 0.58528172 - time (sec): 2.45 - samples/sec: 9086.95 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:46,991 epoch 4 - iter 120/121 - loss 0.57638690 - time (sec): 2.72 - samples/sec: 9077.62 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:47,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:47,008 EPOCH 4 done: loss 0.5767 - lr: 0.000020
2023-10-18 16:48:47,426 DEV : loss 0.43159720301628113 - f1-score (micro avg) 0.0952
2023-10-18 16:48:47,430 saving best model
2023-10-18 16:48:47,463 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:47,734 epoch 5 - iter 12/121 - loss 0.54535688 - time (sec): 0.27 - samples/sec: 9126.35 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:48:48,003 epoch 5 - iter 24/121 - loss 0.53077766 - time (sec): 0.54 - samples/sec: 9335.20 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:48,272 epoch 5 - iter 36/121 - loss 0.51623647 - time (sec): 0.81 - samples/sec: 9313.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:48,551 epoch 5 - iter 48/121 - loss 0.52815520 - time (sec): 1.09 - samples/sec: 9360.77 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:48:48,739 epoch 5 - iter 60/121 - loss 0.52803554 - time (sec): 1.27 - samples/sec: 9878.01 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:48,937 epoch 5 - iter 72/121 - loss 0.52301950 - time (sec): 1.47 - samples/sec: 10188.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:49,159 epoch 5 - iter 84/121 - loss 0.51809378 - time (sec): 1.69 - samples/sec: 10286.49 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:48:49,354 epoch 5 - iter 96/121 - loss 0.51630126 - time (sec): 1.89 - samples/sec: 10552.00 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:49,577 epoch 5 - iter 108/121 - loss 0.50390050 - time (sec): 2.11 - samples/sec: 10559.03 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:49,829 epoch 5 - iter 120/121 - loss 0.50184025 - time (sec): 2.36 - samples/sec: 10415.44 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:48:49,847 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:49,847 EPOCH 5 done: loss 0.5019 - lr: 0.000017
2023-10-18 16:48:50,277 DEV : loss 0.38968807458877563 - f1-score (micro avg) 0.236
2023-10-18 16:48:50,283 saving best model
2023-10-18 16:48:50,320 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:50,611 epoch 6 - iter 12/121 - loss 0.48606714 - time (sec): 0.29 - samples/sec: 8841.20 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:50,929 epoch 6 - iter 24/121 - loss 0.47768069 - time (sec): 0.61 - samples/sec: 8232.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:51,229 epoch 6 - iter 36/121 - loss 0.49199054 - time (sec): 0.91 - samples/sec: 8306.29 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:48:51,507 epoch 6 - iter 48/121 - loss 0.45279491 - time (sec): 1.19 - samples/sec: 8222.37 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:51,781 epoch 6 - iter 60/121 - loss 0.45754327 - time (sec): 1.46 - samples/sec: 8416.10 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:52,049 epoch 6 - iter 72/121 - loss 0.46451499 - time (sec): 1.73 - samples/sec: 8553.99 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:48:52,313 epoch 6 - iter 84/121 - loss 0.46666705 - time (sec): 1.99 - samples/sec: 8596.65 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:52,598 epoch 6 - iter 96/121 - loss 0.46486521 - time (sec): 2.28 - samples/sec: 8641.31 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:52,856 epoch 6 - iter 108/121 - loss 0.46672906 - time (sec): 2.54 - samples/sec: 8654.28 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:48:53,087 epoch 6 - iter 120/121 - loss 0.46874155 - time (sec): 2.77 - samples/sec: 8879.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:53,104 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:53,104 EPOCH 6 done: loss 0.4705 - lr: 0.000013
2023-10-18 16:48:53,519 DEV : loss 0.36645838618278503 - f1-score (micro avg) 0.4151
2023-10-18 16:48:53,524 saving best model
2023-10-18 16:48:53,558 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:53,838 epoch 7 - iter 12/121 - loss 0.50523558 - time (sec): 0.28 - samples/sec: 9255.92 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:54,101 epoch 7 - iter 24/121 - loss 0.52339164 - time (sec): 0.54 - samples/sec: 8657.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:48:54,392 epoch 7 - iter 36/121 - loss 0.48059441 - time (sec): 0.83 - samples/sec: 8492.88 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:54,658 epoch 7 - iter 48/121 - loss 0.46494498 - time (sec): 1.10 - samples/sec: 8560.68 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:54,908 epoch 7 - iter 60/121 - loss 0.45264593 - time (sec): 1.35 - samples/sec: 8831.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:48:55,174 epoch 7 - iter 72/121 - loss 0.43933276 - time (sec): 1.62 - samples/sec: 8974.31 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:55,445 epoch 7 - iter 84/121 - loss 0.43595279 - time (sec): 1.89 - samples/sec: 8984.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:55,726 epoch 7 - iter 96/121 - loss 0.43515044 - time (sec): 2.17 - samples/sec: 9174.93 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:48:56,000 epoch 7 - iter 108/121 - loss 0.43071109 - time (sec): 2.44 - samples/sec: 9085.00 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:56,270 epoch 7 - iter 120/121 - loss 0.43660508 - time (sec): 2.71 - samples/sec: 9052.66 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:56,291 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:56,291 EPOCH 7 done: loss 0.4377 - lr: 0.000010
2023-10-18 16:48:56,717 DEV : loss 0.34206482768058777 - f1-score (micro avg) 0.4824
2023-10-18 16:48:56,722 saving best model
2023-10-18 16:48:56,754 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:57,038 epoch 8 - iter 12/121 - loss 0.39657285 - time (sec): 0.28 - samples/sec: 9431.46 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:48:57,301 epoch 8 - iter 24/121 - loss 0.41035317 - time (sec): 0.55 - samples/sec: 9594.04 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:57,576 epoch 8 - iter 36/121 - loss 0.43193841 - time (sec): 0.82 - samples/sec: 9142.53 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:57,855 epoch 8 - iter 48/121 - loss 0.42462507 - time (sec): 1.10 - samples/sec: 8958.80 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:48:58,128 epoch 8 - iter 60/121 - loss 0.42070240 - time (sec): 1.37 - samples/sec: 9029.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:58,400 epoch 8 - iter 72/121 - loss 0.43213186 - time (sec): 1.64 - samples/sec: 9184.13 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:58,659 epoch 8 - iter 84/121 - loss 0.43272228 - time (sec): 1.90 - samples/sec: 9123.10 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:58,943 epoch 8 - iter 96/121 - loss 0.42418418 - time (sec): 2.19 - samples/sec: 9023.54 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:48:59,233 epoch 8 - iter 108/121 - loss 0.42133625 - time (sec): 2.48 - samples/sec: 8978.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:48:59,501 epoch 8 - iter 120/121 - loss 0.42313856 - time (sec): 2.75 - samples/sec: 8956.45 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:48:59,521 ----------------------------------------------------------------------------------------------------
2023-10-18 16:48:59,521 EPOCH 8 done: loss 0.4254 - lr: 0.000007
2023-10-18 16:48:59,953 DEV : loss 0.33342665433883667 - f1-score (micro avg) 0.4978
2023-10-18 16:48:59,958 saving best model
2023-10-18 16:48:59,989 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:00,250 epoch 9 - iter 12/121 - loss 0.38063288 - time (sec): 0.26 - samples/sec: 7829.16 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:00,517 epoch 9 - iter 24/121 - loss 0.39518238 - time (sec): 0.53 - samples/sec: 8546.75 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:00,801 epoch 9 - iter 36/121 - loss 0.41125975 - time (sec): 0.81 - samples/sec: 8801.67 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:01,072 epoch 9 - iter 48/121 - loss 0.42758374 - time (sec): 1.08 - samples/sec: 8857.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:49:01,308 epoch 9 - iter 60/121 - loss 0.43010899 - time (sec): 1.32 - samples/sec: 9144.43 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:49:01,487 epoch 9 - iter 72/121 - loss 0.41322277 - time (sec): 1.50 - samples/sec: 9543.12 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:49:01,756 epoch 9 - iter 84/121 - loss 0.41753163 - time (sec): 1.77 - samples/sec: 9339.69 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:49:02,050 epoch 9 - iter 96/121 - loss 0.41759424 - time (sec): 2.06 - samples/sec: 9336.66 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:49:02,327 epoch 9 - iter 108/121 - loss 0.41214860 - time (sec): 2.34 - samples/sec: 9410.62 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:49:02,609 epoch 9 - iter 120/121 - loss 0.40768587 - time (sec): 2.62 - samples/sec: 9430.79 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:49:02,630 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:02,630 EPOCH 9 done: loss 0.4079 - lr: 0.000004
2023-10-18 16:49:03,057 DEV : loss 0.3280840814113617 - f1-score (micro avg) 0.5015
2023-10-18 16:49:03,061 saving best model
2023-10-18 16:49:03,093 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:03,354 epoch 10 - iter 12/121 - loss 0.36093503 - time (sec): 0.26 - samples/sec: 8626.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:49:03,619 epoch 10 - iter 24/121 - loss 0.36817826 - time (sec): 0.53 - samples/sec: 8930.16 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:49:03,896 epoch 10 - iter 36/121 - loss 0.38006792 - time (sec): 0.80 - samples/sec: 8872.17 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:49:04,162 epoch 10 - iter 48/121 - loss 0.37238255 - time (sec): 1.07 - samples/sec: 9074.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:49:04,428 epoch 10 - iter 60/121 - loss 0.39144881 - time (sec): 1.33 - samples/sec: 9013.20 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:49:04,698 epoch 10 - iter 72/121 - loss 0.40218852 - time (sec): 1.60 - samples/sec: 9206.61 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:49:04,990 epoch 10 - iter 84/121 - loss 0.38920723 - time (sec): 1.90 - samples/sec: 9100.69 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:49:05,255 epoch 10 - iter 96/121 - loss 0.39328880 - time (sec): 2.16 - samples/sec: 9091.63 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:49:05,531 epoch 10 - iter 108/121 - loss 0.39338392 - time (sec): 2.44 - samples/sec: 9085.86 - lr: 0.000001 - momentum: 0.000000
2023-10-18
16:49:05,795 epoch 10 - iter 120/121 - loss 0.40198011 - time (sec): 2.70 - samples/sec: 9092.62 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:49:05,814 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:05,814 EPOCH 10 done: loss 0.4003 - lr: 0.000000
2023-10-18 16:49:06,241 DEV : loss 0.32836541533470154 - f1-score (micro avg) 0.4978
2023-10-18 16:49:06,276 ----------------------------------------------------------------------------------------------------
2023-10-18 16:49:06,276 Loading model from best epoch ...
2023-10-18 16:49:06,355 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:49:06,782 Results:
- F-score (micro) 0.4535
- F-score (macro) 0.2114
- Accuracy 0.3051

By class:
              precision    recall  f1-score   support

        pers     0.6099    0.6187    0.6143       139
       scope     0.4361    0.4496    0.4427       129
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.5236    0.4000    0.4535       360
   macro avg     0.2092    0.2137    0.2114       360
weighted avg     0.3918    0.4000    0.3958       360

2023-10-18 16:49:06,782 ----------------------------------------------------------------------------------------------------