2023-10-18 14:37:23,776 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,776 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Train: 1100 sentences
2023-10-18 14:37:23,777 (train_with_dev=False, train_with_test=False)
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Training Params:
2023-10-18 14:37:23,777 - learning_rate: "5e-05"
2023-10-18 14:37:23,777 - mini_batch_size: "4"
2023-10-18 14:37:23,777 - max_epochs: "10"
2023-10-18 14:37:23,777 - shuffle: "True"
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Plugins:
2023-10-18 14:37:23,777 - TensorboardLogger
2023-10-18 14:37:23,777 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:37:23,777 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Computation:
2023-10-18 14:37:23,777 - compute on device: cuda:0
2023-10-18 14:37:23,777 - embedding storage: none
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:23,777 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:37:24,249 epoch 1 - iter 27/275 - loss 3.44506819 - time (sec): 0.47 - samples/sec: 4697.62 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:37:24,721 epoch 1 - iter 54/275 - loss 3.48067933 - time (sec): 0.94 - samples/sec: 4744.75 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:37:25,162 epoch 1 - iter 81/275 - loss 3.35504312 - time (sec): 1.38 - samples/sec: 4992.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:37:25,566 epoch 1 - iter 108/275 - loss 3.14821801 - time (sec): 1.79 - samples/sec: 5106.42 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:37:25,958 epoch 1 - iter 135/275 - loss 2.93654229 - time (sec): 2.18 - samples/sec: 5199.03 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:37:26,361 epoch 1 - iter 162/275 - loss 2.70077102 - time (sec): 2.58 - samples/sec: 5277.96 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:37:26,773 epoch 1 - iter 189/275 - loss 2.49256861 - time (sec): 3.00 - samples/sec: 5340.80 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:37:27,172 epoch 1 - iter 216/275 - loss 2.32414752 - time (sec): 3.39 - samples/sec: 5335.46 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:37:27,580 epoch 1 - iter 243/275 - loss 2.16875250 - time (sec): 3.80 - samples/sec: 5329.08 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:37:27,981 epoch 1 - iter 270/275 - loss 2.06213790 - time (sec): 4.20 - samples/sec: 5304.75 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:37:28,055 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:28,056 EPOCH 1 done: loss 2.0352 - lr: 0.000049
2023-10-18 14:37:28,297 DEV : loss 0.7585274577140808 - f1-score (micro avg) 0.0
2023-10-18 14:37:28,301 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:28,727 epoch 2 - iter 27/275 - loss 0.91815449 - time (sec): 0.43 - samples/sec: 5672.01 - lr: 0.000049 - momentum: 0.000000
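Note on the lr column: it is driven by the LinearScheduler plugin listed in the header (warmup_fraction: '0.1'). With 275 batches per epoch over 10 epochs (2750 steps total), the learning rate ramps linearly from 0 to the 5e-05 peak during the first 10% of steps (one epoch), then decays linearly to 0, which is why epoch 1 climbs from 0.000005 to 0.000049 while later epochs tick down. A minimal sketch of that schedule, assuming the standard warmup-then-linear-decay formula (the function name is illustrative, not Flair API):

```python
def linear_schedule_lr(step, peak_lr=5e-05, total_steps=2750, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 275 steps = 1 epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # warmup phase
    # decay phase: linearly down to 0 at total_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Epoch 1, iter 27  (global step 27):        ~0.000005, as logged
# Epoch 1, iter 270 (global step 270):       ~0.000049, as logged
# Epoch 2, iter 270 (global step 275 + 270): ~0.000045, as logged
```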
2023-10-18 14:37:29,137 epoch 2 - iter 54/275 - loss 0.91759626 - time (sec): 0.84 - samples/sec: 5681.95 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:37:29,537 epoch 2 - iter 81/275 - loss 0.86900748 - time (sec): 1.24 - samples/sec: 5630.96 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:37:29,943 epoch 2 - iter 108/275 - loss 0.89004770 - time (sec): 1.64 - samples/sec: 5689.86 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:37:30,347 epoch 2 - iter 135/275 - loss 0.85756509 - time (sec): 2.05 - samples/sec: 5681.95 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:37:30,763 epoch 2 - iter 162/275 - loss 0.81972010 - time (sec): 2.46 - samples/sec: 5549.23 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:37:31,218 epoch 2 - iter 189/275 - loss 0.80509335 - time (sec): 2.92 - samples/sec: 5507.30 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:37:31,622 epoch 2 - iter 216/275 - loss 0.78494681 - time (sec): 3.32 - samples/sec: 5519.16 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:37:32,015 epoch 2 - iter 243/275 - loss 0.75671714 - time (sec): 3.71 - samples/sec: 5462.01 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:37:32,420 epoch 2 - iter 270/275 - loss 0.74937046 - time (sec): 4.12 - samples/sec: 5447.96 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:37:32,496 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:32,496 EPOCH 2 done: loss 0.7456 - lr: 0.000045
2023-10-18 14:37:32,865 DEV : loss 0.5183359384536743 - f1-score (micro avg) 0.2379
2023-10-18 14:37:32,870 saving best model
2023-10-18 14:37:32,904 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:33,330 epoch 3 - iter 27/275 - loss 0.51579331 - time (sec): 0.43 - samples/sec: 4937.77 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:37:33,774 epoch 3 - iter 54/275 - loss 0.57900133 - time (sec): 0.87 - samples/sec: 5154.23 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:37:34,193 epoch 3 - iter 81/275 - loss 0.57979204 - time (sec): 1.29 - samples/sec: 5237.45 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:37:34,607 epoch 3 - iter 108/275 - loss 0.55881669 - time (sec): 1.70 - samples/sec: 5308.82 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:37:35,022 epoch 3 - iter 135/275 - loss 0.56674469 - time (sec): 2.12 - samples/sec: 5287.09 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:37:35,448 epoch 3 - iter 162/275 - loss 0.55143427 - time (sec): 2.54 - samples/sec: 5302.35 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:37:35,866 epoch 3 - iter 189/275 - loss 0.54519733 - time (sec): 2.96 - samples/sec: 5356.23 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:37:36,281 epoch 3 - iter 216/275 - loss 0.54485920 - time (sec): 3.38 - samples/sec: 5306.81 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:37:36,693 epoch 3 - iter 243/275 - loss 0.54816656 - time (sec): 3.79 - samples/sec: 5334.90 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:37:37,096 epoch 3 - iter 270/275 - loss 0.54292207 - time (sec): 4.19 - samples/sec: 5337.56 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:37:37,169 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:37,169 EPOCH 3 done: loss 0.5490 - lr: 0.000039
2023-10-18 14:37:37,657 DEV : loss 0.3785657584667206 - f1-score (micro avg) 0.4435
2023-10-18 14:37:37,661 saving best model
2023-10-18 14:37:37,698 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:38,103 epoch 4 - iter 27/275 - loss 0.43794792 - time (sec): 0.40 - samples/sec: 5198.77 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:37:38,518 epoch 4 - iter 54/275 - loss 0.45653236 - time (sec): 0.82 - samples/sec: 5273.23 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:37:38,948 epoch 4 - iter 81/275 - loss 0.48094814 - time (sec): 1.25 - samples/sec: 5270.53 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:37:39,368 epoch 4 - iter 108/275 - loss 0.50124443 - time (sec): 1.67 - samples/sec: 5394.21 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:37:39,767 epoch 4 - iter 135/275 - loss 0.49961782 - time (sec): 2.07 - samples/sec: 5281.28 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:37:40,175 epoch 4 - iter 162/275 - loss 0.48466438 - time (sec): 2.48 - samples/sec: 5341.82 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:37:40,581 epoch 4 - iter 189/275 - loss 0.46275687 - time (sec): 2.88 - samples/sec: 5409.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:37:40,956 epoch 4 - iter 216/275 - loss 0.46090295 - time (sec): 3.26 - samples/sec: 5546.22 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:37:41,323 epoch 4 - iter 243/275 - loss 0.44519482 - time (sec): 3.62 - samples/sec: 5529.30 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:37:41,686 epoch 4 - iter 270/275 - loss 0.44802928 - time (sec): 3.99 - samples/sec: 5565.68 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:37:41,760 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:41,760 EPOCH 4 done: loss 0.4482 - lr: 0.000034
2023-10-18 14:37:42,131 DEV : loss 0.3344648480415344 - f1-score (micro avg) 0.5163
2023-10-18 14:37:42,135 saving best model
2023-10-18 14:37:42,171 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:42,558 epoch 5 - iter 27/275 - loss 0.39398986 - time (sec): 0.39 - samples/sec: 5498.62 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:37:42,972 epoch 5 - iter 54/275 - loss 0.39201972 - time (sec): 0.80 - samples/sec: 5477.25 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:37:43,378 epoch 5 - iter 81/275 - loss 0.42707050 - time (sec): 1.21 - samples/sec: 5621.19 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:37:43,805 epoch 5 - iter 108/275 - loss 0.41357894 - time (sec): 1.63 - samples/sec: 5534.01 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:37:44,199 epoch 5 - iter 135/275 - loss 0.39300572 - time (sec): 2.03 - samples/sec: 5439.96 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:37:44,616 epoch 5 - iter 162/275 - loss 0.41764628 - time (sec): 2.44 - samples/sec: 5467.24 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:37:45,028 epoch 5 - iter 189/275 - loss 0.40744829 - time (sec): 2.86 - samples/sec: 5438.63 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:37:45,450 epoch 5 - iter 216/275 - loss 0.40549747 - time (sec): 3.28 - samples/sec: 5462.00 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:37:45,854 epoch 5 - iter 243/275 - loss 0.41187670 - time (sec): 3.68 - samples/sec: 5533.97 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:37:46,258 epoch 5 - iter 270/275 - loss 0.40473354 - time (sec): 4.09 - samples/sec: 5461.03 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:37:46,335 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:46,335 EPOCH 5 done: loss 0.4055 - lr: 0.000028
2023-10-18 14:37:46,704 DEV : loss 0.3040231168270111 - f1-score (micro avg) 0.5263
2023-10-18 14:37:46,708 saving best model
2023-10-18 14:37:46,743 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:47,162 epoch 6 - iter 27/275 - loss 0.28318195 - time (sec): 0.42 - samples/sec: 5428.70 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:37:47,570 epoch 6 - iter 54/275 - loss 0.32325644 - time (sec): 0.83 - samples/sec: 5290.25 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:37:47,978 epoch 6 - iter 81/275 - loss 0.32233930 - time (sec): 1.23 - samples/sec: 5357.66 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:37:48,379 epoch 6 - iter 108/275 - loss 0.35102071 - time (sec): 1.64 - samples/sec: 5391.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:37:48,808 epoch 6 - iter 135/275 - loss 0.35497201 - time (sec): 2.06 - samples/sec: 5496.09 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:37:49,214 epoch 6 - iter 162/275 - loss 0.34305539 - time (sec): 2.47 - samples/sec: 5510.79 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:37:49,627 epoch 6 - iter 189/275 - loss 0.34058497 - time (sec): 2.88 - samples/sec: 5503.56 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:37:50,028 epoch 6 - iter 216/275 - loss 0.34478547 - time (sec): 3.28 - samples/sec: 5486.92 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:37:50,437 epoch 6 - iter 243/275 - loss 0.35330126 - time (sec): 3.69 - samples/sec: 5497.16 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:37:50,847 epoch 6 - iter 270/275 - loss 0.35357063 - time (sec): 4.10 - samples/sec: 5479.72 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:37:50,918 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:50,918 EPOCH 6 done: loss 0.3545 - lr: 0.000022
2023-10-18 14:37:51,290 DEV : loss 0.2817199230194092 - f1-score (micro avg) 0.582
2023-10-18 14:37:51,294 saving best model
2023-10-18 14:37:51,327 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:51,697 epoch 7 - iter 27/275 - loss 0.40784625 - time (sec): 0.37 - samples/sec: 6641.22 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:37:52,076 epoch 7 - iter 54/275 - loss 0.37140173 - time (sec): 0.75 - samples/sec: 6025.89 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:37:52,448 epoch 7 - iter 81/275 - loss 0.36783411 - time (sec): 1.12 - samples/sec: 6035.72 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:37:52,817 epoch 7 - iter 108/275 - loss 0.36656962 - time (sec): 1.49 - samples/sec: 6007.96 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:37:53,183 epoch 7 - iter 135/275 - loss 0.36831965 - time (sec): 1.86 - samples/sec: 6038.79 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:37:53,550 epoch 7 - iter 162/275 - loss 0.36514824 - time (sec): 2.22 - samples/sec: 5963.27 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:37:53,926 epoch 7 - iter 189/275 - loss 0.36347761 - time (sec): 2.60 - samples/sec: 6010.83 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:37:54,295 epoch 7 - iter 216/275 - loss 0.35660454 - time (sec): 2.97 - samples/sec: 5958.45 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:37:54,697 epoch 7 - iter 243/275 - loss 0.34868514 - time (sec): 3.37 - samples/sec: 5917.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:37:55,098 epoch 7 - iter 270/275 - loss 0.34187809 - time (sec): 3.77 - samples/sec: 5905.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:37:55,174 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:55,174 EPOCH 7 done: loss 0.3399 - lr: 0.000017
2023-10-18 14:37:55,550 DEV : loss 0.2742369771003723 - f1-score (micro avg) 0.6027
2023-10-18 14:37:55,554 saving best model
2023-10-18 14:37:55,588 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:55,998 epoch 8 - iter 27/275 - loss 0.34922028 - time (sec): 0.41 - samples/sec: 5658.67 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:37:56,418 epoch 8 - iter 54/275 - loss 0.33758198 - time (sec): 0.83 - samples/sec: 5659.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:37:56,824 epoch 8 - iter 81/275 - loss 0.33718204 - time (sec): 1.24 - samples/sec: 5499.40 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:37:57,233 epoch 8 - iter 108/275 - loss 0.33915391 - time (sec): 1.64 - samples/sec: 5656.47 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:37:57,631 epoch 8 - iter 135/275 - loss 0.32312804 - time (sec): 2.04 - samples/sec: 5699.77 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:37:58,035 epoch 8 - iter 162/275 - loss 0.32118629 - time (sec): 2.45 - samples/sec: 5671.92 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:37:58,438 epoch 8 - iter 189/275 - loss 0.32178458 - time (sec): 2.85 - samples/sec: 5539.65 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:37:58,846 epoch 8 - iter 216/275 - loss 0.33027699 - time (sec): 3.26 - samples/sec: 5503.73 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:37:59,267 epoch 8 - iter 243/275 - loss 0.32755976 - time (sec): 3.68 - samples/sec: 5494.84 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:37:59,680 epoch 8 - iter 270/275 - loss 0.32156494 - time (sec): 4.09 - samples/sec: 5453.56 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:37:59,759 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:59,759 EPOCH 8 done: loss 0.3217 - lr: 0.000011
2023-10-18 14:38:00,145 DEV : loss 0.2673712968826294 - f1-score (micro avg) 0.6034
2023-10-18 14:38:00,150 saving best model
2023-10-18 14:38:00,184 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:00,594 epoch 9 - iter 27/275 - loss 0.32937183 - time (sec): 0.41 - samples/sec: 5143.51 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:38:00,991 epoch 9 - iter 54/275 - loss 0.31536696 - time (sec): 0.81 - samples/sec: 5353.34 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:38:01,407 epoch 9 - iter 81/275 - loss 0.31309459 - time (sec): 1.22 - samples/sec: 5425.99 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:38:01,823 epoch 9 - iter 108/275 - loss 0.31256041 - time (sec): 1.64 - samples/sec: 5424.62 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:02,228 epoch 9 - iter 135/275 - loss 0.31961799 - time (sec): 2.04 - samples/sec: 5404.80 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:02,645 epoch 9 - iter 162/275 - loss 0.32950955 - time (sec): 2.46 - samples/sec: 5390.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:38:03,059 epoch 9 - iter 189/275 - loss 0.31728190 - time (sec): 2.88 - samples/sec: 5495.58 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:38:03,467 epoch 9 - iter 216/275 - loss 0.31161439 - time (sec): 3.28 - samples/sec: 5474.93 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:38:03,880 epoch 9 - iter 243/275 - loss 0.30658292 - time (sec): 3.70 - samples/sec: 5483.54 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:38:04,285 epoch 9 - iter 270/275 - loss 0.30326656 - time (sec): 4.10 - samples/sec: 5474.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:38:04,355 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:04,355 EPOCH 9 done: loss 0.3037 - lr: 0.000006
2023-10-18 14:38:04,734 DEV : loss 0.26037830114364624 - f1-score (micro avg) 0.6202
2023-10-18 14:38:04,738 saving best model
2023-10-18 14:38:04,773 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:05,188 epoch 10 - iter 27/275 - loss 0.26237248 - time (sec): 0.41 - samples/sec: 5560.38 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:05,594 epoch 10 - iter 54/275 - loss 0.27646884 - time (sec): 0.82 - samples/sec: 5747.06 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:06,004 epoch 10 - iter 81/275 - loss 0.29469659 - time (sec): 1.23 - samples/sec: 5496.26 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:38:06,410 epoch 10 - iter 108/275 - loss 0.29742969 - time (sec): 1.64 - samples/sec: 5426.51 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:38:06,821 epoch 10 - iter 135/275 - loss 0.29161883 - time (sec): 2.05 - samples/sec: 5403.54 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:38:07,230 epoch 10 - iter 162/275 - loss 0.29003134 - time (sec): 2.46 - samples/sec: 5428.83 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:38:07,648 epoch 10 - iter 189/275 - loss 0.29233717 - time (sec): 2.87 - samples/sec: 5461.10 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:38:08,066 epoch 10 - iter 216/275 - loss 0.30332708 - time (sec): 3.29 - samples/sec: 5482.12 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:38:08,474 epoch 10 - iter 243/275 - loss 0.30305464 - time (sec): 3.70 - samples/sec: 5426.92 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:38:08,881 epoch 10 - iter 270/275 - loss 0.30092721 - time (sec): 4.11 - samples/sec: 5437.05 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:38:08,957 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:08,957 EPOCH 10 done: loss 0.3012 - lr: 0.000000
2023-10-18 14:38:09,329 DEV : loss 0.25936517119407654 - f1-score (micro avg) 0.6226
2023-10-18 14:38:09,334 saving best model
2023-10-18 14:38:09,395 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:09,395 Loading model from best epoch ...
2023-10-18 14:38:09,473 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:38:09,782 Results:
- F-score (micro) 0.6523
- F-score (macro) 0.3883
- Accuracy 0.4942

By class:
              precision    recall  f1-score   support
       scope     0.5862    0.6761    0.6280       176
        pers     0.8716    0.7422    0.8017       128
        work     0.4574    0.5811    0.5119        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2
   micro avg     0.6330    0.6728    0.6523       382
   macro avg     0.3830    0.3999    0.3883       382
weighted avg     0.6507    0.6728    0.6571       382

2023-10-18 14:38:09,782 ----------------------------------------------------------------------------------------------------
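The macro and weighted averages in the final table follow the usual aggregation rules and can be reproduced from the per-class rows; the micro F1 is the harmonic mean of the logged micro precision and recall. A quick sanity check, using only numbers taken from the table above:

```python
# Per-class (f1-score, support) pairs from the final evaluation table.
per_class = {
    "scope":  (0.6280, 176),
    "pers":   (0.8017, 128),
    "work":   (0.5119, 74),
    "object": (0.0000, 2),
    "loc":    (0.0000, 2),
}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support

# Micro F1: harmonic mean of the logged micro-avg precision and recall.
micro_p, micro_r = 0.6330, 0.6728
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# -> 0.3883 0.6571 0.6523, matching the logged averages
```

The rounded results agree with the logged macro (0.3883), weighted (0.6571), and micro (0.6523) F1 scores, which confirms the table's internal consistency; note the zero-support-weighted object and loc classes (2 instances each) are what pull the macro average far below the micro average.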