2023-10-18 16:41:17,095 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,095 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:41:17,095 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,095 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:41:17,095 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,095 Train:  966 sentences
2023-10-18 16:41:17,095         (train_with_dev=False, train_with_test=False)
2023-10-18 16:41:17,095 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,095 Training Params:
2023-10-18 16:41:17,095  - learning_rate: "3e-05"
2023-10-18 16:41:17,095  - mini_batch_size: "4"
2023-10-18 16:41:17,095  - max_epochs: "10"
2023-10-18 16:41:17,096  - shuffle: "True"
2023-10-18 16:41:17,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,096 Plugins:
2023-10-18 16:41:17,096  - TensorboardLogger
2023-10-18 16:41:17,096  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:41:17,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,096 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:41:17,096  - metric: "('micro avg', 'f1-score')"
2023-10-18 16:41:17,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,096 Computation:
2023-10-18 16:41:17,096  - compute on device: cuda:0
2023-10-18 16:41:17,096  - embedding storage: none
2023-10-18 16:41:17,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,096 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 16:41:17,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:17,096 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:41:17,492 epoch 1 - iter 24/242 - loss 3.97566800 - time (sec): 0.40 - samples/sec: 6562.42 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:41:17,862 epoch 1 - iter 48/242 - loss 3.88728222 - time (sec): 0.77 - samples/sec: 6600.90 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:41:18,251 epoch 1 - iter 72/242 - loss 3.77372937 - time (sec): 1.15 - samples/sec: 6685.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:41:18,629 epoch 1 - iter 96/242 - loss 3.66245852 - time (sec): 1.53 - samples/sec: 6686.47 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:41:18,999 epoch 1 - iter 120/242 - loss 3.49968645 - time (sec): 1.90 - samples/sec: 6637.90 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:41:19,366 epoch 1 - iter 144/242 - loss 3.34214291 - time (sec): 2.27 - samples/sec: 6474.08 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:41:19,746 epoch 1 - iter 168/242 - loss 3.13908340 - time (sec): 2.65 - samples/sec: 6418.52 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:41:20,131 epoch 1 - iter 192/242 - loss 2.88468383 - time (sec): 3.03 - samples/sec: 6494.93 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:41:20,503 epoch 1 - iter 216/242 - loss 2.68546128 - time (sec): 3.41 - samples/sec: 6491.83 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:41:20,874 epoch 1 - iter 240/242 - loss 2.51325016 - time (sec): 3.78 - samples/sec: 6507.60 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:41:20,902 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:20,902 EPOCH 1 done: loss 2.5062 - lr: 0.000030
2023-10-18 16:41:21,427 DEV : loss 0.672512412071228 - f1-score (micro avg)  0.0
2023-10-18 16:41:21,432 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:21,828 epoch 2 - iter 24/242 - loss 0.70009065 - time (sec): 0.40 - samples/sec: 7061.35 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:41:22,203 epoch 2 - iter 48/242 - loss 0.73593652 - time (sec): 0.77 - samples/sec: 7027.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:41:22,560 epoch 2 - iter 72/242 - loss 0.76919581 - time (sec): 1.13 - samples/sec: 6681.10 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:41:22,935 epoch 2 - iter 96/242 - loss 0.74922317 - time (sec): 1.50 - samples/sec: 6547.15 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:41:23,310 epoch 2 - iter 120/242 - loss 0.74285586 - time (sec): 1.88 - samples/sec: 6512.27 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:41:23,682 epoch 2 - iter 144/242 - loss 0.72485842 - time (sec): 2.25 - samples/sec: 6491.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:41:24,062 epoch 2 - iter 168/242 - loss 0.71444080 - time (sec): 2.63 - samples/sec: 6570.51 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:41:24,436 epoch 2 - iter 192/242 - loss 0.69081863 - time (sec): 3.00 - samples/sec: 6570.46 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:41:24,806 epoch 2 - iter 216/242 - loss 0.68555369 - time (sec): 3.37 - samples/sec: 6555.36 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:41:25,175 epoch 2 - iter 240/242 - loss 0.68707248 - time (sec): 3.74 - samples/sec: 6577.40 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:41:25,197 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:25,197 EPOCH 2 done: loss 0.6867 - lr: 0.000027
2023-10-18 16:41:25,622 DEV : loss 0.5189130902290344 - f1-score (micro avg)  0.0102
2023-10-18 16:41:25,626 saving best model
2023-10-18 16:41:25,654 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:26,035 epoch 3 - iter 24/242 - loss 0.50361751 - time (sec): 0.38 - samples/sec: 7122.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:41:26,403 epoch 3 - iter 48/242 - loss 0.56122713 - time (sec): 0.75 - samples/sec: 6650.81 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:41:26,779 epoch 3 - iter 72/242 - loss 0.56475439 - time (sec): 1.12 - samples/sec: 6714.10 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:41:27,161 epoch 3 - iter 96/242 - loss 0.58682047 - time (sec): 1.51 - samples/sec: 6563.41 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:41:27,545 epoch 3 - iter 120/242 - loss 0.58972486 - time (sec): 1.89 - samples/sec: 6696.26 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:41:27,915 epoch 3 - iter 144/242 - loss 0.58496460 - time (sec): 2.26 - samples/sec: 6579.90 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:41:28,283 epoch 3 - iter 168/242 - loss 0.56924852 - time (sec): 2.63 - samples/sec: 6496.92 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:41:28,670 epoch 3 - iter 192/242 - loss 0.55364809 - time (sec): 3.01 - samples/sec: 6533.37 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:41:29,051 epoch 3 - iter 216/242 - loss 0.55155880 - time (sec): 3.40 - samples/sec: 6574.07 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:41:29,421 epoch 3 - iter 240/242 - loss 0.54953543 - time (sec): 3.77 - samples/sec: 6533.68 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:41:29,453 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:29,453 EPOCH 3 done: loss 0.5477 - lr: 0.000023
2023-10-18 16:41:29,873 DEV : loss 0.4237130582332611 - f1-score (micro avg)  0.2653
2023-10-18 16:41:29,877 saving best model
2023-10-18 16:41:29,917 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:30,300 epoch 4 - iter 24/242 - loss 0.48364144 - time (sec): 0.38 - samples/sec: 6912.53 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:41:30,686 epoch 4 - iter 48/242 - loss 0.51994805 - time (sec): 0.77 - samples/sec: 6683.86 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:41:31,057 epoch 4 - iter 72/242 - loss 0.52784469 - time (sec): 1.14 - samples/sec: 6618.16 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:41:31,451 epoch 4 - iter 96/242 - loss 0.51164609 - time (sec): 1.53 - samples/sec: 6664.78 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:41:31,839 epoch 4 - iter 120/242 - loss 0.50961125 - time (sec): 1.92 - samples/sec: 6663.72 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:41:32,219 epoch 4 - iter 144/242 - loss 0.50995486 - time (sec): 2.30 - samples/sec: 6634.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:41:32,598 epoch 4 - iter 168/242 - loss 0.49633304 - time (sec): 2.68 - samples/sec: 6612.18 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:41:32,956 epoch 4 - iter 192/242 - loss 0.49982263 - time (sec): 3.04 - samples/sec: 6567.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:41:33,324 epoch 4 - iter 216/242 - loss 0.48995788 - time (sec): 3.41 - samples/sec: 6516.65 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:41:33,709 epoch 4 - iter 240/242 - loss 0.48290903 - time (sec): 3.79 - samples/sec: 6481.37 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:41:33,739 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:33,739 EPOCH 4 done: loss 0.4828 - lr: 0.000020
2023-10-18 16:41:34,170 DEV : loss 0.36858034133911133 - f1-score (micro avg)  0.3862
2023-10-18 16:41:34,174 saving best model
2023-10-18 16:41:34,207 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:34,574 epoch 5 - iter 24/242 - loss 0.35792216 - time (sec): 0.37 - samples/sec: 6812.29 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:41:34,940 epoch 5 - iter 48/242 - loss 0.39622776 - time (sec): 0.73 - samples/sec: 6789.52 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:41:35,299 epoch 5 - iter 72/242 - loss 0.40616871 - time (sec): 1.09 - samples/sec: 6647.08 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:41:35,677 epoch 5 - iter 96/242 - loss 0.42290359 - time (sec): 1.47 - samples/sec: 6588.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:41:36,052 epoch 5 - iter 120/242 - loss 0.42158650 - time (sec): 1.84 - samples/sec: 6590.18 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:41:36,426 epoch 5 - iter 144/242 - loss 0.42428076 - time (sec): 2.22 - samples/sec: 6659.62 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:41:36,803 epoch 5 - iter 168/242 - loss 0.43224460 - time (sec): 2.60 - samples/sec: 6627.02 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:41:37,187 epoch 5 - iter 192/242 - loss 0.42892199 - time (sec): 2.98 - samples/sec: 6572.09 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:41:37,570 epoch 5 - iter 216/242 - loss 0.43227011 - time (sec): 3.36 - samples/sec: 6481.55 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:41:37,949 epoch 5 - iter 240/242 - loss 0.42801633 - time (sec): 3.74 - samples/sec: 6589.80 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:41:37,975 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:37,975 EPOCH 5 done: loss 0.4281 - lr: 0.000017
2023-10-18 16:41:38,412 DEV : loss 0.3335218131542206 - f1-score (micro avg)  0.4482
2023-10-18 16:41:38,417 saving best model
2023-10-18 16:41:38,451 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:38,831 epoch 6 - iter 24/242 - loss 0.35305446 - time (sec): 0.38 - samples/sec: 6587.87 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:41:39,216 epoch 6 - iter 48/242 - loss 0.41778200 - time (sec): 0.76 - samples/sec: 6451.22 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:41:39,591 epoch 6 - iter 72/242 - loss 0.42746455 - time (sec): 1.14 - samples/sec: 6339.50 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:41:39,982 epoch 6 - iter 96/242 - loss 0.42763802 - time (sec): 1.53 - samples/sec: 6239.02 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:41:40,362 epoch 6 - iter 120/242 - loss 0.43815673 - time (sec): 1.91 - samples/sec: 6355.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:41:40,745 epoch 6 - iter 144/242 - loss 0.42172564 - time (sec): 2.29 - samples/sec: 6452.34 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:41:41,145 epoch 6 - iter 168/242 - loss 0.41527328 - time (sec): 2.69 - samples/sec: 6478.71 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:41:41,505 epoch 6 - iter 192/242 - loss 0.41397657 - time (sec): 3.05 - samples/sec: 6464.34 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:41:41,873 epoch 6 - iter 216/242 - loss 0.41130689 - time (sec): 3.42 - samples/sec: 6474.42 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:41:42,250 epoch 6 - iter 240/242 - loss 0.39981695 - time (sec): 3.80 - samples/sec: 6480.92 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:41:42,278 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:42,278 EPOCH 6 done: loss 0.3997 - lr: 0.000013
2023-10-18 16:41:42,710 DEV : loss 0.3300015926361084 - f1-score (micro avg)  0.4378
2023-10-18 16:41:42,715 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:43,048 epoch 7 - iter 24/242 - loss 0.42862414 - time (sec): 0.33 - samples/sec: 6646.57 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:41:43,376 epoch 7 - iter 48/242 - loss 0.37114435 - time (sec): 0.66 - samples/sec: 7202.03 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:41:43,711 epoch 7 - iter 72/242 - loss 0.37215748 - time (sec): 1.00 - samples/sec: 7154.33 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:41:44,057 epoch 7 - iter 96/242 - loss 0.38837681 - time (sec): 1.34 - samples/sec: 7251.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:41:44,401 epoch 7 - iter 120/242 - loss 0.38195453 - time (sec): 1.69 - samples/sec: 7437.50 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:41:44,731 epoch 7 - iter 144/242 - loss 0.37512029 - time (sec): 2.02 - samples/sec: 7301.71 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:41:45,073 epoch 7 - iter 168/242 - loss 0.37590169 - time (sec): 2.36 - samples/sec: 7356.38 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:41:45,409 epoch 7 - iter 192/242 - loss 0.38176902 - time (sec): 2.69 - samples/sec: 7310.49 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:41:45,745 epoch 7 - iter 216/242 - loss 0.38130426 - time (sec): 3.03 - samples/sec: 7285.49 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:41:46,081 epoch 7 - iter 240/242 - loss 0.37661077 - time (sec): 3.37 - samples/sec: 7311.89 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:41:46,106 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:46,106 EPOCH 7 done: loss 0.3766 - lr: 0.000010
2023-10-18 16:41:46,541 DEV : loss 0.30915191769599915 - f1-score (micro avg)  0.4514
2023-10-18 16:41:46,546 saving best model
2023-10-18 16:41:46,579 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:46,912 epoch 8 - iter 24/242 - loss 0.42549425 - time (sec): 0.33 - samples/sec: 6738.63 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:41:47,256 epoch 8 - iter 48/242 - loss 0.40114624 - time (sec): 0.68 - samples/sec: 6946.50 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:41:47,630 epoch 8 - iter 72/242 - loss 0.37937312 - time (sec): 1.05 - samples/sec: 6928.43 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:41:48,009 epoch 8 - iter 96/242 - loss 0.36570346 - time (sec): 1.43 - samples/sec: 6864.91 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:41:48,388 epoch 8 - iter 120/242 - loss 0.37366620 - time (sec): 1.81 - samples/sec: 6900.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:41:48,761 epoch 8 - iter 144/242 - loss 0.36450703 - time (sec): 2.18 - samples/sec: 6658.87 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:41:49,134 epoch 8 - iter 168/242 - loss 0.36780230 - time (sec): 2.55 - samples/sec: 6709.95 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:41:49,510 epoch 8 - iter 192/242 - loss 0.36502777 - time (sec): 2.93 - samples/sec: 6650.76 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:41:49,899 epoch 8 - iter 216/242 - loss 0.36138231 - time (sec): 3.32 - samples/sec: 6637.18 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:41:50,285 epoch 8 - iter 240/242 - loss 0.36260783 - time (sec): 3.71 - samples/sec: 6646.51 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:41:50,315 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:50,315 EPOCH 8 done: loss 0.3630 - lr: 0.000007
2023-10-18 16:41:50,740 DEV : loss 0.30288276076316833 - f1-score (micro avg)  0.4676
2023-10-18 16:41:50,745 saving best model
2023-10-18 16:41:50,777 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:51,149 epoch 9 - iter 24/242 - loss 0.32383020 - time (sec): 0.37 - samples/sec: 6857.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:41:51,512 epoch 9 - iter 48/242 - loss 0.38424255 - time (sec): 0.73 - samples/sec: 6700.43 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:41:51,891 epoch 9 - iter 72/242 - loss 0.37148297 - time (sec): 1.11 - samples/sec: 6455.79 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:41:52,281 epoch 9 - iter 96/242 - loss 0.35965702 - time (sec): 1.50 - samples/sec: 6341.74 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:41:52,652 epoch 9 - iter 120/242 - loss 0.35370300 - time (sec): 1.87 - samples/sec: 6244.81 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:41:53,041 epoch 9 - iter 144/242 - loss 0.35889500 - time (sec): 2.26 - samples/sec: 6447.29 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:41:53,427 epoch 9 - iter 168/242 - loss 0.34651646 - time (sec): 2.65 - samples/sec: 6518.22 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:41:53,793 epoch 9 - iter 192/242 - loss 0.35262267 - time (sec): 3.02 - samples/sec: 6519.84 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:41:54,158 epoch 9 - iter 216/242 - loss 0.35162943 - time (sec): 3.38 - samples/sec: 6521.88 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:41:54,527 epoch 9 - iter 240/242 - loss 0.35226423 - time (sec): 3.75 - samples/sec: 6533.72 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:41:54,558 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:54,559 EPOCH 9 done: loss 0.3530 - lr: 0.000003
2023-10-18 16:41:54,993 DEV : loss 0.2923411428928375 - f1-score (micro avg)  0.4627
2023-10-18 16:41:54,998 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:55,372 epoch 10 - iter 24/242 - loss 0.36790203 - time (sec): 0.37 - samples/sec: 6522.08 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:41:55,751 epoch 10 - iter 48/242 - loss 0.31762369 - time (sec): 0.75 - samples/sec: 6309.01 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:41:56,128 epoch 10 - iter 72/242 - loss 0.34190091 - time (sec): 1.13 - samples/sec: 6365.12 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:41:56,499 epoch 10 - iter 96/242 - loss 0.34999290 - time (sec): 1.50 - samples/sec: 6368.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:41:56,861 epoch 10 - iter 120/242 - loss 0.33463458 - time (sec): 1.86 - samples/sec: 6278.80 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:41:57,242 epoch 10 - iter 144/242 - loss 0.34136558 - time (sec): 2.24 - samples/sec: 6386.57 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:41:57,612 epoch 10 - iter 168/242 - loss 0.34804941 - time (sec): 2.61 - samples/sec: 6417.89 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:41:58,001 epoch 10 - iter 192/242 - loss 0.34878252 - time (sec): 3.00 - samples/sec: 6504.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:41:58,379 epoch 10 - iter 216/242 - loss 0.34552840 - time (sec): 3.38 - samples/sec: 6496.70 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:41:58,754 epoch 10 - iter 240/242 - loss 0.34464142 - time (sec): 3.76 - samples/sec: 6541.64 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:41:58,782 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:58,782 EPOCH 10 done: loss 0.3435 - lr: 0.000000
2023-10-18 16:41:59,219 DEV : loss 0.2907232642173767 - f1-score (micro avg)  0.4615
2023-10-18 16:41:59,254 ----------------------------------------------------------------------------------------------------
2023-10-18 16:41:59,255 Loading model from best epoch ...
2023-10-18 16:41:59,327 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 16:41:59,747
Results:
- F-score (micro) 0.4068
- F-score (macro) 0.189
- Accuracy 0.2682

By class:
              precision    recall  f1-score   support

       scope     0.3351    0.4884    0.3975       129
        pers     0.5159    0.5827    0.5473       139
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4138    0.4000    0.4068       360
   macro avg     0.1702    0.2142    0.1890       360
weighted avg     0.3193    0.4000    0.3537       360

2023-10-18 16:41:59,747 ----------------------------------------------------------------------------------------------------
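For reference, the size of this "tiny" encoder can be recomputed directly from the layer shapes printed in the model summary above. The sketch below is illustrative arithmetic only (the `linear`/`layer_norm` helpers are ad-hoc names, not Flair or PyTorch APIs); each `Linear` contributes weight plus bias, each `LayerNorm` a weight and bias vector:

```python
# Recompute the tagger's parameter count from the printed layer shapes.
def linear(n_in, n_out):
    # weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(n):
    # elementwise_affine=True: weight + bias
    return 2 * n

embeddings = (
    32001 * 128          # word_embeddings
    + 512 * 128          # position_embeddings
    + 2 * 128            # token_type_embeddings
    + layer_norm(128)
)

per_bert_layer = (
    3 * linear(128, 128)  # query, key, value
    + linear(128, 128)    # attention output dense
    + layer_norm(128)
    + linear(128, 512)    # intermediate
    + linear(512, 128)    # output dense
    + layer_norm(128)
)

total = (
    embeddings
    + 2 * per_bert_layer  # (0-1): 2 x BertLayer
    + linear(128, 128)    # pooler
    + linear(128, 25)     # tagging head over the 25 BIOES tags
)
print(total)  # 4578457 -> roughly 4.6M parameters
```

This explains why each epoch (242 mini-batches of 4) completes in under four seconds at ~6500 samples/sec.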
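The `lr` column follows the `LinearScheduler` plugin: with `warmup_fraction: '0.1'` over 10 × 242 = 2420 batches, the rate ramps linearly to the 3e-05 peak during epoch 1, then decays linearly to zero by the final batch. A minimal sketch of that schedule, assuming the standard linear warmup/decay formula rather than Flair's actual plugin internals (`lr_at` is a hypothetical helper):

```python
PEAK_LR = 3e-05
TOTAL_STEPS = 242 * 10                  # 242 mini-batches per epoch, 10 epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # warmup_fraction 0.1 -> 242 steps

def lr_at(step):
    """Linear warmup to PEAK_LR, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Reproduces the logged column to the printed precision:
# epoch 1, iter 24 -> 0.000003; end of epoch 1 -> 0.000030;
# epoch 2, iter 48 (step 290) -> 0.000029; last step -> 0.000000.
for step in (24, 242, 290, 2420):
    print(step, round(lr_at(step), 6))
```

Note the warmup window coincides exactly with epoch 1, which is why the peak rate appears only at the epoch 1/2 boundary.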
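The summary rows of the final table follow directly from the per-class rows: micro-F1 is the harmonic mean of the pooled precision and recall, macro-F1 is the unweighted mean of the five class F1 scores (dragged down by the zero scores on work, loc, and date), and the weighted average weights each class F1 by its support. A quick re-derivation from the table values (numbers copied from the log; the last digit of the weighted average can drift because the per-class F1s are already rounded):

```python
# (f1-score, support) per class, copied from the final evaluation table
per_class = {
    "scope": (0.3975, 129),
    "pers":  (0.5473, 139),
    "work":  (0.0000, 80),
    "loc":   (0.0000, 9),
    "date":  (0.0000, 3),
}

# micro avg: F1 from the pooled precision/recall over all 360 gold spans
micro_p, micro_r = 0.4138, 0.4000
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# macro avg: unweighted mean over the five classes
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# weighted avg: mean over classes, weighted by support
total_support = sum(s for _, s in per_class.values())
weighted_f1 = sum(f1 * s for f1, s in per_class.values()) / total_support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# micro -> 0.4068 and macro -> ~0.189, matching the logged scores
```

The gap between micro (0.4068) and macro (0.189) F1 is entirely explained by the three classes the best model never predicts correctly.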