2023-10-18 20:21:09,554 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Train: 7936 sentences
2023-10-18 20:21:09,555 (train_with_dev=False, train_with_test=False)
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Training Params:
2023-10-18 20:21:09,555 - learning_rate: "3e-05"
2023-10-18 20:21:09,555 - mini_batch_size: "4"
2023-10-18 20:21:09,555 - max_epochs: "10"
2023-10-18 20:21:09,555 - shuffle: "True"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Plugins:
2023-10-18 20:21:09,555 - TensorboardLogger
2023-10-18 20:21:09,555 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:21:09,555 - metric: "('micro avg', 'f1-score')"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Computation:
2023-10-18 20:21:09,555 - compute on device: cuda:0
2023-10-18 20:21:09,555 - embedding storage: none
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:21:13,160 epoch 1 - iter 198/1984 - loss 3.19239065 - time (sec): 3.60 - samples/sec: 4520.00 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:21:16,212 epoch 1 - iter 396/1984 - loss 2.81648826 - time (sec): 6.66 - samples/sec: 4879.36 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:21:19,275 epoch 1 - iter 594/1984 - loss 2.28737400 - time (sec): 9.72 - samples/sec: 5062.60 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:21:22,346 epoch 1 - iter 792/1984 - loss 1.86592392 - time (sec): 12.79 - samples/sec: 5140.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:21:25,305 epoch 1 - iter 990/1984 - loss 1.59866230 - time (sec): 15.75 - samples/sec: 5171.07 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:21:28,333 epoch 1 - iter 1188/1984 - loss 1.40641142 - time (sec): 18.78 - samples/sec: 5213.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:21:31,352 epoch 1 - iter 1386/1984 - loss 1.26714919 - time (sec): 21.80 - samples/sec: 5230.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:21:34,376 epoch 1 - iter 1584/1984 - loss 1.15707396 - time (sec): 24.82 - samples/sec: 5276.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:21:37,250 epoch 1 - iter 1782/1984 - loss 1.07044664 - time (sec): 27.69 - samples/sec: 5313.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:21:40,102 epoch 1 - iter 1980/1984 - loss 0.99619007 - time (sec): 30.55 - samples/sec: 5356.70 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:21:40,161 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:40,161 EPOCH 1 done: loss 0.9947 - lr: 0.000030
2023-10-18 20:21:41,606 DEV : loss 0.23625816404819489 - f1-score (micro avg) 0.2282
2023-10-18 20:21:41,623 saving best model
2023-10-18 20:21:41,659 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:44,713 epoch 2 - iter 198/1984 - loss 0.34878745 - time (sec): 3.05 - samples/sec: 5568.75 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:21:47,768 epoch 2 - iter 396/1984 - loss 0.32539556 - time (sec): 6.11 - samples/sec: 5539.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:51,158 epoch 2 - iter 594/1984 - loss 0.31675399 - time (sec): 9.50 - samples/sec: 5273.19 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:54,182 epoch 2 - iter 792/1984 - loss 0.30310641 - time (sec): 12.52 - samples/sec: 5269.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:57,250 epoch 2 - iter 990/1984 - loss 0.29671952 - time (sec): 15.59 - samples/sec: 5326.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:00,316 epoch 2 - iter 1188/1984 - loss 0.29371486 - time (sec): 18.66 - samples/sec: 5332.97 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:03,364 epoch 2 - iter 1386/1984 - loss 0.28559705 - time (sec): 21.70 - samples/sec: 5357.01 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:06,430 epoch 2 - iter 1584/1984 - loss 0.28450858 - time (sec): 24.77 - samples/sec: 5353.75 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:09,535 epoch 2 - iter 1782/1984 - loss 0.28203045 - time (sec): 27.87 - samples/sec: 5316.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:12,574 epoch 2 - iter 1980/1984 - loss 0.27933585 - time (sec): 30.91 - samples/sec: 5293.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:12,634 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:12,634 EPOCH 2 done: loss 0.2791 - lr: 0.000027
2023-10-18 20:22:14,454 DEV : loss 0.17796094715595245 - f1-score (micro avg) 0.3782
2023-10-18 20:22:14,472 saving best model
2023-10-18 20:22:14,505 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:17,635 epoch 3 - iter 198/1984 - loss 0.21732311 - time (sec): 3.13 - samples/sec: 5209.34 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:20,674 epoch 3 - iter 396/1984 - loss 0.21382431 - time (sec): 6.17 - samples/sec: 5318.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:23,712 epoch 3 - iter 594/1984 - loss 0.23635522 - time (sec): 9.21 - samples/sec: 5292.83 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:26,797 epoch 3 - iter 792/1984 - loss 0.23384899 - time (sec): 12.29 - samples/sec: 5342.49 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:29,630 epoch 3 - iter 990/1984 - loss 0.23319557 - time (sec): 15.12 - samples/sec: 5373.24 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:32,703 epoch 3 - iter 1188/1984 - loss 0.23261618 - time (sec): 18.20 - samples/sec: 5391.85 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:35,702 epoch 3 - iter 1386/1984 - loss 0.23393194 - time (sec): 21.20 - samples/sec: 5382.13 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:38,761 epoch 3 - iter 1584/1984 - loss 0.23335714 - time (sec): 24.26 - samples/sec: 5394.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:41,803 epoch 3 - iter 1782/1984 - loss 0.23015129 - time (sec): 27.30 - samples/sec: 5396.62 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:44,850 epoch 3 - iter 1980/1984 - loss 0.22849262 - time (sec): 30.34 - samples/sec: 5389.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:44,920 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:44,920 EPOCH 3 done: loss 0.2286 - lr: 0.000023
2023-10-18 20:22:46,719 DEV : loss 0.15816588699817657 - f1-score (micro avg) 0.446
2023-10-18 20:22:46,736 saving best model
2023-10-18 20:22:46,766 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:49,823 epoch 4 - iter 198/1984 - loss 0.21892155 - time (sec): 3.06 - samples/sec: 5353.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:52,844 epoch 4 - iter 396/1984 - loss 0.22023973 - time (sec): 6.08 - samples/sec: 5341.75 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:55,849 epoch 4 - iter 594/1984 - loss 0.21519154 - time (sec): 9.08 - samples/sec: 5280.36 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:22:58,885 epoch 4 - iter 792/1984 - loss 0.21938168 - time (sec): 12.12 - samples/sec: 5231.69 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:23:01,883 epoch 4 - iter 990/1984 - loss 0.21550684 - time (sec): 15.12 - samples/sec: 5252.24 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:23:04,896 epoch 4 - iter 1188/1984 - loss 0.20933788 - time (sec): 18.13 - samples/sec: 5284.31 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:07,904 epoch 4 - iter 1386/1984 - loss 0.20913435 - time (sec): 21.14 - samples/sec: 5366.26 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:10,953 epoch 4 - iter 1584/1984 - loss 0.20611639 - time (sec): 24.19 - samples/sec: 5381.99 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:13,935 epoch 4 - iter 1782/1984 - loss 0.20643644 - time (sec): 27.17 - samples/sec: 5373.80 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:16,968 epoch 4 - iter 1980/1984 - loss 0.20253333 - time (sec): 30.20 - samples/sec: 5418.56 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:17,030 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:17,030 EPOCH 4 done: loss 0.2025 - lr: 0.000020
2023-10-18 20:23:18,839 DEV : loss 0.15379515290260315 - f1-score (micro avg) 0.5326
2023-10-18 20:23:18,856 saving best model
2023-10-18 20:23:18,889 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:21,899 epoch 5 - iter 198/1984 - loss 0.23060341 - time (sec): 3.01 - samples/sec: 4955.51 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:25,011 epoch 5 - iter 396/1984 - loss 0.19610837 - time (sec): 6.12 - samples/sec: 5314.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:28,034 epoch 5 - iter 594/1984 - loss 0.19599719 - time (sec): 9.14 - samples/sec: 5362.28 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:31,070 epoch 5 - iter 792/1984 - loss 0.19167242 - time (sec): 12.18 - samples/sec: 5419.94 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:34,098 epoch 5 - iter 990/1984 - loss 0.18849708 - time (sec): 15.21 - samples/sec: 5363.90 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:37,129 epoch 5 - iter 1188/1984 - loss 0.18868575 - time (sec): 18.24 - samples/sec: 5392.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:40,183 epoch 5 - iter 1386/1984 - loss 0.18996251 - time (sec): 21.29 - samples/sec: 5399.95 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:43,230 epoch 5 - iter 1584/1984 - loss 0.18960056 - time (sec): 24.34 - samples/sec: 5407.03 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:46,294 epoch 5 - iter 1782/1984 - loss 0.18960480 - time (sec): 27.40 - samples/sec: 5392.96 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:49,245 epoch 5 - iter 1980/1984 - loss 0.18902609 - time (sec): 30.35 - samples/sec: 5390.97 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:49,308 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:49,308 EPOCH 5 done: loss 0.1890 - lr: 0.000017
2023-10-18 20:23:51,124 DEV : loss 0.14417240023612976 - f1-score (micro avg) 0.5654
2023-10-18 20:23:51,141 saving best model
2023-10-18 20:23:51,174 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:54,192 epoch 6 - iter 198/1984 - loss 0.20691944 - time (sec): 3.02 - samples/sec: 5108.94 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:23:57,214 epoch 6 - iter 396/1984 - loss 0.18917630 - time (sec): 6.04 - samples/sec: 5249.12 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:24:00,217 epoch 6 - iter 594/1984 - loss 0.18665542 - time (sec): 9.04 - samples/sec: 5247.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:24:03,320 epoch 6 - iter 792/1984 - loss 0.18917319 - time (sec): 12.15 - samples/sec: 5235.56 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:06,415 epoch 6 - iter 990/1984 - loss 0.18919754 - time (sec): 15.24 - samples/sec: 5275.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:09,474 epoch 6 - iter 1188/1984 - loss 0.18272315 - time (sec): 18.30 - samples/sec: 5320.83 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:12,446 epoch 6 - iter 1386/1984 - loss 0.17852437 - time (sec): 21.27 - samples/sec: 5378.84 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:15,144 epoch 6 - iter 1584/1984 - loss 0.17782946 - time (sec): 23.97 - samples/sec: 5458.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:18,117 epoch 6 - iter 1782/1984 - loss 0.17992881 - time (sec): 26.94 - samples/sec: 5465.89 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:21,064 epoch 6 - iter 1980/1984 - loss 0.17716357 - time (sec): 29.89 - samples/sec: 5475.25 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:21,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:21,123 EPOCH 6 done: loss 0.1771 - lr: 0.000013
2023-10-18 20:24:22,948 DEV : loss 0.14219804108142853 - f1-score (micro avg) 0.5755
2023-10-18 20:24:22,965 saving best model
2023-10-18 20:24:22,998 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:26,033 epoch 7 - iter 198/1984 - loss 0.21070221 - time (sec): 3.03 - samples/sec: 5279.32 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:28,996 epoch 7 - iter 396/1984 - loss 0.18898161 - time (sec): 6.00 - samples/sec: 5395.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:32,004 epoch 7 - iter 594/1984 - loss 0.18467547 - time (sec): 9.00 - samples/sec: 5381.83 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:35,059 epoch 7 - iter 792/1984 - loss 0.17499372 - time (sec): 12.06 - samples/sec: 5446.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:38,049 epoch 7 - iter 990/1984 - loss 0.17390168 - time (sec): 15.05 - samples/sec: 5481.36 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:41,030 epoch 7 - iter 1188/1984 - loss 0.17296841 - time (sec): 18.03 - samples/sec: 5453.01 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:44,215 epoch 7 - iter 1386/1984 - loss 0.16940126 - time (sec): 21.22 - samples/sec: 5422.00 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:47,216 epoch 7 - iter 1584/1984 - loss 0.16927956 - time (sec): 24.22 - samples/sec: 5404.10 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:50,282 epoch 7 - iter 1782/1984 - loss 0.16921141 - time (sec): 27.28 - samples/sec: 5385.26 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:24:53,363 epoch 7 - iter 1980/1984 - loss 0.16921911 - time (sec): 30.36 - samples/sec: 5393.77 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:24:53,427 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:53,427 EPOCH 7 done: loss 0.1692 - lr: 0.000010
2023-10-18 20:24:55,548 DEV : loss 0.1411110907793045 - f1-score (micro avg) 0.5873
2023-10-18 20:24:55,564 saving best model
2023-10-18 20:24:55,599 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:58,631 epoch 8 - iter 198/1984 - loss 0.16894551 - time (sec): 3.03 - samples/sec: 5308.14 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:25:01,643 epoch 8 - iter 396/1984 - loss 0.16387924 - time (sec): 6.04 - samples/sec: 5258.74 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:04,674 epoch 8 - iter 594/1984 - loss 0.16514882 - time (sec): 9.07 - samples/sec: 5202.23 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:07,774 epoch 8 - iter 792/1984 - loss 0.16780131 - time (sec): 12.18 - samples/sec: 5274.47 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:10,830 epoch 8 - iter 990/1984 - loss 0.16733526 - time (sec): 15.23 - samples/sec: 5256.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:13,904 epoch 8 - iter 1188/1984 - loss 0.16373840 - time (sec): 18.30 - samples/sec: 5340.86 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:16,887 epoch 8 - iter 1386/1984 - loss 0.16181785 - time (sec): 21.29 - samples/sec: 5329.93 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:19,959 epoch 8 - iter 1584/1984 - loss 0.16227133 - time (sec): 24.36 - samples/sec: 5355.25 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:23,008 epoch 8 - iter 1782/1984 - loss 0.16335469 - time (sec): 27.41 - samples/sec: 5361.81 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:26,049 epoch 8 - iter 1980/1984 - loss 0.16311903 - time (sec): 30.45 - samples/sec: 5376.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:26,105 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:26,105 EPOCH 8 done: loss 0.1630 - lr: 0.000007
2023-10-18 20:25:27,913 DEV : loss 0.14169389009475708 - f1-score (micro avg) 0.6026
2023-10-18 20:25:27,932 saving best model
2023-10-18 20:25:27,966 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:31,010 epoch 9 - iter 198/1984 - loss 0.14872512 - time (sec): 3.04 - samples/sec: 5336.13 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:34,053 epoch 9 - iter 396/1984 - loss 0.16471370 - time (sec): 6.09 - samples/sec: 5390.47 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:37,062 epoch 9 - iter 594/1984 - loss 0.16359773 - time (sec): 9.10 - samples/sec: 5307.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:40,148 epoch 9 - iter 792/1984 - loss 0.16213124 - time (sec): 12.18 - samples/sec: 5246.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:43,182 epoch 9 - iter 990/1984 - loss 0.16071329 - time (sec): 15.21 - samples/sec: 5310.46 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:46,211 epoch 9 - iter 1188/1984 - loss 0.16040075 - time (sec): 18.24 - samples/sec: 5307.42 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:49,259 epoch 9 - iter 1386/1984 - loss 0.15955613 - time (sec): 21.29 - samples/sec: 5367.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:52,294 epoch 9 - iter 1584/1984 - loss 0.16153404 - time (sec): 24.33 - samples/sec: 5350.28 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:55,337 epoch 9 - iter 1782/1984 - loss 0.16026350 - time (sec): 27.37 - samples/sec: 5356.94 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:58,420 epoch 9 - iter 1980/1984 - loss 0.16022030 - time (sec): 30.45 - samples/sec: 5375.18 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:25:58,483 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:58,483 EPOCH 9 done: loss 0.1602 - lr: 0.000003
2023-10-18 20:26:00,280 DEV : loss 0.14077164232730865 - f1-score (micro avg) 0.5999
2023-10-18 20:26:00,296 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:03,366 epoch 10 - iter 198/1984 - loss 0.15429908 - time (sec): 3.07 - samples/sec: 5117.20 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:26:06,590 epoch 10 - iter 396/1984 - loss 0.16153646 - time (sec): 6.29 - samples/sec: 5099.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:26:09,606 epoch 10 - iter 594/1984 - loss 0.16009665 - time (sec): 9.31 - samples/sec: 5150.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:12,646 epoch 10 - iter 792/1984 - loss 0.15349866 - time (sec): 12.35 - samples/sec: 5233.90 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:15,665 epoch 10 - iter 990/1984 - loss 0.15500561 - time (sec): 15.37 - samples/sec: 5272.80 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:18,672 epoch 10 - iter 1188/1984 - loss 0.15624539 - time (sec): 18.37 - samples/sec: 5268.47 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:21,752 epoch 10 - iter 1386/1984 - loss 0.15463573 - time (sec): 21.46 - samples/sec: 5328.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:24,848 epoch 10 - iter 1584/1984 - loss 0.15358641 - time (sec): 24.55 - samples/sec: 5300.75 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:27,969 epoch 10 - iter 1782/1984 - loss 0.15337495 - time (sec): 27.67 - samples/sec: 5306.87 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:26:30,832 epoch 10 - iter 1980/1984 - loss 0.15484121 - time (sec): 30.54 - samples/sec: 5357.87 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:26:30,891 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:30,891 EPOCH 10 done: loss 0.1550 - lr: 0.000000
2023-10-18 20:26:32,699 DEV : loss 0.13979580998420715 - f1-score (micro avg) 0.6034
2023-10-18 20:26:32,718 saving best model
2023-10-18 20:26:32,781 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:32,782 Loading model from best epoch ...
2023-10-18 20:26:32,869 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:26:34,366 Results:
- F-score (micro) 0.6068
- F-score (macro) 0.436
- Accuracy 0.4819

By class:
              precision    recall  f1-score   support

         LOC     0.7375    0.6992    0.7179       655
         PER     0.4011    0.6547    0.4974       223
         ORG     0.2917    0.0551    0.0927       127

   micro avg     0.6056    0.6080    0.6068      1005
   macro avg     0.4768    0.4697    0.4360      1005
weighted avg     0.6065    0.6080    0.5900      1005

2023-10-18 20:26:34,366 ----------------------------------------------------------------------------------------------------
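The lr column in the log follows the LinearScheduler plugin: a linear warmup over the first warmup_fraction (0.1) of the 10 × 1984 total steps up to the peak learning_rate of 3e-05, then a linear decay to zero. A minimal plain-Python sketch of that trajectory (the function name and exact step bookkeeping are illustrative; Flair's internal accounting may round slightly differently):

```python
def linear_warmup_lr(step: int, total_steps: int,
                     peak_lr: float = 3e-05,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero (sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: ramp from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: ramp from peak_lr back down to 0
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 1984  # 10 epochs x 1984 mini-batches

print(round(linear_warmup_lr(198, total), 6))   # 3e-06, as logged at epoch 1, iter 198
print(round(linear_warmup_lr(1980, total), 6))  # 3e-05, as logged at the end of epoch 1
print(round(linear_warmup_lr(2380, total), 6))  # 2.9e-05, as logged at epoch 2, iter 396
```

With warmup_fraction 0.1 and max_epochs 10, the warmup conveniently spans exactly the first epoch, which is why the logged lr climbs to 0.000030 by the end of epoch 1 and decays toward 0.000000 through epoch 10.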
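The final scores are internally consistent: micro F1 is the harmonic mean of the micro precision and recall, and macro F1 is the unweighted mean of the three per-class F1 scores. A quick check in plain Python, using the values from the "By class" table (the helper name is illustrative):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg: precision 0.6056, recall 0.6080 -> F1 0.6068
assert round(f1(0.6056, 0.6080), 4) == 0.6068

# macro F1: unweighted mean of the LOC, PER, ORG F1 scores -> 0.4360
assert round((0.7179 + 0.4974 + 0.0927) / 3, 4) == 0.4360
```

The spread between micro (0.6068) and macro (0.4360) F1 reflects the class imbalance: LOC (655 support) performs well while ORG (127 support) barely does, and micro averaging weights each prediction rather than each class.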
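The 13-entry tag dictionary reported before evaluation is the BIOES encoding of the three entity types (PER, LOC, ORG): the O tag plus one Single/Begin/End/Inside variant per type, which also explains the tagger's out_features=13 linear layer. A sketch of how such a tag set expands (the helper is hypothetical, not Flair API):

```python
def bioes_tags(entity_types):
    """Expand entity types into a BIOES tag set: O plus S/B/E/I per type."""
    tags = ["O"]
    for etype in entity_types:
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tags(["PER", "LOC", "ORG"])
assert len(tags) == 13  # matches out_features=13 in the model's linear layer
assert tags[:5] == ["O", "S-PER", "B-PER", "E-PER", "I-PER"]
```

The resulting list matches the dictionary order printed in the log: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG.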