2023-10-17 18:04:35,825 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Train:  5777 sentences
2023-10-17 18:04:35,826         (train_with_dev=False, train_with_test=False)
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Training Params:
2023-10-17 18:04:35,826  - learning_rate: "3e-05"
2023-10-17 18:04:35,826  - mini_batch_size: "8"
2023-10-17 18:04:35,826  - max_epochs: "10"
2023-10-17 18:04:35,826  - shuffle: "True"
2023-10-17 18:04:35,826 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,826 Plugins:
2023-10-17 18:04:35,826  - TensorboardLogger
2023-10-17 18:04:35,826  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:04:35,827  - metric: "('micro avg', 'f1-score')"
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Computation:
2023-10-17 18:04:35,827  - compute on device: cuda:0
2023-10-17 18:04:35,827  - embedding storage: none
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:35,827 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:04:40,869 epoch 1 - iter 72/723 - loss 3.14780265 - time (sec): 5.04 - samples/sec: 3212.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:04:46,354 epoch 1 - iter 144/723 - loss 1.91689024 - time (sec): 10.53 - samples/sec: 3248.38 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:04:51,606 epoch 1 - iter 216/723 - loss 1.35009273 - time (sec): 15.78 - samples/sec: 3276.04 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:04:56,946 epoch 1 - iter 288/723 - loss 1.05729540 - time (sec): 21.12 - samples/sec: 3274.92 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:05:02,237 epoch 1 - iter 360/723 - loss 0.87211949 - time (sec): 26.41 - samples/sec: 3299.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:05:07,703 epoch 1 - iter 432/723 - loss 0.74573597 - time (sec): 31.87 - samples/sec: 3312.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:05:12,988 epoch 1 - iter 504/723 - loss 0.65737284 - time (sec): 37.16 - samples/sec: 3318.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:05:18,380 epoch 1 - iter 576/723 - loss 0.59117923 - time (sec): 42.55 - samples/sec: 3318.60 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:05:24,033 epoch 1 - iter 648/723 - loss 0.53883556 - time (sec): 48.20 - samples/sec: 3299.42 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:05:28,712 epoch 1 - iter 720/723 - loss 0.49808664 - time (sec): 52.88 - samples/sec: 3321.46 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:05:28,893 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:28,894 EPOCH 1 done: loss 0.4968 - lr: 0.000030
2023-10-17 18:05:32,615 DEV : loss 0.10252858698368073 - f1-score (micro avg)  0.6435
2023-10-17 18:05:32,634 saving best model
2023-10-17 18:05:33,009 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:38,412 epoch 2 - iter 72/723 - loss 0.08824932 - time (sec): 5.40 - samples/sec: 3445.98 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:05:43,268 epoch 2 - iter 144/723 - loss 0.09284165 - time (sec): 10.26 - samples/sec: 3459.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:05:49,301 epoch 2 - iter 216/723 - loss 0.09672994 - time (sec): 16.29 - samples/sec: 3286.52 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:05:54,557 epoch 2 - iter 288/723 - loss 0.09200487 - time (sec): 21.55 - samples/sec: 3302.04 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:05:59,668 epoch 2 - iter 360/723 - loss 0.09466436 - time (sec): 26.66 - samples/sec: 3295.03 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:04,651 epoch 2 - iter 432/723 - loss 0.09512636 - time (sec): 31.64 - samples/sec: 3301.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:10,475 epoch 2 - iter 504/723 - loss 0.09272504 - time (sec): 37.47 - samples/sec: 3290.58 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:15,387 epoch 2 - iter 576/723 - loss 0.09030789 - time (sec): 42.38 - samples/sec: 3293.41 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:20,677 epoch 2 - iter 648/723 - loss 0.08892716 - time (sec): 47.67 - samples/sec: 3294.64 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:26,199 epoch 2 - iter 720/723 - loss 0.08670549 - time (sec): 53.19 - samples/sec: 3305.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:26,360 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:26,360 EPOCH 2 done: loss 0.0868 - lr: 0.000027
2023-10-17 18:06:29,657 DEV : loss 0.07472483813762665 - f1-score (micro avg)  0.8086
2023-10-17 18:06:29,676 saving best model
2023-10-17 18:06:30,161 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:35,368 epoch 3 - iter 72/723 - loss 0.05396348 - time (sec): 5.21 - samples/sec: 3359.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:40,571 epoch 3 - iter 144/723 - loss 0.05386152 - time (sec): 10.41 - samples/sec: 3348.20 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:45,548 epoch 3 - iter 216/723 - loss 0.05763223 - time (sec): 15.39 - samples/sec: 3351.57 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:50,644 epoch 3 - iter 288/723 - loss 0.05871276 - time (sec): 20.48 - samples/sec: 3338.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:06:56,207 epoch 3 - iter 360/723 - loss 0.05735120 - time (sec): 26.05 - samples/sec: 3312.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:07:02,101 epoch 3 - iter 432/723 - loss 0.05744013 - time (sec): 31.94 - samples/sec: 3320.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:07:07,544 epoch 3 - iter 504/723 - loss 0.05703734 - time (sec): 37.38 - samples/sec: 3329.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:07:12,605 epoch 3 - iter 576/723 - loss 0.05670689 - time (sec): 42.44 - samples/sec: 3338.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:07:17,846 epoch 3 - iter 648/723 - loss 0.05681013 - time (sec): 47.68 - samples/sec: 3330.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:07:23,059 epoch 3 - iter 720/723 - loss 0.05774106 - time (sec): 52.90 - samples/sec: 3317.61 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:07:23,314 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:23,314 EPOCH 3 done: loss 0.0578 - lr: 0.000023
2023-10-17 18:07:27,112 DEV : loss 0.05712844431400299 - f1-score (micro avg)  0.8812
2023-10-17 18:07:27,129 saving best model
2023-10-17 18:07:27,652 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:32,710 epoch 4 - iter 72/723 - loss 0.03483535 - time (sec): 5.06 - samples/sec: 3414.70 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:07:37,907 epoch 4 - iter 144/723 - loss 0.03934497 - time (sec): 10.25 - samples/sec: 3374.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:07:43,328 epoch 4 - iter 216/723 - loss 0.03886039 - time (sec): 15.67 - samples/sec: 3339.03 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:07:48,534 epoch 4 - iter 288/723 - loss 0.03985048 - time (sec): 20.88 - samples/sec: 3333.40 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:07:53,639 epoch 4 - iter 360/723 - loss 0.03887291 - time (sec): 25.98 - samples/sec: 3306.91 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:07:59,171 epoch 4 - iter 432/723 - loss 0.04024768 - time (sec): 31.52 - samples/sec: 3306.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:04,658 epoch 4 - iter 504/723 - loss 0.04285831 - time (sec): 37.00 - samples/sec: 3311.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:09,714 epoch 4 - iter 576/723 - loss 0.04317139 - time (sec): 42.06 - samples/sec: 3314.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:14,986 epoch 4 - iter 648/723 - loss 0.04227368 - time (sec): 47.33 - samples/sec: 3321.28 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:08:20,439 epoch 4 - iter 720/723 - loss 0.04245703 - time (sec): 52.78 - samples/sec: 3326.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:08:20,628 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,628 EPOCH 4 done: loss 0.0424 - lr: 0.000020
2023-10-17 18:08:23,988 DEV : loss 0.06662245094776154 - f1-score (micro avg)  0.8791
2023-10-17 18:08:24,007 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:29,324 epoch 5 - iter 72/723 - loss 0.01990662 - time (sec): 5.32 - samples/sec: 3344.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:08:35,095 epoch 5 - iter 144/723 - loss 0.02279437 - time (sec): 11.09 - samples/sec: 3189.92 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:08:40,075 epoch 5 - iter 216/723 - loss 0.02479149 - time (sec): 16.07 - samples/sec: 3231.66 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:08:45,591 epoch 5 - iter 288/723 - loss 0.02719107 - time (sec): 21.58 - samples/sec: 3262.26 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:08:50,464 epoch 5 - iter 360/723 - loss 0.02575671 - time (sec): 26.46 - samples/sec: 3295.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:08:55,631 epoch 5 - iter 432/723 - loss 0.02771082 - time (sec): 31.62 - samples/sec: 3302.82 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:00,979 epoch 5 - iter 504/723 - loss 0.02956349 - time (sec): 36.97 - samples/sec: 3277.40 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:06,490 epoch 5 - iter 576/723 - loss 0.03084776 - time (sec): 42.48 - samples/sec: 3270.69 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:11,875 epoch 5 - iter 648/723 - loss 0.03111872 - time (sec): 47.87 - samples/sec: 3273.52 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:17,582 epoch 5 - iter 720/723 - loss 0.03213599 - time (sec): 53.57 - samples/sec: 3278.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:17,791 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:17,791 EPOCH 5 done: loss 0.0322 - lr: 0.000017
2023-10-17 18:09:21,121 DEV : loss 0.10735854506492615 - f1-score (micro avg)  0.852
2023-10-17 18:09:21,139 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:26,445 epoch 6 - iter 72/723 - loss 0.01459516 - time (sec): 5.30 - samples/sec: 3243.72 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:31,377 epoch 6 - iter 144/723 - loss 0.01916253 - time (sec): 10.24 - samples/sec: 3349.32 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:36,924 epoch 6 - iter 216/723 - loss 0.02024065 - time (sec): 15.78 - samples/sec: 3321.70 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:41,719 epoch 6 - iter 288/723 - loss 0.01952469 - time (sec): 20.58 - samples/sec: 3309.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:09:47,484 epoch 6 - iter 360/723 - loss 0.02084252 - time (sec): 26.34 - samples/sec: 3323.30 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:09:52,279 epoch 6 - iter 432/723 - loss 0.02140543 - time (sec): 31.14 - samples/sec: 3382.54 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:09:57,342 epoch 6 - iter 504/723 - loss 0.02072648 - time (sec): 36.20 - samples/sec: 3382.64 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:02,892 epoch 6 - iter 576/723 - loss 0.02209234 - time (sec): 41.75 - samples/sec: 3391.15 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:07,864 epoch 6 - iter 648/723 - loss 0.02271147 - time (sec): 46.72 - samples/sec: 3387.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:13,045 epoch 6 - iter 720/723 - loss 0.02324330 - time (sec): 51.91 - samples/sec: 3386.73 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:13,187 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:13,187 EPOCH 6 done: loss 0.0233 - lr: 0.000013
2023-10-17 18:10:16,951 DEV : loss 0.09898053109645844 - f1-score (micro avg)  0.8787
2023-10-17 18:10:16,973 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:22,091 epoch 7 - iter 72/723 - loss 0.01179340 - time (sec): 5.12 - samples/sec: 3273.92 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:27,430 epoch 7 - iter 144/723 - loss 0.01110712 - time (sec): 10.46 - samples/sec: 3283.64 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:32,925 epoch 7 - iter 216/723 - loss 0.01314864 - time (sec): 15.95 - samples/sec: 3319.64 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:38,240 epoch 7 - iter 288/723 - loss 0.01571089 - time (sec): 21.27 - samples/sec: 3304.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:43,718 epoch 7 - iter 360/723 - loss 0.01630848 - time (sec): 26.74 - samples/sec: 3311.36 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:49,005 epoch 7 - iter 432/723 - loss 0.01910391 - time (sec): 32.03 - samples/sec: 3309.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:54,705 epoch 7 - iter 504/723 - loss 0.01877318 - time (sec): 37.73 - samples/sec: 3310.78 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:59,635 epoch 7 - iter 576/723 - loss 0.01811978 - time (sec): 42.66 - samples/sec: 3315.06 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:11:04,812 epoch 7 - iter 648/723 - loss 0.01804125 - time (sec): 47.84 - samples/sec: 3320.49 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:10,018 epoch 7 - iter 720/723 - loss 0.01810548 - time (sec): 53.04 - samples/sec: 3312.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:10,198 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:10,199 EPOCH 7 done: loss 0.0181 - lr: 0.000010
2023-10-17 18:11:13,489 DEV : loss 0.112530916929245 - f1-score (micro avg)  0.8757
2023-10-17 18:11:13,508 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:18,960 epoch 8 - iter 72/723 - loss 0.00982102 - time (sec): 5.45 - samples/sec: 3354.45 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:24,206 epoch 8 - iter 144/723 - loss 0.01111392 - time (sec): 10.70 - samples/sec: 3381.94 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:11:29,275 epoch 8 - iter 216/723 - loss 0.01252694 - time (sec): 15.77 - samples/sec: 3393.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:11:34,521 epoch 8 - iter 288/723 - loss 0.01206236 - time (sec): 21.01 - samples/sec: 3334.63 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:11:40,082 epoch 8 - iter 360/723 - loss 0.01224851 - time (sec): 26.57 - samples/sec: 3331.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:11:45,118 epoch 8 - iter 432/723 - loss 0.01187705 - time (sec): 31.61 - samples/sec: 3354.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:11:50,453 epoch 8 - iter 504/723 - loss 0.01236789 - time (sec): 36.94 - samples/sec: 3310.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:11:55,804 epoch 8 - iter 576/723 - loss 0.01200156 - time (sec): 42.30 - samples/sec: 3306.51 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:12:01,328 epoch 8 - iter 648/723 - loss 0.01252768 - time (sec): 47.82 - samples/sec: 3309.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:12:06,499 epoch 8 - iter 720/723 - loss 0.01258916 - time (sec): 52.99 - samples/sec: 3318.22 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:12:06,648 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:06,648 EPOCH 8 done: loss 0.0126 - lr: 0.000007
2023-10-17 18:12:10,019 DEV : loss 0.13107195496559143 - f1-score (micro avg)  0.8717
2023-10-17 18:12:10,042 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:15,640 epoch 9 - iter 72/723 - loss 0.00761981 - time (sec): 5.60 - samples/sec: 3132.39 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:12:20,962 epoch 9 - iter 144/723 - loss 0.00858213 - time (sec): 10.92 - samples/sec: 3216.86 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:12:26,242 epoch 9 - iter 216/723 - loss 0.00853157 - time (sec): 16.20 - samples/sec: 3295.29 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:12:31,518 epoch 9 - iter 288/723 - loss 0.00876791 - time (sec): 21.47 - samples/sec: 3321.23 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:12:36,672 epoch 9 - iter 360/723 - loss 0.00859438 - time (sec): 26.63 - samples/sec: 3294.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:12:41,859 epoch 9 - iter 432/723 - loss 0.00864313 - time (sec): 31.82 - samples/sec: 3318.31 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:12:47,352 epoch 9 - iter 504/723 - loss 0.00850276 - time (sec): 37.31 - samples/sec: 3303.20 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:12:52,892 epoch 9 - iter 576/723 - loss 0.00950137 - time (sec): 42.85 - samples/sec: 3299.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:12:57,736 epoch 9 - iter 648/723 - loss 0.00908151 - time (sec): 47.69 - samples/sec: 3312.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:13:03,216 epoch 9 - iter 720/723 - loss 0.01015906 - time (sec): 53.17 - samples/sec: 3303.07 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:13:03,433 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:03,433 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-17 18:13:07,279 DEV : loss 0.13919697701931 - f1-score (micro avg)  0.877
2023-10-17 18:13:07,296 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:12,754 epoch 10 - iter 72/723 - loss 0.01328488 - time (sec): 5.46 - samples/sec: 3397.73 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:13:18,228 epoch 10 - iter 144/723 - loss 0.00911430 - time (sec): 10.93 - samples/sec: 3247.20 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:13:23,492 epoch 10 - iter 216/723 - loss 0.00839632 - time (sec): 16.19 - samples/sec: 3258.25 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:13:28,454 epoch 10 - iter 288/723 - loss 0.00791484 - time (sec): 21.16 - samples/sec: 3301.29 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:13:34,083 epoch 10 - iter 360/723 - loss 0.00921151 - time (sec): 26.79 - samples/sec: 3299.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:13:39,827 epoch 10 - iter 432/723 - loss 0.00866624 - time (sec): 32.53 - samples/sec: 3286.87 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:13:44,998 epoch 10 - iter 504/723 - loss 0.00879169 - time (sec): 37.70 - samples/sec: 3293.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:13:49,760 epoch 10 - iter 576/723 - loss 0.00813104 - time (sec): 42.46 - samples/sec: 3326.77 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:13:55,140 epoch 10 - iter 648/723 - loss 0.00845184 - time (sec): 47.84 - samples/sec: 3334.43 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:14:00,144 epoch 10 - iter 720/723 - loss 0.00785129 - time (sec): 52.85 - samples/sec: 3320.66 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:14:00,361 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:00,361 EPOCH 10 done: loss 0.0078 - lr: 0.000000
2023-10-17 18:14:03,744 DEV : loss 0.13295623660087585 - f1-score (micro avg)  0.8843
2023-10-17 18:14:03,761 saving best model
2023-10-17 18:14:04,705 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:04,706 Loading model from best epoch ...
2023-10-17 18:14:06,078 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 18:14:09,142 Results:
- F-score (micro) 0.8753
- F-score (macro) 0.7737
- Accuracy 0.7867

By class:
              precision    recall  f1-score   support

         PER     0.8809    0.8589    0.8697       482
         LOC     0.9342    0.9301    0.9322       458
         ORG     0.5484    0.4928    0.5191        69

   micro avg     0.8846    0.8662    0.8753      1009
   macro avg     0.7878    0.7606    0.7737      1009
weighted avg     0.8823    0.8662    0.8741      1009

2023-10-17 18:14:09,142 ----------------------------------------------------------------------------------------------------
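As a sanity check on the final evaluation: the micro-averaged F1 reported above is the harmonic mean of the micro-averaged precision and recall, while the macro F1 is the unweighted mean of the per-class F1 scores (PER, LOC, ORG). A minimal sketch, using only the numbers from the results table:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro avg precision/recall from the final test evaluation.
micro_f1 = f1(0.8846, 0.8662)
print(round(micro_f1, 4))  # -> 0.8753, matching "F-score (micro)"

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = (0.8697 + 0.9322 + 0.5191) / 3
print(round(macro_f1, 4))  # -> 0.7737, matching "F-score (macro)"
```

The gap between micro (0.8753) and macro (0.7737) F1 reflects the weak ORG class (F1 0.5191, only 69 support), which drags the unweighted average down while barely affecting the support-weighted micro score.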