2023-10-25 13:04:00,403 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,403 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Train: 6183 sentences
2023-10-25 13:04:00,404 (train_with_dev=False, train_with_test=False)
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Training Params:
2023-10-25 13:04:00,404 - learning_rate: "5e-05"
2023-10-25 13:04:00,404 - mini_batch_size: "8"
2023-10-25 13:04:00,404 - max_epochs: "10"
2023-10-25 13:04:00,404 - shuffle: "True"
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Plugins:
2023-10-25 13:04:00,404 - TensorboardLogger
2023-10-25 13:04:00,404 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 13:04:00,405 - metric: "('micro avg', 'f1-score')"
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 Computation:
2023-10-25 13:04:00,405 - compute on device: cuda:0
2023-10-25 13:04:00,405 - embedding storage: none
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 13:04:04,871 epoch 1 - iter 77/773 - loss 1.62607859 - time (sec): 4.47 - samples/sec: 2919.55 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:09,448 epoch 1 - iter 154/773 - loss 0.94210245 - time (sec): 9.04 - samples/sec: 2835.22 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:04:14,272 epoch 1 - iter 231/773 - loss 0.69033315 - time (sec): 13.87 - samples/sec: 2712.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 13:04:19,189 epoch 1 - iter 308/773 - loss 0.55795676 - time (sec): 18.78 - samples/sec: 2638.02 - lr: 0.000020 - momentum: 0.000000
2023-10-25 13:04:23,842 epoch 1 - iter 385/773 - loss 0.47173138 - time (sec): 23.44 - samples/sec: 2617.54 - lr: 0.000025 - momentum: 0.000000
2023-10-25 13:04:28,472 epoch 1 - iter 462/773 - loss 0.41386971 - time (sec): 28.07 - samples/sec: 2627.27 - lr: 0.000030 - momentum: 0.000000
2023-10-25 13:04:32,979 epoch 1 - iter 539/773 - loss 0.37054284 - time (sec): 32.57 - samples/sec: 2625.88 - lr: 0.000035 - momentum: 0.000000
2023-10-25 13:04:37,466 epoch 1 - iter 616/773 - loss 0.33663528 - time (sec): 37.06 - samples/sec: 2642.50 - lr: 0.000040 - momentum: 0.000000
2023-10-25 13:04:41,752 epoch 1 - iter 693/773 - loss 0.30816500 - time (sec): 41.35 - samples/sec: 2678.58 - lr: 0.000045 - momentum: 0.000000
2023-10-25 13:04:46,095 epoch 1 - iter 770/773 - loss 0.28446225 - time (sec): 45.69 - samples/sec: 2712.76 - lr: 0.000050 - momentum: 0.000000
2023-10-25 13:04:46,249 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:46,249 EPOCH 1 done: loss 0.2838 - lr: 0.000050
2023-10-25 13:04:49,502 DEV : loss 0.05165766924619675 - f1-score (micro avg) 0.7323
2023-10-25 13:04:49,522 saving best model
2023-10-25 13:04:50,070 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:54,732 epoch 2 - iter 77/773 - loss 0.10202012 - time (sec): 4.66 - samples/sec: 2459.39 - lr: 0.000049 - momentum: 0.000000
2023-10-25 13:04:59,242 epoch 2 - iter 154/773 - loss 0.08419062 - time (sec): 9.17 - samples/sec: 2528.27 - lr: 0.000049 - momentum: 0.000000
2023-10-25 13:05:03,731 epoch 2 - iter 231/773 - loss 0.08259793 - time (sec): 13.66 - samples/sec: 2568.79 - lr: 0.000048 - momentum: 0.000000
2023-10-25 13:05:08,253 epoch 2 - iter 308/773 - loss 0.08289606 - time (sec): 18.18 - samples/sec: 2638.40 - lr: 0.000048 - momentum: 0.000000
2023-10-25 13:05:12,813 epoch 2 - iter 385/773 - loss 0.08040769 - time (sec): 22.74 - samples/sec: 2694.06 - lr: 0.000047 - momentum: 0.000000
2023-10-25 13:05:17,136 epoch 2 - iter 462/773 - loss 0.07924733 - time (sec): 27.06 - samples/sec: 2716.80 - lr: 0.000047 - momentum: 0.000000
2023-10-25 13:05:21,425 epoch 2 - iter 539/773 - loss 0.07828472 - time (sec): 31.35 - samples/sec: 2778.47 - lr: 0.000046 - momentum: 0.000000
2023-10-25 13:05:25,710 epoch 2 - iter 616/773 - loss 0.07838880 - time (sec): 35.64 - samples/sec: 2786.75 - lr: 0.000046 - momentum: 0.000000
2023-10-25 13:05:29,957 epoch 2 - iter 693/773 - loss 0.07907713 - time (sec): 39.89 - samples/sec: 2791.31 - lr: 0.000045 - momentum: 0.000000
2023-10-25 13:05:34,285 epoch 2 - iter 770/773 - loss 0.07672251 - time (sec): 44.21 - samples/sec: 2801.22 - lr: 0.000044 - momentum: 0.000000
2023-10-25 13:05:34,454 ----------------------------------------------------------------------------------------------------
2023-10-25 13:05:34,455 EPOCH 2 done: loss 0.0767 - lr: 0.000044
2023-10-25 13:05:37,402 DEV : loss 0.05662866681814194 - f1-score (micro avg) 0.7628
2023-10-25 13:05:37,419 saving best model
2023-10-25 13:05:38,170 ----------------------------------------------------------------------------------------------------
2023-10-25 13:05:43,368 epoch 3 - iter 77/773 - loss 0.04830238 - time (sec): 5.19 - samples/sec: 2322.43 - lr: 0.000044 - momentum: 0.000000
2023-10-25 13:05:47,869 epoch 3 - iter 154/773 - loss 0.05073064 - time (sec): 9.70 - samples/sec: 2497.06 - lr: 0.000043 - momentum: 0.000000
2023-10-25 13:05:52,586 epoch 3 - iter 231/773 - loss 0.05345361 - time (sec): 14.41 - samples/sec: 2634.94 - lr: 0.000043 - momentum: 0.000000
2023-10-25 13:05:57,105 epoch 3 - iter 308/773 - loss 0.05233532 - time (sec): 18.93 - samples/sec: 2619.65 - lr: 0.000042 - momentum: 0.000000
2023-10-25 13:06:01,765 epoch 3 - iter 385/773 - loss 0.05274950 - time (sec): 23.59 - samples/sec: 2627.87 - lr: 0.000042 - momentum: 0.000000
2023-10-25 13:06:06,320 epoch 3 - iter 462/773 - loss 0.05328798 - time (sec): 28.15 - samples/sec: 2619.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 13:06:10,830 epoch 3 - iter 539/773 - loss 0.05175706 - time (sec): 32.66 - samples/sec: 2640.92 - lr: 0.000041 - momentum: 0.000000
2023-10-25 13:06:15,426 epoch 3 - iter 616/773 - loss 0.05144252 - time (sec): 37.25 - samples/sec: 2655.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 13:06:20,100 epoch 3 - iter 693/773 - loss 0.05097205 - time (sec): 41.93 - samples/sec: 2651.47 - lr: 0.000039 - momentum: 0.000000
2023-10-25 13:06:24,629 epoch 3 - iter 770/773 - loss 0.05141460 - time (sec): 46.46 - samples/sec: 2667.56 - lr: 0.000039 - momentum: 0.000000
2023-10-25 13:06:24,796 ----------------------------------------------------------------------------------------------------
2023-10-25 13:06:24,797 EPOCH 3 done: loss 0.0514 - lr: 0.000039
2023-10-25 13:06:27,708 DEV : loss 0.08030106127262115 - f1-score (micro avg) 0.7182
2023-10-25 13:06:27,729 ----------------------------------------------------------------------------------------------------
2023-10-25 13:06:32,291 epoch 4 - iter 77/773 - loss 0.03816760 - time (sec): 4.56 - samples/sec: 2653.55 - lr: 0.000038 - momentum: 0.000000
2023-10-25 13:06:37,003 epoch 4 - iter 154/773 - loss 0.03377926 - time (sec): 9.27 - samples/sec: 2620.46 - lr: 0.000038 - momentum: 0.000000
2023-10-25 13:06:41,801 epoch 4 - iter 231/773 - loss 0.03420877 - time (sec): 14.07 - samples/sec: 2623.09 - lr: 0.000037 - momentum: 0.000000
2023-10-25 13:06:46,421 epoch 4 - iter 308/773 - loss 0.03420744 - time (sec): 18.69 - samples/sec: 2623.52 - lr: 0.000037 - momentum: 0.000000
2023-10-25 13:06:50,969 epoch 4 - iter 385/773 - loss 0.03441451 - time (sec): 23.24 - samples/sec: 2597.91 - lr: 0.000036 - momentum: 0.000000
2023-10-25 13:06:55,487 epoch 4 - iter 462/773 - loss 0.03487924 - time (sec): 27.76 - samples/sec: 2589.27 - lr: 0.000036 - momentum: 0.000000
2023-10-25 13:07:00,136 epoch 4 - iter 539/773 - loss 0.03545083 - time (sec): 32.41 - samples/sec: 2615.60 - lr: 0.000035 - momentum: 0.000000
2023-10-25 13:07:04,532 epoch 4 - iter 616/773 - loss 0.03574374 - time (sec): 36.80 - samples/sec: 2654.64 - lr: 0.000034 - momentum: 0.000000
2023-10-25 13:07:09,155 epoch 4 - iter 693/773 - loss 0.03593766 - time (sec): 41.42 - samples/sec: 2684.99 - lr: 0.000034 - momentum: 0.000000
2023-10-25 13:07:13,897 epoch 4 - iter 770/773 - loss 0.03530953 - time (sec): 46.17 - samples/sec: 2684.07 - lr: 0.000033 - momentum: 0.000000
2023-10-25 13:07:14,074 ----------------------------------------------------------------------------------------------------
2023-10-25 13:07:14,075 EPOCH 4 done: loss 0.0354 - lr: 0.000033
2023-10-25 13:07:16,611 DEV : loss 0.08339047431945801 - f1-score (micro avg) 0.7649
2023-10-25 13:07:16,630 saving best model
2023-10-25 13:07:17,281 ----------------------------------------------------------------------------------------------------
2023-10-25 13:07:22,250 epoch 5 - iter 77/773 - loss 0.01949160 - time (sec): 4.97 - samples/sec: 2636.17 - lr: 0.000033 - momentum: 0.000000
2023-10-25 13:07:27,068 epoch 5 - iter 154/773 - loss 0.01928080 - time (sec): 9.78 - samples/sec: 2553.53 - lr: 0.000032 - momentum: 0.000000
2023-10-25 13:07:31,973 epoch 5 - iter 231/773 - loss 0.02144756 - time (sec): 14.69 - samples/sec: 2503.66 - lr: 0.000032 - momentum: 0.000000
2023-10-25 13:07:36,776 epoch 5 - iter 308/773 - loss 0.02257405 - time (sec): 19.49 - samples/sec: 2478.57 - lr: 0.000031 - momentum: 0.000000
2023-10-25 13:07:41,608 epoch 5 - iter 385/773 - loss 0.02362110 - time (sec): 24.32 - samples/sec: 2520.77 - lr: 0.000031 - momentum: 0.000000
2023-10-25 13:07:46,203 epoch 5 - iter 462/773 - loss 0.02401721 - time (sec): 28.92 - samples/sec: 2518.69 - lr: 0.000030 - momentum: 0.000000
2023-10-25 13:07:50,848 epoch 5 - iter 539/773 - loss 0.02564729 - time (sec): 33.56 - samples/sec: 2527.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 13:07:55,431 epoch 5 - iter 616/773 - loss 0.02576209 - time (sec): 38.15 - samples/sec: 2572.76 - lr: 0.000029 - momentum: 0.000000
2023-10-25 13:07:59,836 epoch 5 - iter 693/773 - loss 0.02549268 - time (sec): 42.55 - samples/sec: 2611.03 - lr: 0.000028 - momentum: 0.000000
2023-10-25 13:08:04,104 epoch 5 - iter 770/773 - loss 0.02500032 - time (sec): 46.82 - samples/sec: 2646.22 - lr: 0.000028 - momentum: 0.000000
2023-10-25 13:08:04,263 ----------------------------------------------------------------------------------------------------
2023-10-25 13:08:04,263 EPOCH 5 done: loss 0.0250 - lr: 0.000028
2023-10-25 13:08:06,931 DEV : loss 0.10233461856842041 - f1-score (micro avg) 0.7579
2023-10-25 13:08:06,952 ----------------------------------------------------------------------------------------------------
2023-10-25 13:08:11,484 epoch 6 - iter 77/773 - loss 0.02403484 - time (sec): 4.53 - samples/sec: 2724.96 - lr: 0.000027 - momentum: 0.000000
2023-10-25 13:08:16,103 epoch 6 - iter 154/773 - loss 0.01933766 - time (sec): 9.15 - samples/sec: 2749.34 - lr: 0.000027 - momentum: 0.000000
2023-10-25 13:08:20,644 epoch 6 - iter 231/773 - loss 0.01675474 - time (sec): 13.69 - samples/sec: 2703.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 13:08:25,139 epoch 6 - iter 308/773 - loss 0.01770050 - time (sec): 18.18 - samples/sec: 2730.90 - lr: 0.000026 - momentum: 0.000000
2023-10-25 13:08:29,794 epoch 6 - iter 385/773 - loss 0.01697379 - time (sec): 22.84 - samples/sec: 2721.83 - lr: 0.000025 - momentum: 0.000000
2023-10-25 13:08:34,492 epoch 6 - iter 462/773 - loss 0.01647189 - time (sec): 27.54 - samples/sec: 2720.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 13:08:39,812 epoch 6 - iter 539/773 - loss 0.01695804 - time (sec): 32.86 - samples/sec: 2653.54 - lr: 0.000024 - momentum: 0.000000
2023-10-25 13:08:44,380 epoch 6 - iter 616/773 - loss 0.01663896 - time (sec): 37.43 - samples/sec: 2653.24 - lr: 0.000023 - momentum: 0.000000
2023-10-25 13:08:48,904 epoch 6 - iter 693/773 - loss 0.01722230 - time (sec): 41.95 - samples/sec: 2662.44 - lr: 0.000023 - momentum: 0.000000
2023-10-25 13:08:53,262 epoch 6 - iter 770/773 - loss 0.01642065 - time (sec): 46.31 - samples/sec: 2675.35 - lr: 0.000022 - momentum: 0.000000
2023-10-25 13:08:53,417 ----------------------------------------------------------------------------------------------------
2023-10-25 13:08:53,417 EPOCH 6 done: loss 0.0164 - lr: 0.000022
2023-10-25 13:08:56,297 DEV : loss 0.10584240406751633 - f1-score (micro avg) 0.7992
2023-10-25 13:08:56,320 saving best model
2023-10-25 13:08:56,974 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:01,620 epoch 7 - iter 77/773 - loss 0.01534673 - time (sec): 4.64 - samples/sec: 2796.75 - lr: 0.000022 - momentum: 0.000000
2023-10-25 13:09:06,285 epoch 7 - iter 154/773 - loss 0.01242129 - time (sec): 9.31 - samples/sec: 2688.82 - lr: 0.000021 - momentum: 0.000000
2023-10-25 13:09:11,077 epoch 7 - iter 231/773 - loss 0.01054806 - time (sec): 14.10 - samples/sec: 2681.47 - lr: 0.000021 - momentum: 0.000000
2023-10-25 13:09:15,836 epoch 7 - iter 308/773 - loss 0.01064088 - time (sec): 18.86 - samples/sec: 2673.72 - lr: 0.000020 - momentum: 0.000000
2023-10-25 13:09:20,462 epoch 7 - iter 385/773 - loss 0.01089602 - time (sec): 23.48 - samples/sec: 2643.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 13:09:25,076 epoch 7 - iter 462/773 - loss 0.01071434 - time (sec): 28.10 - samples/sec: 2651.83 - lr: 0.000019 - momentum: 0.000000
2023-10-25 13:09:29,777 epoch 7 - iter 539/773 - loss 0.01088053 - time (sec): 32.80 - samples/sec: 2648.61 - lr: 0.000018 - momentum: 0.000000
2023-10-25 13:09:34,667 epoch 7 - iter 616/773 - loss 0.01092188 - time (sec): 37.69 - samples/sec: 2627.23 - lr: 0.000018 - momentum: 0.000000
2023-10-25 13:09:39,693 epoch 7 - iter 693/773 - loss 0.01146992 - time (sec): 42.72 - samples/sec: 2633.32 - lr: 0.000017 - momentum: 0.000000
2023-10-25 13:09:44,498 epoch 7 - iter 770/773 - loss 0.01173852 - time (sec): 47.52 - samples/sec: 2608.77 - lr: 0.000017 - momentum: 0.000000
2023-10-25 13:09:44,678 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:44,678 EPOCH 7 done: loss 0.0117 - lr: 0.000017
2023-10-25 13:09:47,214 DEV : loss 0.11367938667535782 - f1-score (micro avg) 0.7702
2023-10-25 13:09:47,233 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:52,081 epoch 8 - iter 77/773 - loss 0.00915681 - time (sec): 4.85 - samples/sec: 2566.64 - lr: 0.000016 - momentum: 0.000000
2023-10-25 13:09:56,977 epoch 8 - iter 154/773 - loss 0.00927730 - time (sec): 9.74 - samples/sec: 2495.94 - lr: 0.000016 - momentum: 0.000000
2023-10-25 13:10:01,947 epoch 8 - iter 231/773 - loss 0.00889633 - time (sec): 14.71 - samples/sec: 2492.37 - lr: 0.000015 - momentum: 0.000000
2023-10-25 13:10:06,837 epoch 8 - iter 308/773 - loss 0.00755093 - time (sec): 19.60 - samples/sec: 2506.99 - lr: 0.000014 - momentum: 0.000000
2023-10-25 13:10:11,651 epoch 8 - iter 385/773 - loss 0.00800710 - time (sec): 24.42 - samples/sec: 2523.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 13:10:16,478 epoch 8 - iter 462/773 - loss 0.00799662 - time (sec): 29.24 - samples/sec: 2517.18 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:10:21,349 epoch 8 - iter 539/773 - loss 0.00766083 - time (sec): 34.11 - samples/sec: 2569.14 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:10:26,302 epoch 8 - iter 616/773 - loss 0.00763491 - time (sec): 39.07 - samples/sec: 2560.31 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:10:31,080 epoch 8 - iter 693/773 - loss 0.00766449 - time (sec): 43.84 - samples/sec: 2546.17 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:10:35,826 epoch 8 - iter 770/773 - loss 0.00742265 - time (sec): 48.59 - samples/sec: 2547.19 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:10:36,008 ----------------------------------------------------------------------------------------------------
2023-10-25 13:10:36,009 EPOCH 8 done: loss 0.0075 - lr: 0.000011
2023-10-25 13:10:38,904 DEV : loss 0.1300455778837204 - f1-score (micro avg) 0.755
2023-10-25 13:10:38,923 ----------------------------------------------------------------------------------------------------
2023-10-25 13:10:43,606 epoch 9 - iter 77/773 - loss 0.00575536 - time (sec): 4.68 - samples/sec: 2808.71 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:10:48,290 epoch 9 - iter 154/773 - loss 0.00667575 - time (sec): 9.37 - samples/sec: 2659.97 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:10:52,840 epoch 9 - iter 231/773 - loss 0.00556811 - time (sec): 13.92 - samples/sec: 2714.47 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:10:57,278 epoch 9 - iter 308/773 - loss 0.00712098 - time (sec): 18.35 - samples/sec: 2701.17 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:11:01,945 epoch 9 - iter 385/773 - loss 0.00671541 - time (sec): 23.02 - samples/sec: 2693.66 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:11:06,517 epoch 9 - iter 462/773 - loss 0.00677819 - time (sec): 27.59 - samples/sec: 2706.24 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:11:11,088 epoch 9 - iter 539/773 - loss 0.00647608 - time (sec): 32.16 - samples/sec: 2718.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:11:15,685 epoch 9 - iter 616/773 - loss 0.00626968 - time (sec): 36.76 - samples/sec: 2728.56 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:11:20,159 epoch 9 - iter 693/773 - loss 0.00611977 - time (sec): 41.23 - samples/sec: 2725.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:11:24,583 epoch 9 - iter 770/773 - loss 0.00625835 - time (sec): 45.66 - samples/sec: 2715.43 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:11:24,746 ----------------------------------------------------------------------------------------------------
2023-10-25 13:11:24,747 EPOCH 9 done: loss 0.0062 - lr: 0.000006
2023-10-25 13:11:27,348 DEV : loss 0.13065063953399658 - f1-score (micro avg) 0.7686
2023-10-25 13:11:27,366 ----------------------------------------------------------------------------------------------------
2023-10-25 13:11:32,017 epoch 10 - iter 77/773 - loss 0.00367421 - time (sec): 4.65 - samples/sec: 2585.53 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:11:37,336 epoch 10 - iter 154/773 - loss 0.00561623 - time (sec): 9.97 - samples/sec: 2487.55 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:11:41,777 epoch 10 - iter 231/773 - loss 0.00497335 - time (sec): 14.41 - samples/sec: 2502.41 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:11:46,246 epoch 10 - iter 308/773 - loss 0.00484397 - time (sec): 18.88 - samples/sec: 2537.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:11:50,725 epoch 10 - iter 385/773 - loss 0.00486479 - time (sec): 23.36 - samples/sec: 2583.19 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:11:55,217 epoch 10 - iter 462/773 - loss 0.00448509 - time (sec): 27.85 - samples/sec: 2610.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:11:59,647 epoch 10 - iter 539/773 - loss 0.00421530 - time (sec): 32.28 - samples/sec: 2666.27 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:12:03,857 epoch 10 - iter 616/773 - loss 0.00437212 - time (sec): 36.49 - samples/sec: 2714.86 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:12:08,105 epoch 10 - iter 693/773 - loss 0.00404879 - time (sec): 40.74 - samples/sec: 2740.99 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:12:12,357 epoch 10 - iter 770/773 - loss 0.00377079 - time (sec): 44.99 - samples/sec: 2755.00 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:12:12,511 ----------------------------------------------------------------------------------------------------
2023-10-25 13:12:12,512 EPOCH 10 done: loss 0.0038 - lr: 0.000000
2023-10-25 13:12:15,102 DEV : loss 0.1328500360250473 - f1-score (micro avg) 0.7571
2023-10-25 13:12:15,570 ----------------------------------------------------------------------------------------------------
2023-10-25 13:12:15,571 Loading model from best epoch ...
2023-10-25 13:12:17,308 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 13:12:26,537 Results:
- F-score (micro) 0.7861
- F-score (macro) 0.6918
- Accuracy 0.6688

By class:
              precision    recall  f1-score   support

         LOC     0.8186    0.8393    0.8288       946
    BUILDING     0.6193    0.5892    0.6039       185
      STREET     0.6429    0.6429    0.6429        56

   micro avg     0.7812    0.7911    0.7861      1187
   macro avg     0.6936    0.6905    0.6918      1187
weighted avg     0.7792    0.7911    0.7850      1187

2023-10-25 13:12:26,537 ----------------------------------------------------------------------------------------------------