2023-10-25 13:04:00,403 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,403 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Train: 6183 sentences
2023-10-25 13:04:00,404 (train_with_dev=False, train_with_test=False)
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Training Params:
2023-10-25 13:04:00,404 - learning_rate: "5e-05"
2023-10-25 13:04:00,404 - mini_batch_size: "8"
2023-10-25 13:04:00,404 - max_epochs: "10"
2023-10-25 13:04:00,404 - shuffle: "True"
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,404 Plugins:
2023-10-25 13:04:00,404 - TensorboardLogger
2023-10-25 13:04:00,404 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 13:04:00,404 ----------------------------------------------------------------------------------------------------
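The `LinearScheduler | warmup_fraction: '0.1'` plugin above, combined with the per-iter `lr:` values in the log (ramping from 0.000005 up to the peak 0.00005 during epoch 1, then decaying to 0.000000 by epoch 10), implies a linear warmup/decay schedule. A minimal plain-Python sketch of that trajectory, assuming warmup to the peak LR over the first 10% of steps followed by linear decay to zero (this approximates, but is not, Flair's `LinearScheduler`):

```python
# Sketch of the LR trajectory implied by "LinearScheduler | warmup_fraction: 0.1":
# linear warmup to the peak LR over the first 10% of steps, then linear decay to 0.

def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """LR at a given optimizer step (0-indexed) under linear warmup + decay."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # warmup ramp
    # linear decay from peak_lr down to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 773 batches/epoch x 10 epochs, as in this log
total = 773 * 10
print(round(linear_schedule_lr(77, total), 6))   # ~0.000005, matching iter 77/773 of epoch 1
print(round(linear_schedule_lr(773, total), 6))  # peak ~0.00005 once warmup ends
```

With `warmup_fraction=0.1` and 7730 total steps, warmup spans exactly the 773 iterations of epoch 1, which is why the logged `lr` peaks right at the epoch 1 boundary.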
2023-10-25 13:04:00,404 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 13:04:00,405 - metric: "('micro avg', 'f1-score')"
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 Computation:
2023-10-25 13:04:00,405 - compute on device: cuda:0
2023-10-25 13:04:00,405 - embedding storage: none
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:00,405 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 13:04:04,871 epoch 1 - iter 77/773 - loss 1.62607859 - time (sec): 4.47 - samples/sec: 2919.55 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:09,448 epoch 1 - iter 154/773 - loss 0.94210245 - time (sec): 9.04 - samples/sec: 2835.22 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:04:14,272 epoch 1 - iter 231/773 - loss 0.69033315 - time (sec): 13.87 - samples/sec: 2712.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 13:04:19,189 epoch 1 - iter 308/773 - loss 0.55795676 - time (sec): 18.78 - samples/sec: 2638.02 - lr: 0.000020 - momentum: 0.000000
2023-10-25 13:04:23,842 epoch 1 - iter 385/773 - loss 0.47173138 - time (sec): 23.44 - samples/sec: 2617.54 - lr: 0.000025 - momentum: 0.000000
2023-10-25 13:04:28,472 epoch 1 - iter 462/773 - loss 0.41386971 - time (sec): 28.07 - samples/sec: 2627.27 - lr: 0.000030 - momentum: 0.000000
2023-10-25 13:04:32,979 epoch 1 - iter 539/773 - loss 0.37054284 - time (sec): 32.57 - samples/sec: 2625.88 - lr: 0.000035 - momentum: 0.000000
2023-10-25 13:04:37,466 epoch 1 - iter 616/773 - loss 0.33663528 - time (sec): 37.06 - samples/sec: 2642.50 - lr: 0.000040 - momentum: 0.000000
2023-10-25 13:04:41,752 epoch 1 - iter 693/773 - loss 0.30816500 - time (sec): 41.35 - samples/sec: 2678.58 - lr: 0.000045 - momentum: 0.000000
2023-10-25 13:04:46,095 epoch 1 - iter 770/773 - loss 0.28446225 - time (sec): 45.69 - samples/sec: 2712.76 - lr: 0.000050 - momentum: 0.000000
2023-10-25 13:04:46,249 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:46,249 EPOCH 1 done: loss 0.2838 - lr: 0.000050
2023-10-25 13:04:49,502 DEV : loss 0.05165766924619675 - f1-score (micro avg) 0.7323
2023-10-25 13:04:49,522 saving best model
2023-10-25 13:04:50,070 ----------------------------------------------------------------------------------------------------
2023-10-25 13:04:54,732 epoch 2 - iter 77/773 - loss 0.10202012 - time (sec): 4.66 - samples/sec: 2459.39 - lr: 0.000049 - momentum: 0.000000
2023-10-25 13:04:59,242 epoch 2 - iter 154/773 - loss 0.08419062 - time (sec): 9.17 - samples/sec: 2528.27 - lr: 0.000049 - momentum: 0.000000
2023-10-25 13:05:03,731 epoch 2 - iter 231/773 - loss 0.08259793 - time (sec): 13.66 - samples/sec: 2568.79 - lr: 0.000048 - momentum: 0.000000
2023-10-25 13:05:08,253 epoch 2 - iter 308/773 - loss 0.08289606 - time (sec): 18.18 - samples/sec: 2638.40 - lr: 0.000048 - momentum: 0.000000
2023-10-25 13:05:12,813 epoch 2 - iter 385/773 - loss 0.08040769 - time (sec): 22.74 - samples/sec: 2694.06 - lr: 0.000047 - momentum: 0.000000
2023-10-25 13:05:17,136 epoch 2 - iter 462/773 - loss 0.07924733 - time (sec): 27.06 - samples/sec: 2716.80 - lr: 0.000047 - momentum: 0.000000
2023-10-25 13:05:21,425 epoch 2 - iter 539/773 - loss 0.07828472 - time (sec): 31.35 - samples/sec: 2778.47 - lr: 0.000046 - momentum: 0.000000
2023-10-25 13:05:25,710 epoch 2 - iter 616/773 - loss 0.07838880 - time (sec): 35.64 - samples/sec: 2786.75 - lr: 0.000046 - momentum: 0.000000
2023-10-25 13:05:29,957 epoch 2 - iter 693/773 - loss 0.07907713 - time (sec): 39.89 - samples/sec: 2791.31 - lr: 0.000045 - momentum: 0.000000
2023-10-25 13:05:34,285 epoch 2 - iter 770/773 - loss 0.07672251 - time (sec): 44.21 - samples/sec: 2801.22 - lr: 0.000044 - momentum: 0.000000
2023-10-25 13:05:34,454 ----------------------------------------------------------------------------------------------------
2023-10-25 13:05:34,455 EPOCH 2 done: loss 0.0767 - lr: 0.000044
2023-10-25 13:05:37,402 DEV : loss 0.05662866681814194 - f1-score (micro avg) 0.7628
2023-10-25 13:05:37,419 saving best model
2023-10-25 13:05:38,170 ----------------------------------------------------------------------------------------------------
2023-10-25 13:05:43,368 epoch 3 - iter 77/773 - loss 0.04830238 - time (sec): 5.19 - samples/sec: 2322.43 - lr: 0.000044 - momentum: 0.000000
2023-10-25 13:05:47,869 epoch 3 - iter 154/773 - loss 0.05073064 - time (sec): 9.70 - samples/sec: 2497.06 - lr: 0.000043 - momentum: 0.000000
2023-10-25 13:05:52,586 epoch 3 - iter 231/773 - loss 0.05345361 - time (sec): 14.41 - samples/sec: 2634.94 - lr: 0.000043 - momentum: 0.000000
2023-10-25 13:05:57,105 epoch 3 - iter 308/773 - loss 0.05233532 - time (sec): 18.93 - samples/sec: 2619.65 - lr: 0.000042 - momentum: 0.000000
2023-10-25 13:06:01,765 epoch 3 - iter 385/773 - loss 0.05274950 - time (sec): 23.59 - samples/sec: 2627.87 - lr: 0.000042 - momentum: 0.000000
2023-10-25 13:06:06,320 epoch 3 - iter 462/773 - loss 0.05328798 - time (sec): 28.15 - samples/sec: 2619.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 13:06:10,830 epoch 3 - iter 539/773 - loss 0.05175706 - time (sec): 32.66 - samples/sec: 2640.92 - lr: 0.000041 - momentum: 0.000000
2023-10-25 13:06:15,426 epoch 3 - iter 616/773 - loss 0.05144252 - time (sec): 37.25 - samples/sec: 2655.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 13:06:20,100 epoch 3 - iter 693/773 - loss 0.05097205 - time (sec): 41.93 - samples/sec: 2651.47 - lr: 0.000039 - momentum: 0.000000
2023-10-25 13:06:24,629 epoch 3 - iter 770/773 - loss 0.05141460 - time (sec): 46.46 - samples/sec: 2667.56 - lr: 0.000039 - momentum: 0.000000
2023-10-25 13:06:24,796 ----------------------------------------------------------------------------------------------------
2023-10-25 13:06:24,797 EPOCH 3 done: loss 0.0514 - lr: 0.000039
2023-10-25 13:06:27,708 DEV : loss 0.08030106127262115 - f1-score (micro avg) 0.7182
2023-10-25 13:06:27,729 ----------------------------------------------------------------------------------------------------
2023-10-25 13:06:32,291 epoch 4 - iter 77/773 - loss 0.03816760 - time (sec): 4.56 - samples/sec: 2653.55 - lr: 0.000038 - momentum: 0.000000
2023-10-25 13:06:37,003 epoch 4 - iter 154/773 - loss 0.03377926 - time (sec): 9.27 - samples/sec: 2620.46 - lr: 0.000038 - momentum: 0.000000
2023-10-25 13:06:41,801 epoch 4 - iter 231/773 - loss 0.03420877 - time (sec): 14.07 - samples/sec: 2623.09 - lr: 0.000037 - momentum: 0.000000
2023-10-25 13:06:46,421 epoch 4 - iter 308/773 - loss 0.03420744 - time (sec): 18.69 - samples/sec: 2623.52 - lr: 0.000037 - momentum: 0.000000
2023-10-25 13:06:50,969 epoch 4 - iter 385/773 - loss 0.03441451 - time (sec): 23.24 - samples/sec: 2597.91 - lr: 0.000036 - momentum: 0.000000
2023-10-25 13:06:55,487 epoch 4 - iter 462/773 - loss 0.03487924 - time (sec): 27.76 - samples/sec: 2589.27 - lr: 0.000036 - momentum: 0.000000
2023-10-25 13:07:00,136 epoch 4 - iter 539/773 - loss 0.03545083 - time (sec): 32.41 - samples/sec: 2615.60 - lr: 0.000035 - momentum: 0.000000
2023-10-25 13:07:04,532 epoch 4 - iter 616/773 - loss 0.03574374 - time (sec): 36.80 - samples/sec: 2654.64 - lr: 0.000034 - momentum: 0.000000
2023-10-25 13:07:09,155 epoch 4 - iter 693/773 - loss 0.03593766 - time (sec): 41.42 - samples/sec: 2684.99 - lr: 0.000034 - momentum: 0.000000
2023-10-25 13:07:13,897 epoch 4 - iter 770/773 - loss 0.03530953 - time (sec): 46.17 - samples/sec: 2684.07 - lr: 0.000033 - momentum: 0.000000
2023-10-25 13:07:14,074 ----------------------------------------------------------------------------------------------------
2023-10-25 13:07:14,075 EPOCH 4 done: loss 0.0354 - lr: 0.000033
2023-10-25 13:07:16,611 DEV : loss 0.08339047431945801 - f1-score (micro avg) 0.7649
2023-10-25 13:07:16,630 saving best model
2023-10-25 13:07:17,281 ----------------------------------------------------------------------------------------------------
2023-10-25 13:07:22,250 epoch 5 - iter 77/773 - loss 0.01949160 - time (sec): 4.97 - samples/sec: 2636.17 - lr: 0.000033 - momentum: 0.000000
2023-10-25 13:07:27,068 epoch 5 - iter 154/773 - loss 0.01928080 - time (sec): 9.78 - samples/sec: 2553.53 - lr: 0.000032 - momentum: 0.000000
2023-10-25 13:07:31,973 epoch 5 - iter 231/773 - loss 0.02144756 - time (sec): 14.69 - samples/sec: 2503.66 - lr: 0.000032 - momentum: 0.000000
2023-10-25 13:07:36,776 epoch 5 - iter 308/773 - loss 0.02257405 - time (sec): 19.49 - samples/sec: 2478.57 - lr: 0.000031 - momentum: 0.000000
2023-10-25 13:07:41,608 epoch 5 - iter 385/773 - loss 0.02362110 - time (sec): 24.32 - samples/sec: 2520.77 - lr: 0.000031 - momentum: 0.000000
2023-10-25 13:07:46,203 epoch 5 - iter 462/773 - loss 0.02401721 - time (sec): 28.92 - samples/sec: 2518.69 - lr: 0.000030 - momentum: 0.000000
2023-10-25 13:07:50,848 epoch 5 - iter 539/773 - loss 0.02564729 - time (sec): 33.56 - samples/sec: 2527.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 13:07:55,431 epoch 5 - iter 616/773 - loss 0.02576209 - time (sec): 38.15 - samples/sec: 2572.76 - lr: 0.000029 - momentum: 0.000000
2023-10-25 13:07:59,836 epoch 5 - iter 693/773 - loss 0.02549268 - time (sec): 42.55 - samples/sec: 2611.03 - lr: 0.000028 - momentum: 0.000000
2023-10-25 13:08:04,104 epoch 5 - iter 770/773 - loss 0.02500032 - time (sec): 46.82 - samples/sec: 2646.22 - lr: 0.000028 - momentum: 0.000000
2023-10-25 13:08:04,263 ----------------------------------------------------------------------------------------------------
2023-10-25 13:08:04,263 EPOCH 5 done: loss 0.0250 - lr: 0.000028
2023-10-25 13:08:06,931 DEV : loss 0.10233461856842041 - f1-score (micro avg) 0.7579
2023-10-25 13:08:06,952 ----------------------------------------------------------------------------------------------------
2023-10-25 13:08:11,484 epoch 6 - iter 77/773 - loss 0.02403484 - time (sec): 4.53 - samples/sec: 2724.96 - lr: 0.000027 - momentum: 0.000000
2023-10-25 13:08:16,103 epoch 6 - iter 154/773 - loss 0.01933766 - time (sec): 9.15 - samples/sec: 2749.34 - lr: 0.000027 - momentum: 0.000000
2023-10-25 13:08:20,644 epoch 6 - iter 231/773 - loss 0.01675474 - time (sec): 13.69 - samples/sec: 2703.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 13:08:25,139 epoch 6 - iter 308/773 - loss 0.01770050 - time (sec): 18.18 - samples/sec: 2730.90 - lr: 0.000026 - momentum: 0.000000
2023-10-25 13:08:29,794 epoch 6 - iter 385/773 - loss 0.01697379 - time (sec): 22.84 - samples/sec: 2721.83 - lr: 0.000025 - momentum: 0.000000
2023-10-25 13:08:34,492 epoch 6 - iter 462/773 - loss 0.01647189 - time (sec): 27.54 - samples/sec: 2720.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 13:08:39,812 epoch 6 - iter 539/773 - loss 0.01695804 - time (sec): 32.86 - samples/sec: 2653.54 - lr: 0.000024 - momentum: 0.000000
2023-10-25 13:08:44,380 epoch 6 - iter 616/773 - loss 0.01663896 - time (sec): 37.43 - samples/sec: 2653.24 - lr: 0.000023 - momentum: 0.000000
2023-10-25 13:08:48,904 epoch 6 - iter 693/773 - loss 0.01722230 - time (sec): 41.95 - samples/sec: 2662.44 - lr: 0.000023 - momentum: 0.000000
2023-10-25 13:08:53,262 epoch 6 - iter 770/773 - loss 0.01642065 - time (sec): 46.31 - samples/sec: 2675.35 - lr: 0.000022 - momentum: 0.000000
2023-10-25 13:08:53,417 ----------------------------------------------------------------------------------------------------
2023-10-25 13:08:53,417 EPOCH 6 done: loss 0.0164 - lr: 0.000022
2023-10-25 13:08:56,297 DEV : loss 0.10584240406751633 - f1-score (micro avg) 0.7992
2023-10-25 13:08:56,320 saving best model
2023-10-25 13:08:56,974 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:01,620 epoch 7 - iter 77/773 - loss 0.01534673 - time (sec): 4.64 - samples/sec: 2796.75 - lr: 0.000022 - momentum: 0.000000
2023-10-25 13:09:06,285 epoch 7 - iter 154/773 - loss 0.01242129 - time (sec): 9.31 - samples/sec: 2688.82 - lr: 0.000021 - momentum: 0.000000
2023-10-25 13:09:11,077 epoch 7 - iter 231/773 - loss 0.01054806 - time (sec): 14.10 - samples/sec: 2681.47 - lr: 0.000021 - momentum: 0.000000
2023-10-25 13:09:15,836 epoch 7 - iter 308/773 - loss 0.01064088 - time (sec): 18.86 - samples/sec: 2673.72 - lr: 0.000020 - momentum: 0.000000
2023-10-25 13:09:20,462 epoch 7 - iter 385/773 - loss 0.01089602 - time (sec): 23.48 - samples/sec: 2643.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 13:09:25,076 epoch 7 - iter 462/773 - loss 0.01071434 - time (sec): 28.10 - samples/sec: 2651.83 - lr: 0.000019 - momentum: 0.000000
2023-10-25 13:09:29,777 epoch 7 - iter 539/773 - loss 0.01088053 - time (sec): 32.80 - samples/sec: 2648.61 - lr: 0.000018 - momentum: 0.000000
2023-10-25 13:09:34,667 epoch 7 - iter 616/773 - loss 0.01092188 - time (sec): 37.69 - samples/sec: 2627.23 - lr: 0.000018 - momentum: 0.000000
2023-10-25 13:09:39,693 epoch 7 - iter 693/773 - loss 0.01146992 - time (sec): 42.72 - samples/sec: 2633.32 - lr: 0.000017 - momentum: 0.000000
2023-10-25 13:09:44,498 epoch 7 - iter 770/773 - loss 0.01173852 - time (sec): 47.52 - samples/sec: 2608.77 - lr: 0.000017 - momentum: 0.000000
2023-10-25 13:09:44,678 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:44,678 EPOCH 7 done: loss 0.0117 - lr: 0.000017
2023-10-25 13:09:47,214 DEV : loss 0.11367938667535782 - f1-score (micro avg) 0.7702
2023-10-25 13:09:47,233 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:52,081 epoch 8 - iter 77/773 - loss 0.00915681 - time (sec): 4.85 - samples/sec: 2566.64 - lr: 0.000016 - momentum: 0.000000
2023-10-25 13:09:56,977 epoch 8 - iter 154/773 - loss 0.00927730 - time (sec): 9.74 - samples/sec: 2495.94 - lr: 0.000016 - momentum: 0.000000
2023-10-25 13:10:01,947 epoch 8 - iter 231/773 - loss 0.00889633 - time (sec): 14.71 - samples/sec: 2492.37 - lr: 0.000015 - momentum: 0.000000
2023-10-25 13:10:06,837 epoch 8 - iter 308/773 - loss 0.00755093 - time (sec): 19.60 - samples/sec: 2506.99 - lr: 0.000014 - momentum: 0.000000
2023-10-25 13:10:11,651 epoch 8 - iter 385/773 - loss 0.00800710 - time (sec): 24.42 - samples/sec: 2523.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 13:10:16,478 epoch 8 - iter 462/773 - loss 0.00799662 - time (sec): 29.24 - samples/sec: 2517.18 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:10:21,349 epoch 8 - iter 539/773 - loss 0.00766083 - time (sec): 34.11 - samples/sec: 2569.14 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:10:26,302 epoch 8 - iter 616/773 - loss 0.00763491 - time (sec): 39.07 - samples/sec: 2560.31 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:10:31,080 epoch 8 - iter 693/773 - loss 0.00766449 - time (sec): 43.84 - samples/sec: 2546.17 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:10:35,826 epoch 8 - iter 770/773 - loss 0.00742265 - time (sec): 48.59 - samples/sec: 2547.19 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:10:36,008 ----------------------------------------------------------------------------------------------------
2023-10-25 13:10:36,009 EPOCH 8 done: loss 0.0075 - lr: 0.000011
2023-10-25 13:10:38,904 DEV : loss 0.1300455778837204 - f1-score (micro avg) 0.755
2023-10-25 13:10:38,923 ----------------------------------------------------------------------------------------------------
2023-10-25 13:10:43,606 epoch 9 - iter 77/773 - loss 0.00575536 - time (sec): 4.68 - samples/sec: 2808.71 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:10:48,290 epoch 9 - iter 154/773 - loss 0.00667575 - time (sec): 9.37 - samples/sec: 2659.97 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:10:52,840 epoch 9 - iter 231/773 - loss 0.00556811 - time (sec): 13.92 - samples/sec: 2714.47 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:10:57,278 epoch 9 - iter 308/773 - loss 0.00712098 - time (sec): 18.35 - samples/sec: 2701.17 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:11:01,945 epoch 9 - iter 385/773 - loss 0.00671541 - time (sec): 23.02 - samples/sec: 2693.66 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:11:06,517 epoch 9 - iter 462/773 - loss 0.00677819 - time (sec): 27.59 - samples/sec: 2706.24 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:11:11,088 epoch 9 - iter 539/773 - loss 0.00647608 - time (sec): 32.16 - samples/sec: 2718.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:11:15,685 epoch 9 - iter 616/773 - loss 0.00626968 - time (sec): 36.76 - samples/sec: 2728.56 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:11:20,159 epoch 9 - iter 693/773 - loss 0.00611977 - time (sec): 41.23 - samples/sec: 2725.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:11:24,583 epoch 9 - iter 770/773 - loss 0.00625835 - time (sec): 45.66 - samples/sec: 2715.43 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:11:24,746 ----------------------------------------------------------------------------------------------------
2023-10-25 13:11:24,747 EPOCH 9 done: loss 0.0062 - lr: 0.000006
2023-10-25 13:11:27,348 DEV : loss 0.13065063953399658 - f1-score (micro avg) 0.7686
2023-10-25 13:11:27,366 ----------------------------------------------------------------------------------------------------
2023-10-25 13:11:32,017 epoch 10 - iter 77/773 - loss 0.00367421 - time (sec): 4.65 - samples/sec: 2585.53 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:11:37,336 epoch 10 - iter 154/773 - loss 0.00561623 - time (sec): 9.97 - samples/sec: 2487.55 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:11:41,777 epoch 10 - iter 231/773 - loss 0.00497335 - time (sec): 14.41 - samples/sec: 2502.41 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:11:46,246 epoch 10 - iter 308/773 - loss 0.00484397 - time (sec): 18.88 - samples/sec: 2537.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:11:50,725 epoch 10 - iter 385/773 - loss 0.00486479 - time (sec): 23.36 - samples/sec: 2583.19 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:11:55,217 epoch 10 - iter 462/773 - loss 0.00448509 - time (sec): 27.85 - samples/sec: 2610.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:11:59,647 epoch 10 - iter 539/773 - loss 0.00421530 - time (sec): 32.28 - samples/sec: 2666.27 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:12:03,857 epoch 10 - iter 616/773 - loss 0.00437212 - time (sec): 36.49 - samples/sec: 2714.86 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:12:08,105 epoch 10 - iter 693/773 - loss 0.00404879 - time (sec): 40.74 - samples/sec: 2740.99 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:12:12,357 epoch 10 - iter 770/773 - loss 0.00377079 - time (sec): 44.99 - samples/sec: 2755.00 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:12:12,511 ----------------------------------------------------------------------------------------------------
2023-10-25 13:12:12,512 EPOCH 10 done: loss 0.0038 - lr: 0.000000
2023-10-25 13:12:15,102 DEV : loss 0.1328500360250473 - f1-score (micro avg) 0.7571
2023-10-25 13:12:15,570 ----------------------------------------------------------------------------------------------------
2023-10-25 13:12:15,571 Loading model from best epoch ...
2023-10-25 13:12:17,308 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
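The 13-tag dictionary printed above follows the BIOES scheme (Single, Begin, End, Inside, plus an outside tag `O`) over the three entity types in this corpus. A small sketch that expands the types into that tag list, in the same order Flair prints it:

```python
# The 13 predicted tags are the BIOES expansion of the three entity types
# from the log, plus the "O" (outside) tag.
ENTITY_TYPES = ["LOC", "BUILDING", "STREET"]

def bioes_tags(types):
    """Expand entity types into a BIOES tag list (O first, then S/B/E/I per type)."""
    tags = ["O"]
    for t in types:
        tags += [f"{prefix}-{t}" for prefix in ("S", "B", "E", "I")]
    return tags

print(bioes_tags(ENTITY_TYPES))
# 13 tags: ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC', 'S-BUILDING', ..., 'I-STREET']
```

3 types x 4 positional prefixes + `O` gives the 13 output classes, matching the `Linear(in_features=768, out_features=13)` head in the model summary.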
2023-10-25 13:12:26,537
Results:
- F-score (micro) 0.7861
- F-score (macro) 0.6918
- Accuracy 0.6688
By class:
              precision    recall  f1-score   support

         LOC     0.8186    0.8393    0.8288       946
    BUILDING     0.6193    0.5892    0.6039       185
      STREET     0.6429    0.6429    0.6429        56

   micro avg     0.7812    0.7911    0.7861      1187
   macro avg     0.6936    0.6905    0.6918      1187
weighted avg     0.7792    0.7911    0.7850      1187
2023-10-25 13:12:26,537 ----------------------------------------------------------------------------------------------------
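As a quick consistency check on the final table: macro F1 is the unweighted mean of the per-class F1 scores, weighted F1 is the support-weighted mean, and micro F1 follows from the micro-averaged precision and recall via F1 = 2PR/(P+R). Recomputing from the rounded table values (so results match the reported 0.6918 / 0.7850 / 0.7861 only up to rounding):

```python
# Recompute the averaged F1 scores from the per-class rows of the final table.
per_class = {  # class: (precision, recall, f1, support)
    "LOC":      (0.8186, 0.8393, 0.8288, 946),
    "BUILDING": (0.6193, 0.5892, 0.6039, 185),
    "STREET":   (0.6429, 0.6429, 0.6429, 56),
}

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support
micro_f1 = f1(0.7812, 0.7911)  # micro-averaged precision/recall from the table

# All three agree with the reported values to ~3-4 decimal places; the small
# residual differences come from rounding in the printed table.
print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```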