2023-10-13 08:57:44,355 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 Train: 1100 sentences
2023-10-13 08:57:44,356 (train_with_dev=False, train_with_test=False)
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 Training Params:
2023-10-13 08:57:44,356 - learning_rate: "5e-05"
2023-10-13 08:57:44,356 - mini_batch_size: "8"
2023-10-13 08:57:44,356 - max_epochs: "10"
2023-10-13 08:57:44,356 - shuffle: "True"
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 Plugins:
2023-10-13 08:57:44,356 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:57:44,356 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,356 Computation:
2023-10-13 08:57:44,356 - compute on device: cuda:0
2023-10-13 08:57:44,356 - embedding storage: none
2023-10-13 08:57:44,356 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,357 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 08:57:44,357 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:44,357 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:45,066 epoch 1 - iter 13/138 - loss 3.19871477 - time (sec): 0.71 - samples/sec: 2838.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:57:45,802 epoch 1 - iter 26/138 - loss 2.89610673 - time (sec): 1.44 - samples/sec: 2760.43 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:57:46,524 epoch 1 - iter 39/138 - loss 2.36049543 - time (sec): 2.17 - samples/sec: 2768.07 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:57:47,246 epoch 1 - iter 52/138 - loss 1.95008445 - time (sec): 2.89 - samples/sec: 2883.52 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:57:47,976 epoch 1 - iter 65/138 - loss 1.69219973 - time (sec): 3.62 - samples/sec: 2930.93 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:57:48,723 epoch 1 - iter 78/138 - loss 1.51635053 - time (sec): 4.37 - samples/sec: 2930.56 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:57:49,458 epoch 1 - iter 91/138 - loss 1.37181089 - time (sec): 5.10 - samples/sec: 2922.55 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:57:50,183 epoch 1 - iter 104/138 - loss 1.24447856 - time (sec): 5.82 - samples/sec: 2955.21 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:57:50,865 epoch 1 - iter 117/138 - loss 1.14921726 - time (sec): 6.51 - samples/sec: 2980.86 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:57:51,559 epoch 1 - iter 130/138 - loss 1.07565563 - time (sec): 7.20 - samples/sec: 2962.77 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:57:52,012 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:52,012 EPOCH 1 done: loss 1.0337 - lr: 0.000047
2023-10-13 08:57:52,671 DEV : loss 0.22752070426940918 - f1-score (micro avg) 0.6682
2023-10-13 08:57:52,676 saving best model
2023-10-13 08:57:53,057 ----------------------------------------------------------------------------------------------------
2023-10-13 08:57:53,755 epoch 2 - iter 13/138 - loss 0.26158939 - time (sec): 0.70 - samples/sec: 3006.97 - lr: 0.000050 - momentum: 0.000000
2023-10-13 08:57:54,475 epoch 2 - iter 26/138 - loss 0.21700014 - time (sec): 1.42 - samples/sec: 3025.83 - lr: 0.000049 - momentum: 0.000000
2023-10-13 08:57:55,232 epoch 2 - iter 39/138 - loss 0.19436700 - time (sec): 2.17 - samples/sec: 3005.60 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:57:56,027 epoch 2 - iter 52/138 - loss 0.18948058 - time (sec): 2.97 - samples/sec: 3018.43 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:57:56,783 epoch 2 - iter 65/138 - loss 0.19422866 - time (sec): 3.72 - samples/sec: 3067.06 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:57:57,497 epoch 2 - iter 78/138 - loss 0.19168065 - time (sec): 4.44 - samples/sec: 2994.06 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:57:58,232 epoch 2 - iter 91/138 - loss 0.18872444 - time (sec): 5.17 - samples/sec: 2950.10 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:57:58,983 epoch 2 - iter 104/138 - loss 0.18461520 - time (sec): 5.92 - samples/sec: 2945.84 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:57:59,746 epoch 2 - iter 117/138 - loss 0.18182663 - time (sec): 6.69 - samples/sec: 2927.06 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:58:00,423 epoch 2 - iter 130/138 - loss 0.18006433 - time (sec): 7.36 - samples/sec: 2936.15 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:58:00,866 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:00,867 EPOCH 2 done: loss 0.1785 - lr: 0.000045
2023-10-13 08:58:01,517 DEV : loss 0.13236640393733978 - f1-score (micro avg) 0.8118
2023-10-13 08:58:01,522 saving best model
2023-10-13 08:58:01,977 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:02,737 epoch 3 - iter 13/138 - loss 0.08803266 - time (sec): 0.75 - samples/sec: 2807.32 - lr: 0.000044 - momentum: 0.000000
2023-10-13 08:58:03,522 epoch 3 - iter 26/138 - loss 0.07423184 - time (sec): 1.54 - samples/sec: 2913.30 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:58:04,214 epoch 3 - iter 39/138 - loss 0.08403636 - time (sec): 2.23 - samples/sec: 2912.80 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:58:04,912 epoch 3 - iter 52/138 - loss 0.08542265 - time (sec): 2.93 - samples/sec: 2852.10 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:58:05,637 epoch 3 - iter 65/138 - loss 0.08944773 - time (sec): 3.65 - samples/sec: 2912.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:58:06,362 epoch 3 - iter 78/138 - loss 0.08798574 - time (sec): 4.37 - samples/sec: 2907.54 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:58:07,089 epoch 3 - iter 91/138 - loss 0.09869091 - time (sec): 5.10 - samples/sec: 2961.96 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:58:07,806 epoch 3 - iter 104/138 - loss 0.09713533 - time (sec): 5.82 - samples/sec: 2968.89 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:58:08,550 epoch 3 - iter 117/138 - loss 0.09854157 - time (sec): 6.56 - samples/sec: 2973.31 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:58:09,276 epoch 3 - iter 130/138 - loss 0.09889222 - time (sec): 7.29 - samples/sec: 2957.02 - lr: 0.000039 - momentum: 0.000000
2023-10-13 08:58:09,718 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:09,718 EPOCH 3 done: loss 0.0998 - lr: 0.000039
2023-10-13 08:58:10,354 DEV : loss 0.13262909650802612 - f1-score (micro avg) 0.8483
2023-10-13 08:58:10,359 saving best model
2023-10-13 08:58:10,838 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:11,553 epoch 4 - iter 13/138 - loss 0.05094794 - time (sec): 0.71 - samples/sec: 3200.90 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:58:12,253 epoch 4 - iter 26/138 - loss 0.06347611 - time (sec): 1.41 - samples/sec: 3223.38 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:58:13,030 epoch 4 - iter 39/138 - loss 0.05700293 - time (sec): 2.19 - samples/sec: 3076.49 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:58:13,746 epoch 4 - iter 52/138 - loss 0.05711960 - time (sec): 2.91 - samples/sec: 2936.69 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:58:14,527 epoch 4 - iter 65/138 - loss 0.06612821 - time (sec): 3.69 - samples/sec: 2865.25 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:58:15,250 epoch 4 - iter 78/138 - loss 0.06073313 - time (sec): 4.41 - samples/sec: 2876.44 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:58:16,001 epoch 4 - iter 91/138 - loss 0.06310000 - time (sec): 5.16 - samples/sec: 2853.53 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:58:16,828 epoch 4 - iter 104/138 - loss 0.06924452 - time (sec): 5.99 - samples/sec: 2868.76 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:58:17,601 epoch 4 - iter 117/138 - loss 0.06882500 - time (sec): 6.76 - samples/sec: 2851.58 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:58:18,346 epoch 4 - iter 130/138 - loss 0.06960305 - time (sec): 7.51 - samples/sec: 2850.83 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:58:18,824 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:18,825 EPOCH 4 done: loss 0.0712 - lr: 0.000034
2023-10-13 08:58:19,475 DEV : loss 0.14707542955875397 - f1-score (micro avg) 0.8272
2023-10-13 08:58:19,480 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:20,274 epoch 5 - iter 13/138 - loss 0.03543947 - time (sec): 0.79 - samples/sec: 2901.84 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:58:20,985 epoch 5 - iter 26/138 - loss 0.04085515 - time (sec): 1.50 - samples/sec: 2903.87 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:58:21,691 epoch 5 - iter 39/138 - loss 0.04992583 - time (sec): 2.21 - samples/sec: 2912.06 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:58:22,429 epoch 5 - iter 52/138 - loss 0.05418000 - time (sec): 2.95 - samples/sec: 2959.47 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:58:23,160 epoch 5 - iter 65/138 - loss 0.05193546 - time (sec): 3.68 - samples/sec: 2998.23 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:58:23,863 epoch 5 - iter 78/138 - loss 0.04849240 - time (sec): 4.38 - samples/sec: 2956.36 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:58:24,546 epoch 5 - iter 91/138 - loss 0.04798557 - time (sec): 5.06 - samples/sec: 2980.23 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:58:25,265 epoch 5 - iter 104/138 - loss 0.04662516 - time (sec): 5.78 - samples/sec: 2976.44 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:58:26,015 epoch 5 - iter 117/138 - loss 0.05203494 - time (sec): 6.53 - samples/sec: 2982.25 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:58:26,707 epoch 5 - iter 130/138 - loss 0.04967259 - time (sec): 7.23 - samples/sec: 2961.77 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:58:27,173 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:27,173 EPOCH 5 done: loss 0.0490 - lr: 0.000028
2023-10-13 08:58:27,843 DEV : loss 0.15005655586719513 - f1-score (micro avg) 0.8501
2023-10-13 08:58:27,848 saving best model
2023-10-13 08:58:28,310 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:28,982 epoch 6 - iter 13/138 - loss 0.04557888 - time (sec): 0.67 - samples/sec: 3168.84 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:58:29,643 epoch 6 - iter 26/138 - loss 0.04511493 - time (sec): 1.33 - samples/sec: 3099.11 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:58:30,315 epoch 6 - iter 39/138 - loss 0.05003977 - time (sec): 2.00 - samples/sec: 3037.48 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:58:31,078 epoch 6 - iter 52/138 - loss 0.04184200 - time (sec): 2.77 - samples/sec: 2996.86 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:58:31,784 epoch 6 - iter 65/138 - loss 0.04444330 - time (sec): 3.47 - samples/sec: 2978.46 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:58:32,532 epoch 6 - iter 78/138 - loss 0.04511894 - time (sec): 4.22 - samples/sec: 2950.34 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:58:33,263 epoch 6 - iter 91/138 - loss 0.04235503 - time (sec): 4.95 - samples/sec: 2943.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:58:33,970 epoch 6 - iter 104/138 - loss 0.04162867 - time (sec): 5.66 - samples/sec: 2968.78 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:58:34,744 epoch 6 - iter 117/138 - loss 0.03722116 - time (sec): 6.43 - samples/sec: 2991.01 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:58:35,488 epoch 6 - iter 130/138 - loss 0.03409834 - time (sec): 7.18 - samples/sec: 3001.30 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:58:35,975 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:35,975 EPOCH 6 done: loss 0.0354 - lr: 0.000023
2023-10-13 08:58:36,637 DEV : loss 0.16685351729393005 - f1-score (micro avg) 0.8639
2023-10-13 08:58:36,643 saving best model
2023-10-13 08:58:37,114 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:37,828 epoch 7 - iter 13/138 - loss 0.01977267 - time (sec): 0.71 - samples/sec: 2980.66 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:58:38,560 epoch 7 - iter 26/138 - loss 0.02754021 - time (sec): 1.44 - samples/sec: 3030.57 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:58:39,268 epoch 7 - iter 39/138 - loss 0.02491542 - time (sec): 2.15 - samples/sec: 2953.52 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:58:40,039 epoch 7 - iter 52/138 - loss 0.03342961 - time (sec): 2.92 - samples/sec: 2965.94 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:58:40,766 epoch 7 - iter 65/138 - loss 0.03137856 - time (sec): 3.65 - samples/sec: 2949.20 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:58:41,475 epoch 7 - iter 78/138 - loss 0.02967382 - time (sec): 4.36 - samples/sec: 2913.44 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:58:42,160 epoch 7 - iter 91/138 - loss 0.02685385 - time (sec): 5.04 - samples/sec: 2929.12 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:58:42,855 epoch 7 - iter 104/138 - loss 0.02721307 - time (sec): 5.74 - samples/sec: 2932.13 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:58:43,606 epoch 7 - iter 117/138 - loss 0.03119356 - time (sec): 6.49 - samples/sec: 2911.22 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:58:44,317 epoch 7 - iter 130/138 - loss 0.02889865 - time (sec): 7.20 - samples/sec: 2961.63 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:58:44,797 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:44,797 EPOCH 7 done: loss 0.0298 - lr: 0.000017
2023-10-13 08:58:45,432 DEV : loss 0.17813566327095032 - f1-score (micro avg) 0.8636
2023-10-13 08:58:45,437 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:46,123 epoch 8 - iter 13/138 - loss 0.03541151 - time (sec): 0.69 - samples/sec: 3178.88 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:58:46,863 epoch 8 - iter 26/138 - loss 0.02622506 - time (sec): 1.42 - samples/sec: 3159.75 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:58:47,585 epoch 8 - iter 39/138 - loss 0.02389930 - time (sec): 2.15 - samples/sec: 3137.49 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:58:48,296 epoch 8 - iter 52/138 - loss 0.02205261 - time (sec): 2.86 - samples/sec: 3075.91 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:58:48,973 epoch 8 - iter 65/138 - loss 0.02665679 - time (sec): 3.54 - samples/sec: 3036.59 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:58:49,687 epoch 8 - iter 78/138 - loss 0.02287701 - time (sec): 4.25 - samples/sec: 3038.33 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:58:50,421 epoch 8 - iter 91/138 - loss 0.02221582 - time (sec): 4.98 - samples/sec: 2970.52 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:58:51,180 epoch 8 - iter 104/138 - loss 0.02401353 - time (sec): 5.74 - samples/sec: 2970.76 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:58:51,844 epoch 8 - iter 117/138 - loss 0.02385499 - time (sec): 6.41 - samples/sec: 2979.66 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:58:52,605 epoch 8 - iter 130/138 - loss 0.02237982 - time (sec): 7.17 - samples/sec: 2990.92 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:58:53,044 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:53,044 EPOCH 8 done: loss 0.0215 - lr: 0.000012
2023-10-13 08:58:53,691 DEV : loss 0.16658078134059906 - f1-score (micro avg) 0.8729
2023-10-13 08:58:53,696 saving best model
2023-10-13 08:58:54,142 ----------------------------------------------------------------------------------------------------
2023-10-13 08:58:54,908 epoch 9 - iter 13/138 - loss 0.00332000 - time (sec): 0.76 - samples/sec: 3032.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:58:55,676 epoch 9 - iter 26/138 - loss 0.01018189 - time (sec): 1.53 - samples/sec: 2897.38 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:58:56,374 epoch 9 - iter 39/138 - loss 0.00711881 - time (sec): 2.23 - samples/sec: 2901.51 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:58:57,094 epoch 9 - iter 52/138 - loss 0.01086945 - time (sec): 2.95 - samples/sec: 2965.20 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:58:57,835 epoch 9 - iter 65/138 - loss 0.01560931 - time (sec): 3.69 - samples/sec: 2984.23 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:58:58,563 epoch 9 - iter 78/138 - loss 0.01446554 - time (sec): 4.42 - samples/sec: 2998.82 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:58:59,278 epoch 9 - iter 91/138 - loss 0.01548107 - time (sec): 5.13 - samples/sec: 2990.47 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:59:00,018 epoch 9 - iter 104/138 - loss 0.01455742 - time (sec): 5.87 - samples/sec: 2978.72 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:59:00,764 epoch 9 - iter 117/138 - loss 0.01346173 - time (sec): 6.62 - samples/sec: 2974.79 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:59:01,440 epoch 9 - iter 130/138 - loss 0.01265528 - time (sec): 7.30 - samples/sec: 2964.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:59:01,877 ----------------------------------------------------------------------------------------------------
2023-10-13 08:59:01,878 EPOCH 9 done: loss 0.0140 - lr: 0.000006
2023-10-13 08:59:02,509 DEV : loss 0.16519632935523987 - f1-score (micro avg) 0.8851
2023-10-13 08:59:02,514 saving best model
2023-10-13 08:59:02,966 ----------------------------------------------------------------------------------------------------
2023-10-13 08:59:03,747 epoch 10 - iter 13/138 - loss 0.02129498 - time (sec): 0.78 - samples/sec: 2722.32 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:59:04,445 epoch 10 - iter 26/138 - loss 0.02113946 - time (sec): 1.48 - samples/sec: 2948.47 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:59:05,193 epoch 10 - iter 39/138 - loss 0.01666252 - time (sec): 2.23 - samples/sec: 2936.44 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:59:05,934 epoch 10 - iter 52/138 - loss 0.01387327 - time (sec): 2.97 - samples/sec: 2946.51 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:59:06,683 epoch 10 - iter 65/138 - loss 0.01320655 - time (sec): 3.72 - samples/sec: 2953.57 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:59:07,479 epoch 10 - iter 78/138 - loss 0.01282655 - time (sec): 4.51 - samples/sec: 2922.48 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:59:08,195 epoch 10 - iter 91/138 - loss 0.01283300 - time (sec): 5.23 - samples/sec: 2916.12 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:59:08,950 epoch 10 - iter 104/138 - loss 0.01181631 - time (sec): 5.98 - samples/sec: 2926.04 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:59:09,726 epoch 10 - iter 117/138 - loss 0.01058247 - time (sec): 6.76 - samples/sec: 2903.36 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:59:10,433 epoch 10 - iter 130/138 - loss 0.01126915 - time (sec): 7.47 - samples/sec: 2904.91 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:59:10,865 ----------------------------------------------------------------------------------------------------
2023-10-13 08:59:10,865 EPOCH 10 done: loss 0.0114 - lr: 0.000000
2023-10-13 08:59:11,520 DEV : loss 0.16343694925308228 - f1-score (micro avg) 0.8875
2023-10-13 08:59:11,525 saving best model
2023-10-13 08:59:12,404 ----------------------------------------------------------------------------------------------------
2023-10-13 08:59:12,406 Loading model from best epoch ...
2023-10-13 08:59:13,930 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:59:14,693
Results:
- F-score (micro) 0.9108
- F-score (macro) 0.5485
- Accuracy 0.8484
By class:
              precision    recall  f1-score   support

       scope     0.8743    0.9091    0.8914       176
        pers     0.9762    0.9609    0.9685       128
        work     0.9014    0.8649    0.8828        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.9132    0.9084    0.9108       382
   macro avg     0.5504    0.5470    0.5485       382
weighted avg     0.9045    0.9084    0.9062       382
2023-10-13 08:59:14,693 ----------------------------------------------------------------------------------------------------
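
Note on the averages reported above: the gap between the micro F-score (0.9108) and the macro F-score (0.5485) comes from the two classes with support 2 (object, loc), which score 0.0 and count equally in the unweighted mean. The sketch below recomputes the summary rows from the per-class rows of the table. It is not part of the Flair output; the names `per_class`, `macro_f1`, `weighted_f1` and `micro_f1` are illustrative only, and the micro F-score is derived from the rounded micro precision and recall printed in the log rather than from raw counts.

```python
# Per-class (f1-score, support) pairs copied from the "By class" table.
per_class = {
    "scope": (0.8914, 176),
    "pers": (0.9685, 128),
    "work": (0.8828, 74),
    "object": (0.0000, 2),
    "loc": (0.0000, 2),
}

# Macro average: unweighted mean over classes, so the two
# zero-scoring classes with support 2 weigh as much as "scope".
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: mean over classes weighted by support.
total_support = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro precision and recall
# (values taken from the "micro avg" row, already rounded).
micro_p, micro_r = 0.9132, 0.9084
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro    {macro_f1:.4f}")
print(f"weighted {weighted_f1:.4f}")
print(f"micro    {micro_f1:.4f}")
```

With only 4 of 382 test entities in the object and loc classes, the micro and weighted averages are barely affected by them, while the macro average drops by roughly two fifths.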