2023-10-13 08:16:43,981 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,982 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:16:43,982 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,982 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:16:43,982 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,982 Train: 1100 sentences
2023-10-13 08:16:43,982 (train_with_dev=False, train_with_test=False)
2023-10-13 08:16:43,982 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,982 Training Params:
2023-10-13 08:16:43,983 - learning_rate: "3e-05"
2023-10-13 08:16:43,983 - mini_batch_size: "8"
2023-10-13 08:16:43,983 - max_epochs: "10"
2023-10-13 08:16:43,983 - shuffle: "True"
2023-10-13 08:16:43,983 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,983 Plugins:
2023-10-13 08:16:43,983 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:16:43,983 ----------------------------------------------------------------------------------------------------
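
The lr column in the iteration lines below follows the LinearScheduler plugin: a linear warmup over the first 10% of all optimizer steps (ceil(1100/8) = 138 iterations per epoch × 10 epochs = 1380 steps, so 138 warmup steps) from 0 up to the peak learning rate 3e-05, then a linear decay back to 0. A minimal sketch in plain Python (no Flair dependency; Flair's actual implementation may differ in off-by-one details, but the values match the logged lr column to the printed precision):

```python
import math

def linear_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay back to 0 (sketch of Flair's LinearScheduler)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps                              # warmup ramp
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay

# 1100 train sentences, mini_batch_size 8 -> ceil(1100/8) = 138 iterations per epoch
total_steps = math.ceil(1100 / 8) * 10  # 1380 optimizer steps over 10 epochs

print(linear_lr(13, total_steps))    # epoch 1, iter 13  -> ~2.83e-06, logged as lr 0.000003
print(linear_lr(130, total_steps))   # epoch 1, iter 130 -> ~2.83e-05, logged as lr 0.000028
print(linear_lr(151, total_steps))   # epoch 2, iter 13  -> ~2.97e-05, logged as lr 0.000030
```

This explains why the momentum column stays at 0.000000 while the lr column rises through epoch 1 and then decays monotonically to 0 by the end of epoch 10.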
2023-10-13 08:16:43,983 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:16:43,983 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:16:43,983 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,983 Computation:
2023-10-13 08:16:43,983 - compute on device: cuda:0
2023-10-13 08:16:43,983 - embedding storage: none
2023-10-13 08:16:43,983 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,983 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 08:16:43,983 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:43,983 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:44,680 epoch 1 - iter 13/138 - loss 3.39844404 - time (sec): 0.70 - samples/sec: 2706.86 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:16:45,399 epoch 1 - iter 26/138 - loss 3.23549503 - time (sec): 1.41 - samples/sec: 2965.31 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:16:46,106 epoch 1 - iter 39/138 - loss 2.90185712 - time (sec): 2.12 - samples/sec: 2979.32 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:16:46,859 epoch 1 - iter 52/138 - loss 2.46747970 - time (sec): 2.87 - samples/sec: 2930.49 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:16:47,547 epoch 1 - iter 65/138 - loss 2.14588051 - time (sec): 3.56 - samples/sec: 2937.87 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:16:48,318 epoch 1 - iter 78/138 - loss 1.88299776 - time (sec): 4.33 - samples/sec: 2947.77 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:16:49,054 epoch 1 - iter 91/138 - loss 1.70363280 - time (sec): 5.07 - samples/sec: 2965.90 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:16:49,758 epoch 1 - iter 104/138 - loss 1.58115194 - time (sec): 5.77 - samples/sec: 2939.87 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:16:50,547 epoch 1 - iter 117/138 - loss 1.45380970 - time (sec): 6.56 - samples/sec: 2934.33 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:16:51,258 epoch 1 - iter 130/138 - loss 1.36485968 - time (sec): 7.27 - samples/sec: 2938.41 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:16:51,694 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:51,694 EPOCH 1 done: loss 1.3062 - lr: 0.000028
2023-10-13 08:16:52,448 DEV : loss 0.35115909576416016 - f1-score (micro avg) 0.4553
2023-10-13 08:16:52,452 saving best model
2023-10-13 08:16:52,770 ----------------------------------------------------------------------------------------------------
2023-10-13 08:16:53,462 epoch 2 - iter 13/138 - loss 0.37155380 - time (sec): 0.69 - samples/sec: 2906.62 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:16:54,167 epoch 2 - iter 26/138 - loss 0.34526703 - time (sec): 1.39 - samples/sec: 2951.48 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:16:54,892 epoch 2 - iter 39/138 - loss 0.32593876 - time (sec): 2.12 - samples/sec: 2839.93 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:16:55,632 epoch 2 - iter 52/138 - loss 0.30436836 - time (sec): 2.86 - samples/sec: 2908.86 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:16:56,367 epoch 2 - iter 65/138 - loss 0.28816251 - time (sec): 3.59 - samples/sec: 2932.48 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:16:57,101 epoch 2 - iter 78/138 - loss 0.27814445 - time (sec): 4.33 - samples/sec: 2927.00 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:16:57,803 epoch 2 - iter 91/138 - loss 0.27008342 - time (sec): 5.03 - samples/sec: 2951.87 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:16:58,539 epoch 2 - iter 104/138 - loss 0.25724546 - time (sec): 5.77 - samples/sec: 2946.49 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:16:59,295 epoch 2 - iter 117/138 - loss 0.24745775 - time (sec): 6.52 - samples/sec: 2956.31 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:17:00,103 epoch 2 - iter 130/138 - loss 0.24316299 - time (sec): 7.33 - samples/sec: 2913.47 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:17:00,583 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:00,583 EPOCH 2 done: loss 0.2428 - lr: 0.000027
2023-10-13 08:17:01,233 DEV : loss 0.15248945355415344 - f1-score (micro avg) 0.7903
2023-10-13 08:17:01,238 saving best model
2023-10-13 08:17:01,652 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:02,385 epoch 3 - iter 13/138 - loss 0.15002974 - time (sec): 0.73 - samples/sec: 2759.03 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:17:03,185 epoch 3 - iter 26/138 - loss 0.12249150 - time (sec): 1.53 - samples/sec: 2802.88 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:17:03,989 epoch 3 - iter 39/138 - loss 0.13316301 - time (sec): 2.33 - samples/sec: 2812.66 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:17:04,713 epoch 3 - iter 52/138 - loss 0.13360280 - time (sec): 3.06 - samples/sec: 2878.75 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:17:05,409 epoch 3 - iter 65/138 - loss 0.13899541 - time (sec): 3.75 - samples/sec: 2902.32 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:17:06,175 epoch 3 - iter 78/138 - loss 0.13057252 - time (sec): 4.52 - samples/sec: 2869.17 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:17:06,916 epoch 3 - iter 91/138 - loss 0.12703094 - time (sec): 5.26 - samples/sec: 2893.38 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:17:07,684 epoch 3 - iter 104/138 - loss 0.12995365 - time (sec): 6.03 - samples/sec: 2899.33 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:17:08,358 epoch 3 - iter 117/138 - loss 0.12516223 - time (sec): 6.70 - samples/sec: 2891.58 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:17:09,044 epoch 3 - iter 130/138 - loss 0.12446066 - time (sec): 7.39 - samples/sec: 2895.78 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:17:09,475 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:09,476 EPOCH 3 done: loss 0.1227 - lr: 0.000024
2023-10-13 08:17:10,208 DEV : loss 0.1409459114074707 - f1-score (micro avg) 0.8139
2023-10-13 08:17:10,213 saving best model
2023-10-13 08:17:10,662 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:11,406 epoch 4 - iter 13/138 - loss 0.07710387 - time (sec): 0.74 - samples/sec: 3083.05 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:17:12,165 epoch 4 - iter 26/138 - loss 0.07396413 - time (sec): 1.50 - samples/sec: 3053.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:17:12,893 epoch 4 - iter 39/138 - loss 0.08239205 - time (sec): 2.23 - samples/sec: 3037.21 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:17:13,626 epoch 4 - iter 52/138 - loss 0.08158653 - time (sec): 2.96 - samples/sec: 2942.70 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:17:14,382 epoch 4 - iter 65/138 - loss 0.08294943 - time (sec): 3.72 - samples/sec: 2928.15 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:17:15,085 epoch 4 - iter 78/138 - loss 0.07872819 - time (sec): 4.42 - samples/sec: 2917.61 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:17:15,848 epoch 4 - iter 91/138 - loss 0.08338656 - time (sec): 5.18 - samples/sec: 2934.28 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:17:16,582 epoch 4 - iter 104/138 - loss 0.07799777 - time (sec): 5.92 - samples/sec: 2901.17 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:17:17,361 epoch 4 - iter 117/138 - loss 0.07627958 - time (sec): 6.70 - samples/sec: 2883.90 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:17:18,071 epoch 4 - iter 130/138 - loss 0.07776798 - time (sec): 7.41 - samples/sec: 2909.91 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:17:18,511 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:18,511 EPOCH 4 done: loss 0.0807 - lr: 0.000020
2023-10-13 08:17:19,182 DEV : loss 0.16273224353790283 - f1-score (micro avg) 0.7991
2023-10-13 08:17:19,187 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:19,855 epoch 5 - iter 13/138 - loss 0.05709504 - time (sec): 0.67 - samples/sec: 2818.71 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:17:20,572 epoch 5 - iter 26/138 - loss 0.05187303 - time (sec): 1.38 - samples/sec: 2866.68 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:17:21,330 epoch 5 - iter 39/138 - loss 0.04727304 - time (sec): 2.14 - samples/sec: 2987.88 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:17:22,062 epoch 5 - iter 52/138 - loss 0.04973235 - time (sec): 2.87 - samples/sec: 3002.96 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:17:22,812 epoch 5 - iter 65/138 - loss 0.04510072 - time (sec): 3.62 - samples/sec: 3022.00 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:17:23,552 epoch 5 - iter 78/138 - loss 0.04683378 - time (sec): 4.36 - samples/sec: 3039.12 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:17:24,312 epoch 5 - iter 91/138 - loss 0.04896954 - time (sec): 5.12 - samples/sec: 2988.11 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:17:25,009 epoch 5 - iter 104/138 - loss 0.05259543 - time (sec): 5.82 - samples/sec: 2973.59 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:17:25,731 epoch 5 - iter 117/138 - loss 0.05890980 - time (sec): 6.54 - samples/sec: 2977.78 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:17:26,421 epoch 5 - iter 130/138 - loss 0.05857912 - time (sec): 7.23 - samples/sec: 2969.13 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:17:26,857 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:26,857 EPOCH 5 done: loss 0.0571 - lr: 0.000017
2023-10-13 08:17:27,593 DEV : loss 0.12265711277723312 - f1-score (micro avg) 0.8592
2023-10-13 08:17:27,598 saving best model
2023-10-13 08:17:28,146 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:28,912 epoch 6 - iter 13/138 - loss 0.03861435 - time (sec): 0.76 - samples/sec: 2512.97 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:17:29,718 epoch 6 - iter 26/138 - loss 0.03965503 - time (sec): 1.57 - samples/sec: 2641.43 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:17:30,490 epoch 6 - iter 39/138 - loss 0.03115827 - time (sec): 2.34 - samples/sec: 2684.97 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:17:31,303 epoch 6 - iter 52/138 - loss 0.04365662 - time (sec): 3.15 - samples/sec: 2699.78 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:17:32,073 epoch 6 - iter 65/138 - loss 0.04301878 - time (sec): 3.92 - samples/sec: 2701.58 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:17:32,865 epoch 6 - iter 78/138 - loss 0.04324298 - time (sec): 4.72 - samples/sec: 2698.35 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:17:33,686 epoch 6 - iter 91/138 - loss 0.04093737 - time (sec): 5.54 - samples/sec: 2724.30 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:17:34,490 epoch 6 - iter 104/138 - loss 0.03947619 - time (sec): 6.34 - samples/sec: 2692.45 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:17:35,323 epoch 6 - iter 117/138 - loss 0.03955605 - time (sec): 7.18 - samples/sec: 2688.97 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:17:36,082 epoch 6 - iter 130/138 - loss 0.04356352 - time (sec): 7.93 - samples/sec: 2704.54 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:17:36,557 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:36,558 EPOCH 6 done: loss 0.0437 - lr: 0.000014
2023-10-13 08:17:37,243 DEV : loss 0.1548129767179489 - f1-score (micro avg) 0.8561
2023-10-13 08:17:37,249 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:38,061 epoch 7 - iter 13/138 - loss 0.01816521 - time (sec): 0.81 - samples/sec: 2894.44 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:17:38,813 epoch 7 - iter 26/138 - loss 0.02673547 - time (sec): 1.56 - samples/sec: 2722.59 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:17:39,534 epoch 7 - iter 39/138 - loss 0.02314484 - time (sec): 2.28 - samples/sec: 2694.98 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:17:40,305 epoch 7 - iter 52/138 - loss 0.02781928 - time (sec): 3.05 - samples/sec: 2741.79 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:17:41,040 epoch 7 - iter 65/138 - loss 0.03771659 - time (sec): 3.79 - samples/sec: 2718.41 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:17:41,866 epoch 7 - iter 78/138 - loss 0.03164946 - time (sec): 4.62 - samples/sec: 2729.40 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:17:42,631 epoch 7 - iter 91/138 - loss 0.02897504 - time (sec): 5.38 - samples/sec: 2717.25 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:17:43,474 epoch 7 - iter 104/138 - loss 0.03063681 - time (sec): 6.22 - samples/sec: 2735.99 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:17:44,228 epoch 7 - iter 117/138 - loss 0.03320605 - time (sec): 6.98 - samples/sec: 2730.87 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:17:44,966 epoch 7 - iter 130/138 - loss 0.03497984 - time (sec): 7.72 - samples/sec: 2756.03 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:17:45,455 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:45,455 EPOCH 7 done: loss 0.0361 - lr: 0.000010
2023-10-13 08:17:46,201 DEV : loss 0.15721119940280914 - f1-score (micro avg) 0.866
2023-10-13 08:17:46,210 saving best model
2023-10-13 08:17:46,739 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:47,554 epoch 8 - iter 13/138 - loss 0.02314467 - time (sec): 0.81 - samples/sec: 2774.37 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:17:48,321 epoch 8 - iter 26/138 - loss 0.02332959 - time (sec): 1.58 - samples/sec: 2814.09 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:17:49,092 epoch 8 - iter 39/138 - loss 0.04235226 - time (sec): 2.35 - samples/sec: 2753.54 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:17:49,846 epoch 8 - iter 52/138 - loss 0.03582989 - time (sec): 3.10 - samples/sec: 2751.19 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:17:50,598 epoch 8 - iter 65/138 - loss 0.03149068 - time (sec): 3.85 - samples/sec: 2778.19 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:17:51,331 epoch 8 - iter 78/138 - loss 0.03150046 - time (sec): 4.59 - samples/sec: 2774.97 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:17:52,065 epoch 8 - iter 91/138 - loss 0.02865518 - time (sec): 5.32 - samples/sec: 2815.91 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:17:52,783 epoch 8 - iter 104/138 - loss 0.02740999 - time (sec): 6.04 - samples/sec: 2821.56 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:17:53,584 epoch 8 - iter 117/138 - loss 0.02749516 - time (sec): 6.84 - samples/sec: 2813.26 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:17:54,344 epoch 8 - iter 130/138 - loss 0.02705509 - time (sec): 7.60 - samples/sec: 2818.50 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:17:54,843 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:54,844 EPOCH 8 done: loss 0.0266 - lr: 0.000007
2023-10-13 08:17:55,537 DEV : loss 0.1643933802843094 - f1-score (micro avg) 0.8709
2023-10-13 08:17:55,544 saving best model
2023-10-13 08:17:56,041 ----------------------------------------------------------------------------------------------------
2023-10-13 08:17:56,830 epoch 9 - iter 13/138 - loss 0.01569001 - time (sec): 0.78 - samples/sec: 2822.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:17:57,580 epoch 9 - iter 26/138 - loss 0.01092007 - time (sec): 1.53 - samples/sec: 2797.66 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:17:58,416 epoch 9 - iter 39/138 - loss 0.02074358 - time (sec): 2.37 - samples/sec: 2850.11 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:17:59,172 epoch 9 - iter 52/138 - loss 0.02807243 - time (sec): 3.13 - samples/sec: 2854.77 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:17:59,954 epoch 9 - iter 65/138 - loss 0.02373621 - time (sec): 3.91 - samples/sec: 2823.97 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:18:00,744 epoch 9 - iter 78/138 - loss 0.02436927 - time (sec): 4.70 - samples/sec: 2860.10 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:18:01,509 epoch 9 - iter 91/138 - loss 0.02370290 - time (sec): 5.46 - samples/sec: 2826.18 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:18:02,322 epoch 9 - iter 104/138 - loss 0.02179671 - time (sec): 6.28 - samples/sec: 2775.46 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:18:03,056 epoch 9 - iter 117/138 - loss 0.02060853 - time (sec): 7.01 - samples/sec: 2769.66 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:18:03,776 epoch 9 - iter 130/138 - loss 0.02192204 - time (sec): 7.73 - samples/sec: 2806.46 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:18:04,226 ----------------------------------------------------------------------------------------------------
2023-10-13 08:18:04,227 EPOCH 9 done: loss 0.0214 - lr: 0.000004
2023-10-13 08:18:04,923 DEV : loss 0.16664454340934753 - f1-score (micro avg) 0.8719
2023-10-13 08:18:04,929 saving best model
2023-10-13 08:18:05,418 ----------------------------------------------------------------------------------------------------
2023-10-13 08:18:06,192 epoch 10 - iter 13/138 - loss 0.01513958 - time (sec): 0.77 - samples/sec: 2843.13 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:18:06,972 epoch 10 - iter 26/138 - loss 0.01424129 - time (sec): 1.55 - samples/sec: 3012.13 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:18:07,701 epoch 10 - iter 39/138 - loss 0.01070624 - time (sec): 2.28 - samples/sec: 3003.21 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:18:08,446 epoch 10 - iter 52/138 - loss 0.01186796 - time (sec): 3.03 - samples/sec: 2942.69 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:18:09,153 epoch 10 - iter 65/138 - loss 0.01262779 - time (sec): 3.73 - samples/sec: 2993.47 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:18:09,858 epoch 10 - iter 78/138 - loss 0.01286369 - time (sec): 4.44 - samples/sec: 2990.04 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:18:10,574 epoch 10 - iter 91/138 - loss 0.01304090 - time (sec): 5.15 - samples/sec: 2951.18 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:18:11,337 epoch 10 - iter 104/138 - loss 0.01335445 - time (sec): 5.92 - samples/sec: 2927.92 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:18:12,074 epoch 10 - iter 117/138 - loss 0.01458012 - time (sec): 6.65 - samples/sec: 2923.61 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:18:12,849 epoch 10 - iter 130/138 - loss 0.01450931 - time (sec): 7.43 - samples/sec: 2911.53 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:18:13,275 ----------------------------------------------------------------------------------------------------
2023-10-13 08:18:13,275 EPOCH 10 done: loss 0.0140 - lr: 0.000000
2023-10-13 08:18:13,993 DEV : loss 0.1673877239227295 - f1-score (micro avg) 0.8743
2023-10-13 08:18:13,998 saving best model
2023-10-13 08:18:14,764 ----------------------------------------------------------------------------------------------------
2023-10-13 08:18:14,765 Loading model from best epoch ...
2023-10-13 08:18:16,181 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
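
The 25-tag dictionary above is simply the BIOES scheme expanded over the six AjMC entity types plus the outside tag: 1 + 4 × 6 = 25, matching the tagger's output layer (out_features=25). A quick sketch of how such a tag set is enumerated (plain Python; the actual dictionary is built by Flair from the corpus labels):

```python
# BIOES tagging: S-ingle, B-egin, E-nd, I-nside prefixes, plus the O-utside tag.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 25
```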
2023-10-13 08:18:17,010
Results:
- F-score (micro) 0.8984
- F-score (macro) 0.6346
- Accuracy 0.8374
By class:
              precision    recall  f1-score   support

       scope     0.8804    0.9205    0.9000       176
        pers     0.9462    0.9609    0.9535       128
        work     0.8429    0.7973    0.8194        74
         loc     0.5000    0.5000    0.5000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.8938    0.9031    0.8984       382
   macro avg     0.6339    0.6357    0.6346       382
weighted avg     0.8886    0.9031    0.8955       382
2023-10-13 08:18:17,010 ----------------------------------------------------------------------------------------------------
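
The summary rows of the final report can be reproduced from the per-class numbers: the macro average is the unweighted mean of the per-class f1-scores, the weighted average weights them by support, and the micro-avg f1 is the harmonic mean of micro precision and recall. A quick check in plain Python:

```python
# (precision, recall, f1, support) per class, copied from the report above
per_class = {
    "scope":  (0.8804, 0.9205, 0.9000, 176),
    "pers":   (0.9462, 0.9609, 0.9535, 128),
    "work":   (0.8429, 0.7973, 0.8194,  74),
    "loc":    (0.5000, 0.5000, 0.5000,   2),
    "object": (0.0000, 0.0000, 0.0000,   2),
}

# macro avg: unweighted mean over classes
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# weighted avg: mean over classes weighted by support
total = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total

# micro f1: harmonic mean of the micro-avg precision and recall rows
p, r = 0.8938, 0.9031
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# -> 0.6346 0.8955 0.8984, matching the report's summary rows
```

Note the macro f1 (0.6346) is pulled down by the two rare classes (loc, object, 2 test mentions each), while the micro f1 (0.8984) reflects the dominant scope/pers/work classes.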