2023-10-13 10:30:09,192 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,193 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:30:09,193 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,193 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:30:09,193 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,193 Train: 966 sentences
2023-10-13 10:30:09,193 (train_with_dev=False, train_with_test=False)
2023-10-13 10:30:09,193 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,193 Training Params:
2023-10-13 10:30:09,193 - learning_rate: "3e-05"
2023-10-13 10:30:09,193 - mini_batch_size: "4"
2023-10-13 10:30:09,193 - max_epochs: "10"
2023-10-13 10:30:09,193 - shuffle: "True"
2023-10-13 10:30:09,193 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,193 Plugins:
2023-10-13 10:30:09,193 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:30:09,193 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,193 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:30:09,193 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:30:09,193 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,194 Computation:
2023-10-13 10:30:09,194 - compute on device: cuda:0
2023-10-13 10:30:09,194 - embedding storage: none
2023-10-13 10:30:09,194 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,194 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 10:30:09,194 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:09,194 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:11,316 epoch 1 - iter 24/242 - loss 3.31899833 - time (sec): 2.12 - samples/sec: 1099.73 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:30:12,625 epoch 1 - iter 48/242 - loss 2.96790424 - time (sec): 3.43 - samples/sec: 1462.00 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:30:13,935 epoch 1 - iter 72/242 - loss 2.37819050 - time (sec): 4.74 - samples/sec: 1614.82 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:30:15,194 epoch 1 - iter 96/242 - loss 1.96624830 - time (sec): 6.00 - samples/sec: 1677.13 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:30:16,452 epoch 1 - iter 120/242 - loss 1.73643668 - time (sec): 7.26 - samples/sec: 1683.88 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:30:17,743 epoch 1 - iter 144/242 - loss 1.51974639 - time (sec): 8.55 - samples/sec: 1739.01 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:30:19,013 epoch 1 - iter 168/242 - loss 1.36954910 - time (sec): 9.82 - samples/sec: 1761.71 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:30:20,238 epoch 1 - iter 192/242 - loss 1.24200364 - time (sec): 11.04 - samples/sec: 1801.53 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:30:21,443 epoch 1 - iter 216/242 - loss 1.14147406 - time (sec): 12.25 - samples/sec: 1811.60 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:30:22,713 epoch 1 - iter 240/242 - loss 1.05944884 - time (sec): 13.52 - samples/sec: 1816.46 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:30:22,834 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:22,834 EPOCH 1 done: loss 1.0537 - lr: 0.000030
2023-10-13 10:30:23,525 DEV : loss 0.27631089091300964 - f1-score (micro avg) 0.457
2023-10-13 10:30:23,530 saving best model
2023-10-13 10:30:23,924 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:25,134 epoch 2 - iter 24/242 - loss 0.25291019 - time (sec): 1.21 - samples/sec: 2019.76 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:30:26,315 epoch 2 - iter 48/242 - loss 0.28868353 - time (sec): 2.39 - samples/sec: 1996.82 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:30:27,555 epoch 2 - iter 72/242 - loss 0.26832173 - time (sec): 3.63 - samples/sec: 2035.03 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:30:28,799 epoch 2 - iter 96/242 - loss 0.25056380 - time (sec): 4.87 - samples/sec: 2040.48 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:30:30,034 epoch 2 - iter 120/242 - loss 0.24861522 - time (sec): 6.11 - samples/sec: 2019.03 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:30:31,264 epoch 2 - iter 144/242 - loss 0.24350868 - time (sec): 7.34 - samples/sec: 1981.02 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:30:32,502 epoch 2 - iter 168/242 - loss 0.23234542 - time (sec): 8.58 - samples/sec: 1988.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:30:33,737 epoch 2 - iter 192/242 - loss 0.23022383 - time (sec): 9.81 - samples/sec: 1999.91 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:30:34,936 epoch 2 - iter 216/242 - loss 0.21995150 - time (sec): 11.01 - samples/sec: 2043.60 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:30:36,034 epoch 2 - iter 240/242 - loss 0.21735290 - time (sec): 12.11 - samples/sec: 2034.20 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:30:36,117 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:36,117 EPOCH 2 done: loss 0.2168 - lr: 0.000027
2023-10-13 10:30:36,917 DEV : loss 0.14946463704109192 - f1-score (micro avg) 0.7621
2023-10-13 10:30:36,924 saving best model
2023-10-13 10:30:37,446 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:38,618 epoch 3 - iter 24/242 - loss 0.18291948 - time (sec): 1.17 - samples/sec: 2092.17 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:30:39,983 epoch 3 - iter 48/242 - loss 0.15886607 - time (sec): 2.53 - samples/sec: 1968.78 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:30:41,202 epoch 3 - iter 72/242 - loss 0.13699532 - time (sec): 3.75 - samples/sec: 1971.01 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:30:42,261 epoch 3 - iter 96/242 - loss 0.13323718 - time (sec): 4.81 - samples/sec: 2019.19 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:30:43,378 epoch 3 - iter 120/242 - loss 0.12809419 - time (sec): 5.93 - samples/sec: 2056.47 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:30:44,545 epoch 3 - iter 144/242 - loss 0.11962736 - time (sec): 7.10 - samples/sec: 2068.28 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:30:45,597 epoch 3 - iter 168/242 - loss 0.12663136 - time (sec): 8.15 - samples/sec: 2098.46 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:30:46,739 epoch 3 - iter 192/242 - loss 0.12128219 - time (sec): 9.29 - samples/sec: 2118.70 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:30:47,876 epoch 3 - iter 216/242 - loss 0.12074200 - time (sec): 10.43 - samples/sec: 2126.21 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:30:49,007 epoch 3 - iter 240/242 - loss 0.12123157 - time (sec): 11.56 - samples/sec: 2127.78 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:30:49,096 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:49,097 EPOCH 3 done: loss 0.1219 - lr: 0.000023
2023-10-13 10:30:49,922 DEV : loss 0.1250181347131729 - f1-score (micro avg) 0.8202
2023-10-13 10:30:49,927 saving best model
2023-10-13 10:30:50,429 ----------------------------------------------------------------------------------------------------
2023-10-13 10:30:51,583 epoch 4 - iter 24/242 - loss 0.12817993 - time (sec): 1.15 - samples/sec: 2190.49 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:30:52,701 epoch 4 - iter 48/242 - loss 0.09658017 - time (sec): 2.27 - samples/sec: 2274.08 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:30:53,861 epoch 4 - iter 72/242 - loss 0.08645078 - time (sec): 3.43 - samples/sec: 2234.05 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:30:55,032 epoch 4 - iter 96/242 - loss 0.08823514 - time (sec): 4.60 - samples/sec: 2122.16 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:30:56,153 epoch 4 - iter 120/242 - loss 0.08030354 - time (sec): 5.72 - samples/sec: 2190.18 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:30:57,271 epoch 4 - iter 144/242 - loss 0.08186884 - time (sec): 6.84 - samples/sec: 2181.87 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:30:58,487 epoch 4 - iter 168/242 - loss 0.08565366 - time (sec): 8.06 - samples/sec: 2156.28 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:30:59,707 epoch 4 - iter 192/242 - loss 0.08516789 - time (sec): 9.28 - samples/sec: 2127.35 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:31:00,937 epoch 4 - iter 216/242 - loss 0.08260317 - time (sec): 10.51 - samples/sec: 2094.32 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:31:02,125 epoch 4 - iter 240/242 - loss 0.08329194 - time (sec): 11.69 - samples/sec: 2099.31 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:31:02,212 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:02,213 EPOCH 4 done: loss 0.0832 - lr: 0.000020
2023-10-13 10:31:03,064 DEV : loss 0.12788808345794678 - f1-score (micro avg) 0.8173
2023-10-13 10:31:03,069 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:04,301 epoch 5 - iter 24/242 - loss 0.05323062 - time (sec): 1.23 - samples/sec: 2054.39 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:31:05,453 epoch 5 - iter 48/242 - loss 0.06788494 - time (sec): 2.38 - samples/sec: 2019.33 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:31:06,610 epoch 5 - iter 72/242 - loss 0.06078541 - time (sec): 3.54 - samples/sec: 2062.05 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:31:07,755 epoch 5 - iter 96/242 - loss 0.05537332 - time (sec): 4.68 - samples/sec: 2055.60 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:31:08,869 epoch 5 - iter 120/242 - loss 0.05728335 - time (sec): 5.80 - samples/sec: 2130.10 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:31:09,944 epoch 5 - iter 144/242 - loss 0.05842797 - time (sec): 6.87 - samples/sec: 2195.35 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:31:11,013 epoch 5 - iter 168/242 - loss 0.05932681 - time (sec): 7.94 - samples/sec: 2222.93 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:31:12,102 epoch 5 - iter 192/242 - loss 0.05776849 - time (sec): 9.03 - samples/sec: 2211.23 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:31:13,144 epoch 5 - iter 216/242 - loss 0.05734916 - time (sec): 10.07 - samples/sec: 2184.78 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:31:14,209 epoch 5 - iter 240/242 - loss 0.06319448 - time (sec): 11.14 - samples/sec: 2200.20 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:31:14,297 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:14,298 EPOCH 5 done: loss 0.0632 - lr: 0.000017
2023-10-13 10:31:15,123 DEV : loss 0.14355506002902985 - f1-score (micro avg) 0.8088
2023-10-13 10:31:15,128 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:16,350 epoch 6 - iter 24/242 - loss 0.03620606 - time (sec): 1.22 - samples/sec: 2088.71 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:31:17,549 epoch 6 - iter 48/242 - loss 0.05031164 - time (sec): 2.42 - samples/sec: 1929.09 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:31:18,745 epoch 6 - iter 72/242 - loss 0.04763385 - time (sec): 3.62 - samples/sec: 1944.95 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:31:19,911 epoch 6 - iter 96/242 - loss 0.04680506 - time (sec): 4.78 - samples/sec: 1955.54 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:31:21,079 epoch 6 - iter 120/242 - loss 0.04553476 - time (sec): 5.95 - samples/sec: 2025.34 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:31:22,243 epoch 6 - iter 144/242 - loss 0.04492284 - time (sec): 7.11 - samples/sec: 2042.94 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:31:23,419 epoch 6 - iter 168/242 - loss 0.04377389 - time (sec): 8.29 - samples/sec: 2076.37 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:31:24,567 epoch 6 - iter 192/242 - loss 0.04183081 - time (sec): 9.44 - samples/sec: 2113.37 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:31:25,718 epoch 6 - iter 216/242 - loss 0.04363399 - time (sec): 10.59 - samples/sec: 2121.87 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:31:26,813 epoch 6 - iter 240/242 - loss 0.04419028 - time (sec): 11.68 - samples/sec: 2107.39 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:31:26,904 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:26,904 EPOCH 6 done: loss 0.0441 - lr: 0.000013
2023-10-13 10:31:27,724 DEV : loss 0.1649819314479828 - f1-score (micro avg) 0.8183
2023-10-13 10:31:27,729 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:28,813 epoch 7 - iter 24/242 - loss 0.03165528 - time (sec): 1.08 - samples/sec: 2042.55 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:31:29,923 epoch 7 - iter 48/242 - loss 0.02546364 - time (sec): 2.19 - samples/sec: 2105.70 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:31:31,053 epoch 7 - iter 72/242 - loss 0.02380691 - time (sec): 3.32 - samples/sec: 2162.89 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:31:32,166 epoch 7 - iter 96/242 - loss 0.02865025 - time (sec): 4.44 - samples/sec: 2162.77 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:31:33,249 epoch 7 - iter 120/242 - loss 0.03286107 - time (sec): 5.52 - samples/sec: 2184.29 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:31:34,348 epoch 7 - iter 144/242 - loss 0.03808170 - time (sec): 6.62 - samples/sec: 2216.88 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:31:35,438 epoch 7 - iter 168/242 - loss 0.03745498 - time (sec): 7.71 - samples/sec: 2247.71 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:31:36,523 epoch 7 - iter 192/242 - loss 0.03486179 - time (sec): 8.79 - samples/sec: 2252.25 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:31:37,601 epoch 7 - iter 216/242 - loss 0.03330677 - time (sec): 9.87 - samples/sec: 2232.51 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:31:38,735 epoch 7 - iter 240/242 - loss 0.03221765 - time (sec): 11.00 - samples/sec: 2242.25 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:31:38,822 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:38,822 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-13 10:31:39,671 DEV : loss 0.1871682107448578 - f1-score (micro avg) 0.8069
2023-10-13 10:31:39,676 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:40,777 epoch 8 - iter 24/242 - loss 0.01700191 - time (sec): 1.10 - samples/sec: 2214.72 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:31:41,955 epoch 8 - iter 48/242 - loss 0.01712285 - time (sec): 2.28 - samples/sec: 2021.92 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:31:43,087 epoch 8 - iter 72/242 - loss 0.02033325 - time (sec): 3.41 - samples/sec: 2071.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:31:44,180 epoch 8 - iter 96/242 - loss 0.01811933 - time (sec): 4.50 - samples/sec: 2065.26 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:31:45,378 epoch 8 - iter 120/242 - loss 0.01662740 - time (sec): 5.70 - samples/sec: 2134.06 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:31:46,543 epoch 8 - iter 144/242 - loss 0.01520657 - time (sec): 6.86 - samples/sec: 2131.70 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:31:47,677 epoch 8 - iter 168/242 - loss 0.01531022 - time (sec): 8.00 - samples/sec: 2118.00 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:31:48,782 epoch 8 - iter 192/242 - loss 0.01806298 - time (sec): 9.10 - samples/sec: 2148.20 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:31:49,864 epoch 8 - iter 216/242 - loss 0.02135875 - time (sec): 10.19 - samples/sec: 2170.09 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:31:50,948 epoch 8 - iter 240/242 - loss 0.02328127 - time (sec): 11.27 - samples/sec: 2183.11 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:31:51,034 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:51,034 EPOCH 8 done: loss 0.0232 - lr: 0.000007
2023-10-13 10:31:51,824 DEV : loss 0.19017556309700012 - f1-score (micro avg) 0.8074
2023-10-13 10:31:51,829 ----------------------------------------------------------------------------------------------------
2023-10-13 10:31:52,940 epoch 9 - iter 24/242 - loss 0.01171644 - time (sec): 1.11 - samples/sec: 2446.12 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:31:54,081 epoch 9 - iter 48/242 - loss 0.01109408 - time (sec): 2.25 - samples/sec: 2346.81 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:31:55,183 epoch 9 - iter 72/242 - loss 0.01061616 - time (sec): 3.35 - samples/sec: 2216.51 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:31:56,296 epoch 9 - iter 96/242 - loss 0.01554713 - time (sec): 4.47 - samples/sec: 2233.37 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:31:57,371 epoch 9 - iter 120/242 - loss 0.01353043 - time (sec): 5.54 - samples/sec: 2255.15 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:31:58,482 epoch 9 - iter 144/242 - loss 0.01882666 - time (sec): 6.65 - samples/sec: 2293.73 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:31:59,560 epoch 9 - iter 168/242 - loss 0.01755098 - time (sec): 7.73 - samples/sec: 2282.37 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:32:00,720 epoch 9 - iter 192/242 - loss 0.01714456 - time (sec): 8.89 - samples/sec: 2235.67 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:32:01,808 epoch 9 - iter 216/242 - loss 0.01700689 - time (sec): 9.98 - samples/sec: 2256.60 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:32:02,879 epoch 9 - iter 240/242 - loss 0.01626621 - time (sec): 11.05 - samples/sec: 2229.19 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:32:02,965 ----------------------------------------------------------------------------------------------------
2023-10-13 10:32:02,966 EPOCH 9 done: loss 0.0162 - lr: 0.000003
2023-10-13 10:32:03,716 DEV : loss 0.2009798139333725 - f1-score (micro avg) 0.8169
2023-10-13 10:32:03,721 ----------------------------------------------------------------------------------------------------
2023-10-13 10:32:04,784 epoch 10 - iter 24/242 - loss 0.02317398 - time (sec): 1.06 - samples/sec: 2128.45 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:32:05,865 epoch 10 - iter 48/242 - loss 0.02491819 - time (sec): 2.14 - samples/sec: 2189.82 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:32:06,936 epoch 10 - iter 72/242 - loss 0.02514844 - time (sec): 3.21 - samples/sec: 2264.57 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:32:08,134 epoch 10 - iter 96/242 - loss 0.02131346 - time (sec): 4.41 - samples/sec: 2244.67 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:32:09,333 epoch 10 - iter 120/242 - loss 0.01880207 - time (sec): 5.61 - samples/sec: 2212.12 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:32:10,487 epoch 10 - iter 144/242 - loss 0.01830613 - time (sec): 6.77 - samples/sec: 2156.40 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:32:11,608 epoch 10 - iter 168/242 - loss 0.01761614 - time (sec): 7.89 - samples/sec: 2137.04 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:32:12,752 epoch 10 - iter 192/242 - loss 0.01679154 - time (sec): 9.03 - samples/sec: 2117.70 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:32:13,867 epoch 10 - iter 216/242 - loss 0.01529364 - time (sec): 10.14 - samples/sec: 2145.05 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:32:15,026 epoch 10 - iter 240/242 - loss 0.01407741 - time (sec): 11.30 - samples/sec: 2170.83 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:32:15,114 ----------------------------------------------------------------------------------------------------
2023-10-13 10:32:15,115 EPOCH 10 done: loss 0.0140 - lr: 0.000000
2023-10-13 10:32:15,938 DEV : loss 0.20342250168323517 - f1-score (micro avg) 0.8144
2023-10-13 10:32:16,329 ----------------------------------------------------------------------------------------------------
2023-10-13 10:32:16,330 Loading model from best epoch ...
2023-10-13 10:32:18,153 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 10:32:19,052
Results:
- F-score (micro) 0.7666
- F-score (macro) 0.4582
- Accuracy 0.6465

By class:
              precision    recall  f1-score   support

        pers     0.7468    0.8489    0.7946       139
       scope     0.8028    0.8837    0.8413       129
        work     0.6064    0.7125    0.6552        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7335    0.8028    0.7666       360
   macro avg     0.4312    0.4890    0.4582       360
weighted avg     0.7108    0.8028    0.7539       360
2023-10-13 10:32:19,052 ----------------------------------------------------------------------------------------------------