2023-10-13 16:18:19,873 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,874 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 16:18:19,874 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,874 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 16:18:19,874 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,874 Train: 5901 sentences
2023-10-13 16:18:19,874 (train_with_dev=False, train_with_test=False)
2023-10-13 16:18:19,874 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,874 Training Params:
2023-10-13 16:18:19,874 - learning_rate: "3e-05"
2023-10-13 16:18:19,874 - mini_batch_size: "4"
2023-10-13 16:18:19,874 - max_epochs: "10"
2023-10-13 16:18:19,874 - shuffle: "True"
2023-10-13 16:18:19,874 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,874 Plugins:
2023-10-13 16:18:19,874 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 16:18:19,874 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,874 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 16:18:19,874 - metric: "('micro avg', 'f1-score')"
2023-10-13 16:18:19,874 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,875 Computation:
2023-10-13 16:18:19,875 - compute on device: cuda:0
2023-10-13 16:18:19,875 - embedding storage: none
2023-10-13 16:18:19,875 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,875 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 16:18:19,875 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:19,875 ----------------------------------------------------------------------------------------------------
2023-10-13 16:18:26,876 epoch 1 - iter 147/1476 - loss 2.68811671 - time (sec): 7.00 - samples/sec: 2530.06 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:18:33,935 epoch 1 - iter 294/1476 - loss 1.63436810 - time (sec): 14.06 - samples/sec: 2556.09 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:18:40,951 epoch 1 - iter 441/1476 - loss 1.24746196 - time (sec): 21.07 - samples/sec: 2466.58 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:18:48,209 epoch 1 - iter 588/1476 - loss 1.01857941 - time (sec): 28.33 - samples/sec: 2425.58 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:18:55,376 epoch 1 - iter 735/1476 - loss 0.88100047 - time (sec): 35.50 - samples/sec: 2410.76 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:19:02,343 epoch 1 - iter 882/1476 - loss 0.77829880 - time (sec): 42.47 - samples/sec: 2408.70 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:19:09,239 epoch 1 - iter 1029/1476 - loss 0.70303474 - time (sec): 49.36 - samples/sec: 2402.58 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:19:15,979 epoch 1 - iter 1176/1476 - loss 0.64928259 - time (sec): 56.10 - samples/sec: 2376.92 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:19:22,808 epoch 1 - iter 1323/1476 - loss 0.60030866 - time (sec): 62.93 - samples/sec: 2378.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:19:29,670 epoch 1 - iter 1470/1476 - loss 0.56005280 - time (sec): 69.79 - samples/sec: 2376.48 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:19:29,929 ----------------------------------------------------------------------------------------------------
2023-10-13 16:19:29,929 EPOCH 1 done: loss 0.5590 - lr: 0.000030
2023-10-13 16:19:36,108 DEV : loss 0.12960338592529297 - f1-score (micro avg) 0.7114
2023-10-13 16:19:36,137 saving best model
2023-10-13 16:19:36,590 ----------------------------------------------------------------------------------------------------
2023-10-13 16:19:43,259 epoch 2 - iter 147/1476 - loss 0.14371163 - time (sec): 6.67 - samples/sec: 2282.89 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:19:50,126 epoch 2 - iter 294/1476 - loss 0.14687695 - time (sec): 13.53 - samples/sec: 2321.77 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:19:56,916 epoch 2 - iter 441/1476 - loss 0.14324943 - time (sec): 20.33 - samples/sec: 2350.40 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:20:03,828 epoch 2 - iter 588/1476 - loss 0.13800148 - time (sec): 27.24 - samples/sec: 2341.03 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:20:10,771 epoch 2 - iter 735/1476 - loss 0.14220914 - time (sec): 34.18 - samples/sec: 2310.96 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:20:17,816 epoch 2 - iter 882/1476 - loss 0.14051112 - time (sec): 41.22 - samples/sec: 2330.35 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:20:25,068 epoch 2 - iter 1029/1476 - loss 0.13606559 - time (sec): 48.48 - samples/sec: 2361.19 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:20:31,924 epoch 2 - iter 1176/1476 - loss 0.13105034 - time (sec): 55.33 - samples/sec: 2366.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:20:38,969 epoch 2 - iter 1323/1476 - loss 0.13230912 - time (sec): 62.38 - samples/sec: 2373.34 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:20:46,152 epoch 2 - iter 1470/1476 - loss 0.13197270 - time (sec): 69.56 - samples/sec: 2380.76 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:20:46,430 ----------------------------------------------------------------------------------------------------
2023-10-13 16:20:46,431 EPOCH 2 done: loss 0.1321 - lr: 0.000027
2023-10-13 16:20:57,635 DEV : loss 0.14676305651664734 - f1-score (micro avg) 0.7483
2023-10-13 16:20:57,663 saving best model
2023-10-13 16:20:58,251 ----------------------------------------------------------------------------------------------------
2023-10-13 16:21:05,130 epoch 3 - iter 147/1476 - loss 0.07375163 - time (sec): 6.88 - samples/sec: 2256.94 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:21:12,085 epoch 3 - iter 294/1476 - loss 0.08555357 - time (sec): 13.83 - samples/sec: 2339.15 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:21:18,976 epoch 3 - iter 441/1476 - loss 0.08707338 - time (sec): 20.72 - samples/sec: 2346.38 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:21:25,687 epoch 3 - iter 588/1476 - loss 0.08703905 - time (sec): 27.43 - samples/sec: 2337.30 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:21:32,780 epoch 3 - iter 735/1476 - loss 0.08725903 - time (sec): 34.53 - samples/sec: 2362.45 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:21:39,920 epoch 3 - iter 882/1476 - loss 0.08502530 - time (sec): 41.67 - samples/sec: 2401.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:21:46,866 epoch 3 - iter 1029/1476 - loss 0.08313622 - time (sec): 48.61 - samples/sec: 2388.85 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:21:53,843 epoch 3 - iter 1176/1476 - loss 0.08509026 - time (sec): 55.59 - samples/sec: 2410.45 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:22:00,456 epoch 3 - iter 1323/1476 - loss 0.08340603 - time (sec): 62.20 - samples/sec: 2415.55 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:22:07,228 epoch 3 - iter 1470/1476 - loss 0.08513356 - time (sec): 68.97 - samples/sec: 2404.08 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:22:07,489 ----------------------------------------------------------------------------------------------------
2023-10-13 16:22:07,489 EPOCH 3 done: loss 0.0851 - lr: 0.000023
2023-10-13 16:22:18,604 DEV : loss 0.1558631807565689 - f1-score (micro avg) 0.8021
2023-10-13 16:22:18,632 saving best model
2023-10-13 16:22:19,132 ----------------------------------------------------------------------------------------------------
2023-10-13 16:22:25,971 epoch 4 - iter 147/1476 - loss 0.05321490 - time (sec): 6.84 - samples/sec: 2228.72 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:22:32,717 epoch 4 - iter 294/1476 - loss 0.05088317 - time (sec): 13.58 - samples/sec: 2289.73 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:22:39,592 epoch 4 - iter 441/1476 - loss 0.05582560 - time (sec): 20.46 - samples/sec: 2337.16 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:22:46,278 epoch 4 - iter 588/1476 - loss 0.05853689 - time (sec): 27.14 - samples/sec: 2329.23 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:22:53,495 epoch 4 - iter 735/1476 - loss 0.05804754 - time (sec): 34.36 - samples/sec: 2325.70 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:23:00,651 epoch 4 - iter 882/1476 - loss 0.05583338 - time (sec): 41.52 - samples/sec: 2346.35 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:23:07,936 epoch 4 - iter 1029/1476 - loss 0.05481177 - time (sec): 48.80 - samples/sec: 2384.46 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:23:14,776 epoch 4 - iter 1176/1476 - loss 0.05475217 - time (sec): 55.64 - samples/sec: 2389.60 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:23:21,862 epoch 4 - iter 1323/1476 - loss 0.05730543 - time (sec): 62.73 - samples/sec: 2386.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:23:28,584 epoch 4 - iter 1470/1476 - loss 0.05764999 - time (sec): 69.45 - samples/sec: 2386.25 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:23:28,863 ----------------------------------------------------------------------------------------------------
2023-10-13 16:23:28,864 EPOCH 4 done: loss 0.0578 - lr: 0.000020
2023-10-13 16:23:40,076 DEV : loss 0.18168844282627106 - f1-score (micro avg) 0.803
2023-10-13 16:23:40,104 saving best model
2023-10-13 16:23:40,595 ----------------------------------------------------------------------------------------------------
2023-10-13 16:23:47,639 epoch 5 - iter 147/1476 - loss 0.05535559 - time (sec): 7.04 - samples/sec: 2350.97 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:23:54,732 epoch 5 - iter 294/1476 - loss 0.04177522 - time (sec): 14.14 - samples/sec: 2354.73 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:24:01,904 epoch 5 - iter 441/1476 - loss 0.04295503 - time (sec): 21.31 - samples/sec: 2362.66 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:24:09,007 epoch 5 - iter 588/1476 - loss 0.03870174 - time (sec): 28.41 - samples/sec: 2337.31 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:24:15,872 epoch 5 - iter 735/1476 - loss 0.03947753 - time (sec): 35.28 - samples/sec: 2340.03 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:24:22,622 epoch 5 - iter 882/1476 - loss 0.04215224 - time (sec): 42.03 - samples/sec: 2333.54 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:24:29,595 epoch 5 - iter 1029/1476 - loss 0.04176081 - time (sec): 49.00 - samples/sec: 2334.66 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:24:36,904 epoch 5 - iter 1176/1476 - loss 0.04176508 - time (sec): 56.31 - samples/sec: 2359.18 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:24:44,014 epoch 5 - iter 1323/1476 - loss 0.04105383 - time (sec): 63.42 - samples/sec: 2376.53 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:24:50,709 epoch 5 - iter 1470/1476 - loss 0.04223612 - time (sec): 70.11 - samples/sec: 2365.53 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:24:50,971 ----------------------------------------------------------------------------------------------------
2023-10-13 16:24:50,972 EPOCH 5 done: loss 0.0421 - lr: 0.000017
2023-10-13 16:25:02,159 DEV : loss 0.17941001057624817 - f1-score (micro avg) 0.8126
2023-10-13 16:25:02,189 saving best model
2023-10-13 16:25:02,695 ----------------------------------------------------------------------------------------------------
2023-10-13 16:25:09,753 epoch 6 - iter 147/1476 - loss 0.02653020 - time (sec): 7.06 - samples/sec: 2115.54 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:25:16,999 epoch 6 - iter 294/1476 - loss 0.02862818 - time (sec): 14.30 - samples/sec: 2386.80 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:25:23,876 epoch 6 - iter 441/1476 - loss 0.03132242 - time (sec): 21.18 - samples/sec: 2410.71 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:25:30,766 epoch 6 - iter 588/1476 - loss 0.03209656 - time (sec): 28.07 - samples/sec: 2401.90 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:25:37,800 epoch 6 - iter 735/1476 - loss 0.03269220 - time (sec): 35.10 - samples/sec: 2419.37 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:25:44,929 epoch 6 - iter 882/1476 - loss 0.03367211 - time (sec): 42.23 - samples/sec: 2419.30 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:25:51,539 epoch 6 - iter 1029/1476 - loss 0.03249115 - time (sec): 48.84 - samples/sec: 2403.54 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:25:58,447 epoch 6 - iter 1176/1476 - loss 0.03140458 - time (sec): 55.75 - samples/sec: 2402.72 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:26:05,272 epoch 6 - iter 1323/1476 - loss 0.03117712 - time (sec): 62.58 - samples/sec: 2393.92 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:26:12,144 epoch 6 - iter 1470/1476 - loss 0.03087811 - time (sec): 69.45 - samples/sec: 2389.87 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:26:12,398 ----------------------------------------------------------------------------------------------------
2023-10-13 16:26:12,399 EPOCH 6 done: loss 0.0308 - lr: 0.000013
2023-10-13 16:26:23,570 DEV : loss 0.1737550050020218 - f1-score (micro avg) 0.8073
2023-10-13 16:26:23,599 ----------------------------------------------------------------------------------------------------
2023-10-13 16:26:30,817 epoch 7 - iter 147/1476 - loss 0.01587476 - time (sec): 7.22 - samples/sec: 2370.42 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:26:37,420 epoch 7 - iter 294/1476 - loss 0.01435257 - time (sec): 13.82 - samples/sec: 2325.55 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:26:44,799 epoch 7 - iter 441/1476 - loss 0.02157759 - time (sec): 21.20 - samples/sec: 2384.74 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:26:52,071 epoch 7 - iter 588/1476 - loss 0.02281126 - time (sec): 28.47 - samples/sec: 2434.79 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:26:58,793 epoch 7 - iter 735/1476 - loss 0.02402209 - time (sec): 35.19 - samples/sec: 2398.19 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:27:05,497 epoch 7 - iter 882/1476 - loss 0.02296830 - time (sec): 41.90 - samples/sec: 2381.17 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:27:12,184 epoch 7 - iter 1029/1476 - loss 0.02320940 - time (sec): 48.58 - samples/sec: 2384.78 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:27:18,947 epoch 7 - iter 1176/1476 - loss 0.02303719 - time (sec): 55.35 - samples/sec: 2376.63 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:27:25,778 epoch 7 - iter 1323/1476 - loss 0.02138011 - time (sec): 62.18 - samples/sec: 2373.80 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:27:32,996 epoch 7 - iter 1470/1476 - loss 0.02155465 - time (sec): 69.40 - samples/sec: 2390.58 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:27:33,255 ----------------------------------------------------------------------------------------------------
2023-10-13 16:27:33,255 EPOCH 7 done: loss 0.0215 - lr: 0.000010
2023-10-13 16:27:44,418 DEV : loss 0.20156921446323395 - f1-score (micro avg) 0.8156
2023-10-13 16:27:44,448 saving best model
2023-10-13 16:27:45,057 ----------------------------------------------------------------------------------------------------
2023-10-13 16:27:52,066 epoch 8 - iter 147/1476 - loss 0.01727771 - time (sec): 7.00 - samples/sec: 2398.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:27:59,060 epoch 8 - iter 294/1476 - loss 0.01903390 - time (sec): 14.00 - samples/sec: 2412.30 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:28:06,483 epoch 8 - iter 441/1476 - loss 0.01852869 - time (sec): 21.42 - samples/sec: 2482.55 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:28:13,207 epoch 8 - iter 588/1476 - loss 0.01894146 - time (sec): 28.14 - samples/sec: 2415.93 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:28:20,113 epoch 8 - iter 735/1476 - loss 0.01775006 - time (sec): 35.05 - samples/sec: 2390.37 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:28:26,932 epoch 8 - iter 882/1476 - loss 0.01706730 - time (sec): 41.87 - samples/sec: 2369.20 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:28:33,847 epoch 8 - iter 1029/1476 - loss 0.01593317 - time (sec): 48.78 - samples/sec: 2361.06 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:28:40,797 epoch 8 - iter 1176/1476 - loss 0.01516748 - time (sec): 55.73 - samples/sec: 2338.33 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:28:47,956 epoch 8 - iter 1323/1476 - loss 0.01460520 - time (sec): 62.89 - samples/sec: 2354.33 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:28:54,942 epoch 8 - iter 1470/1476 - loss 0.01394633 - time (sec): 69.88 - samples/sec: 2372.44 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:28:55,205 ----------------------------------------------------------------------------------------------------
2023-10-13 16:28:55,205 EPOCH 8 done: loss 0.0139 - lr: 0.000007
2023-10-13 16:29:06,318 DEV : loss 0.21695148944854736 - f1-score (micro avg) 0.8318
2023-10-13 16:29:06,348 saving best model
2023-10-13 16:29:06,888 ----------------------------------------------------------------------------------------------------
2023-10-13 16:29:13,839 epoch 9 - iter 147/1476 - loss 0.01149428 - time (sec): 6.95 - samples/sec: 2241.86 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:29:20,892 epoch 9 - iter 294/1476 - loss 0.01051718 - time (sec): 14.00 - samples/sec: 2354.76 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:29:27,592 epoch 9 - iter 441/1476 - loss 0.01109949 - time (sec): 20.70 - samples/sec: 2346.93 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:29:34,486 epoch 9 - iter 588/1476 - loss 0.01034791 - time (sec): 27.59 - samples/sec: 2365.06 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:29:41,777 epoch 9 - iter 735/1476 - loss 0.01038753 - time (sec): 34.88 - samples/sec: 2389.31 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:29:48,663 epoch 9 - iter 882/1476 - loss 0.00977770 - time (sec): 41.77 - samples/sec: 2369.64 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:29:55,755 epoch 9 - iter 1029/1476 - loss 0.00886349 - time (sec): 48.86 - samples/sec: 2375.51 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:30:02,562 epoch 9 - iter 1176/1476 - loss 0.00868716 - time (sec): 55.67 - samples/sec: 2364.56 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:30:09,255 epoch 9 - iter 1323/1476 - loss 0.00810351 - time (sec): 62.36 - samples/sec: 2374.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:30:16,370 epoch 9 - iter 1470/1476 - loss 0.01065603 - time (sec): 69.48 - samples/sec: 2383.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:30:16,664 ----------------------------------------------------------------------------------------------------
2023-10-13 16:30:16,664 EPOCH 9 done: loss 0.0110 - lr: 0.000003
2023-10-13 16:30:27,816 DEV : loss 0.20820745825767517 - f1-score (micro avg) 0.8377
2023-10-13 16:30:27,845 saving best model
2023-10-13 16:30:28,406 ----------------------------------------------------------------------------------------------------
2023-10-13 16:30:35,677 epoch 10 - iter 147/1476 - loss 0.00895335 - time (sec): 7.27 - samples/sec: 2440.59 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:30:42,470 epoch 10 - iter 294/1476 - loss 0.00653471 - time (sec): 14.06 - samples/sec: 2395.44 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:30:49,027 epoch 10 - iter 441/1476 - loss 0.00745494 - time (sec): 20.62 - samples/sec: 2388.45 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:30:56,197 epoch 10 - iter 588/1476 - loss 0.00712878 - time (sec): 27.79 - samples/sec: 2366.71 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:31:03,271 epoch 10 - iter 735/1476 - loss 0.00682681 - time (sec): 34.86 - samples/sec: 2347.22 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:31:10,608 epoch 10 - iter 882/1476 - loss 0.00780092 - time (sec): 42.20 - samples/sec: 2386.20 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:31:17,347 epoch 10 - iter 1029/1476 - loss 0.00743818 - time (sec): 48.94 - samples/sec: 2359.12 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:31:24,541 epoch 10 - iter 1176/1476 - loss 0.00763687 - time (sec): 56.13 - samples/sec: 2360.28 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:31:31,528 epoch 10 - iter 1323/1476 - loss 0.00786299 - time (sec): 63.12 - samples/sec: 2367.54 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:31:38,440 epoch 10 - iter 1470/1476 - loss 0.00744089 - time (sec): 70.03 - samples/sec: 2368.59 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:31:38,703 ----------------------------------------------------------------------------------------------------
2023-10-13 16:31:38,703 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-13 16:31:50,374 DEV : loss 0.2123088389635086 - f1-score (micro avg) 0.8335
2023-10-13 16:31:50,806 ----------------------------------------------------------------------------------------------------
2023-10-13 16:31:50,807 Loading model from best epoch ...
2023-10-13 16:31:52,210 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 16:31:58,089
Results:
- F-score (micro) 0.7975
- F-score (macro) 0.703
- Accuracy 0.6891
By class:
              precision    recall  f1-score   support

         loc     0.8449    0.8823    0.8632       858
        pers     0.7500    0.8045    0.7763       537
         org     0.5968    0.5606    0.5781       132
        prod     0.6935    0.7049    0.6992        61
        time     0.5556    0.6481    0.5983        54

   micro avg     0.7792    0.8167    0.7975      1642
   macro avg     0.6881    0.7201    0.7030      1642
weighted avg     0.7788    0.8167    0.7970      1642
2023-10-13 16:31:58,089 ----------------------------------------------------------------------------------------------------
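
The averaged rows of the "By class" table can be cross-checked against the per-class rows. The sketch below is not part of the training run; it assumes the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and micro recall). The variable names are illustrative, and the inputs are the four-decimal values from the log, so results agree with the logged averages only up to rounding.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
rows = {
    "loc":  (0.8449, 0.8823, 0.8632, 858),
    "pers": (0.7500, 0.8045, 0.7763, 537),
    "org":  (0.5968, 0.5606, 0.5781, 132),
    "prod": (0.6935, 0.7049, 0.6992, 61),
    "time": (0.5556, 0.6481, 0.5983, 54),
}


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Total number of gold entities across all classes.
total_support = sum(support for _, _, _, support in rows.values())

# Macro average: every class counts equally, regardless of its support.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

# Micro F1, recomputed from the logged micro-averaged precision and recall.
micro_f1 = harmonic_f1(0.7792, 0.8167)

print(f"support={total_support}")
print(f"macro f1={macro_f1:.4f}  weighted f1={weighted_f1:.4f}  micro f1={micro_f1:.4f}")
```

Because the per-class inputs are already rounded, the last digit of a recomputed average may differ from the logged value.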