2023-10-11 23:49:43,888 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,890 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 23:49:43,890 ----------------------------------------------------------------------------------------------------
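Note on the module dump above: the tagger feeds byte-level ByT5 encoder states (hidden size 1472) through locked dropout and a single Linear(1472, 17) head with plain cross-entropy, so there is no RNN and no CRF. The log prints a custom ByT5Embeddings class; the sketch below is only a rough stand-in built from Flair's generic TransformerWordEmbeddings, with the checkpoint name, layer choice and subtoken pooling inferred from the base path logged further down. All of these choices are assumptions, not the author's actual training script.

# Rough stand-in (assumptions, not the original code): rebuild a tagger whose
# shape matches the module dump above, using Flair's generic
# TransformerWordEmbeddings instead of the custom ByT5Embeddings class.
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Checkpoint, layer selection and pooling are read off the base path in this log
# ("...-poolingfirst-layers-1-crfFalse-..."); treat them as assumptions.
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# 17-tag BIOES dictionary, exactly as printed near the end of this log.
tag_dictionary = Dictionary(add_unk=False)
for tag in [
    "O",
    "S-PER", "B-PER", "E-PER", "I-PER",
    "S-LOC", "B-LOC", "E-LOC", "I-LOC",
    "S-ORG", "B-ORG", "E-ORG", "I-ORG",
    "S-HumanProd", "B-HumanProd", "E-HumanProd", "I-HumanProd",
]:
    tag_dictionary.add_item(tag)

# No RNN, no CRF: the 1472-dim embeddings pass through LockedDropout(0.5) and a
# single Linear(1472 -> 17) head trained with cross-entropy, as printed above.
tagger = SequenceTagger(
    hidden_size=256,                # only used when an RNN is present
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)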
2023-10-11 23:49:43,890 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-11 23:49:43,890 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,891 Train: 7142 sentences
2023-10-11 23:49:43,891 (train_with_dev=False, train_with_test=False)
2023-10-11 23:49:43,891 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,891 Training Params:
2023-10-11 23:49:43,891 - learning_rate: "0.00015"
2023-10-11 23:49:43,891 - mini_batch_size: "4"
2023-10-11 23:49:43,891 - max_epochs: "10"
2023-10-11 23:49:43,891 - shuffle: "True"
2023-10-11 23:49:43,891 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,891 Plugins:
2023-10-11 23:49:43,891 - TensorboardLogger
2023-10-11 23:49:43,891 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 23:49:43,891 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,891 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 23:49:43,891 - metric: "('micro avg', 'f1-score')"
2023-10-11 23:49:43,892 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,892 Computation:
2023-10-11 23:49:43,892 - compute on device: cuda:0
2023-10-11 23:49:43,892 - embedding storage: none
2023-10-11 23:49:43,892 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,892 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-11 23:49:43,892 ----------------------------------------------------------------------------------------------------
2023-10-11 23:49:43,892 ----------------------------------------------------------------------------------------------------
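A minimal sketch of how a comparable fine-tuning run could be launched with Flair, using the hyperparameters from the "Training Params" block above (learning rate 0.00015, mini-batch size 4, 10 epochs) and the NEWSEYE French split of HIPE-2022 indicated by the dataset path. The NER_HIPE_2022 constructor arguments and the reliance on fine_tune()'s default linear warmup are assumptions, not the author's exact setup.

# Minimal sketch of a comparable fine-tuning run (assumptions, not the original
# script); hyperparameters are copied from the "Training Params" block above.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dataset, language and version are inferred from the cache path in this log
# (.../ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator); the keyword names used
# here are assumptions about the NER_HIPE_2022 constructor.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr", version="v2.1",
                       add_document_separator=True)
tag_dictionary = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1", subtoken_pooling="first", fine_tune=True,
)
tagger = SequenceTagger(
    hidden_size=256, embeddings=embeddings, tag_dictionary=tag_dictionary,
    tag_type="ner", use_crf=False, use_rnn=False, reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00015,  # logged learning_rate
    mini_batch_size=4,      # logged mini_batch_size
    max_epochs=10,          # logged max_epochs
    # The LinearScheduler plugin with warmup_fraction 0.1 reported above matches
    # Flair's default warmup behaviour for fine_tune(), so no extra argument is passed.
)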
2023-10-11 23:49:43,892 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 23:50:42,826 epoch 1 - iter 178/1786 - loss 2.80679544 - time (sec): 58.93 - samples/sec: 459.62 - lr: 0.000015 - momentum: 0.000000
2023-10-11 23:51:38,946 epoch 1 - iter 356/1786 - loss 2.65034125 - time (sec): 115.05 - samples/sec: 459.46 - lr: 0.000030 - momentum: 0.000000
2023-10-11 23:52:36,349 epoch 1 - iter 534/1786 - loss 2.36581802 - time (sec): 172.45 - samples/sec: 462.09 - lr: 0.000045 - momentum: 0.000000
2023-10-11 23:53:31,134 epoch 1 - iter 712/1786 - loss 2.08798123 - time (sec): 227.24 - samples/sec: 461.30 - lr: 0.000060 - momentum: 0.000000
2023-10-11 23:54:26,815 epoch 1 - iter 890/1786 - loss 1.83005658 - time (sec): 282.92 - samples/sec: 457.29 - lr: 0.000075 - momentum: 0.000000
2023-10-11 23:55:20,922 epoch 1 - iter 1068/1786 - loss 1.63577264 - time (sec): 337.03 - samples/sec: 454.06 - lr: 0.000090 - momentum: 0.000000
2023-10-11 23:56:15,319 epoch 1 - iter 1246/1786 - loss 1.47180576 - time (sec): 391.43 - samples/sec: 452.29 - lr: 0.000105 - momentum: 0.000000
2023-10-11 23:57:09,254 epoch 1 - iter 1424/1786 - loss 1.34808408 - time (sec): 445.36 - samples/sec: 447.88 - lr: 0.000120 - momentum: 0.000000
2023-10-11 23:58:01,774 epoch 1 - iter 1602/1786 - loss 1.23634249 - time (sec): 497.88 - samples/sec: 449.14 - lr: 0.000134 - momentum: 0.000000
2023-10-11 23:58:54,318 epoch 1 - iter 1780/1786 - loss 1.14122061 - time (sec): 550.42 - samples/sec: 450.73 - lr: 0.000149 - momentum: 0.000000
2023-10-11 23:58:55,874 ----------------------------------------------------------------------------------------------------
2023-10-11 23:58:55,875 EPOCH 1 done: loss 1.1386 - lr: 0.000149
2023-10-11 23:59:15,310 DEV : loss 0.18069760501384735 - f1-score (micro avg) 0.5999
2023-10-11 23:59:15,342 saving best model
2023-10-11 23:59:16,229 ----------------------------------------------------------------------------------------------------
2023-10-12 00:00:10,169 epoch 2 - iter 178/1786 - loss 0.20711620 - time (sec): 53.94 - samples/sec: 462.66 - lr: 0.000148 - momentum: 0.000000
2023-10-12 00:01:04,653 epoch 2 - iter 356/1786 - loss 0.19247005 - time (sec): 108.42 - samples/sec: 464.22 - lr: 0.000147 - momentum: 0.000000
2023-10-12 00:02:00,536 epoch 2 - iter 534/1786 - loss 0.17794878 - time (sec): 164.30 - samples/sec: 458.93 - lr: 0.000145 - momentum: 0.000000
2023-10-12 00:02:55,523 epoch 2 - iter 712/1786 - loss 0.16748993 - time (sec): 219.29 - samples/sec: 455.31 - lr: 0.000143 - momentum: 0.000000
2023-10-12 00:03:52,876 epoch 2 - iter 890/1786 - loss 0.15565609 - time (sec): 276.64 - samples/sec: 456.14 - lr: 0.000142 - momentum: 0.000000
2023-10-12 00:04:47,012 epoch 2 - iter 1068/1786 - loss 0.15126686 - time (sec): 330.78 - samples/sec: 451.88 - lr: 0.000140 - momentum: 0.000000
2023-10-12 00:05:41,211 epoch 2 - iter 1246/1786 - loss 0.14667868 - time (sec): 384.98 - samples/sec: 450.19 - lr: 0.000138 - momentum: 0.000000
2023-10-12 00:06:37,571 epoch 2 - iter 1424/1786 - loss 0.14213535 - time (sec): 441.34 - samples/sec: 450.42 - lr: 0.000137 - momentum: 0.000000
2023-10-12 00:07:31,406 epoch 2 - iter 1602/1786 - loss 0.13922305 - time (sec): 495.17 - samples/sec: 450.14 - lr: 0.000135 - momentum: 0.000000
2023-10-12 00:08:24,847 epoch 2 - iter 1780/1786 - loss 0.13416379 - time (sec): 548.62 - samples/sec: 451.32 - lr: 0.000133 - momentum: 0.000000
2023-10-12 00:08:26,740 ----------------------------------------------------------------------------------------------------
2023-10-12 00:08:26,741 EPOCH 2 done: loss 0.1342 - lr: 0.000133
2023-10-12 00:08:48,065 DEV : loss 0.11438284069299698 - f1-score (micro avg) 0.7637
2023-10-12 00:08:48,096 saving best model
2023-10-12 00:08:51,147 ----------------------------------------------------------------------------------------------------
2023-10-12 00:09:44,510 epoch 3 - iter 178/1786 - loss 0.06683015 - time (sec): 53.36 - samples/sec: 459.96 - lr: 0.000132 - momentum: 0.000000
2023-10-12 00:10:39,022 epoch 3 - iter 356/1786 - loss 0.06750141 - time (sec): 107.87 - samples/sec: 464.38 - lr: 0.000130 - momentum: 0.000000
2023-10-12 00:11:33,861 epoch 3 - iter 534/1786 - loss 0.07127963 - time (sec): 162.71 - samples/sec: 453.13 - lr: 0.000128 - momentum: 0.000000
2023-10-12 00:12:28,893 epoch 3 - iter 712/1786 - loss 0.07159232 - time (sec): 217.74 - samples/sec: 449.53 - lr: 0.000127 - momentum: 0.000000
2023-10-12 00:13:25,201 epoch 3 - iter 890/1786 - loss 0.06993220 - time (sec): 274.05 - samples/sec: 451.16 - lr: 0.000125 - momentum: 0.000000
2023-10-12 00:14:19,775 epoch 3 - iter 1068/1786 - loss 0.07164423 - time (sec): 328.62 - samples/sec: 454.34 - lr: 0.000123 - momentum: 0.000000
2023-10-12 00:15:14,214 epoch 3 - iter 1246/1786 - loss 0.07103782 - time (sec): 383.06 - samples/sec: 454.45 - lr: 0.000122 - momentum: 0.000000
2023-10-12 00:16:08,217 epoch 3 - iter 1424/1786 - loss 0.07200320 - time (sec): 437.07 - samples/sec: 451.39 - lr: 0.000120 - momentum: 0.000000
2023-10-12 00:17:03,415 epoch 3 - iter 1602/1786 - loss 0.07374063 - time (sec): 492.26 - samples/sec: 450.32 - lr: 0.000118 - momentum: 0.000000
2023-10-12 00:17:59,334 epoch 3 - iter 1780/1786 - loss 0.07209817 - time (sec): 548.18 - samples/sec: 452.30 - lr: 0.000117 - momentum: 0.000000
2023-10-12 00:18:00,989 ----------------------------------------------------------------------------------------------------
2023-10-12 00:18:00,990 EPOCH 3 done: loss 0.0724 - lr: 0.000117
2023-10-12 00:18:24,115 DEV : loss 0.1281142383813858 - f1-score (micro avg) 0.7826
2023-10-12 00:18:24,147 saving best model
2023-10-12 00:18:40,438 ----------------------------------------------------------------------------------------------------
2023-10-12 00:19:35,647 epoch 4 - iter 178/1786 - loss 0.05657524 - time (sec): 55.21 - samples/sec: 485.86 - lr: 0.000115 - momentum: 0.000000
2023-10-12 00:20:30,635 epoch 4 - iter 356/1786 - loss 0.05425663 - time (sec): 110.19 - samples/sec: 461.11 - lr: 0.000113 - momentum: 0.000000
2023-10-12 00:21:25,821 epoch 4 - iter 534/1786 - loss 0.05035837 - time (sec): 165.38 - samples/sec: 458.97 - lr: 0.000112 - momentum: 0.000000
2023-10-12 00:22:20,078 epoch 4 - iter 712/1786 - loss 0.05157633 - time (sec): 219.64 - samples/sec: 457.43 - lr: 0.000110 - momentum: 0.000000
2023-10-12 00:23:14,198 epoch 4 - iter 890/1786 - loss 0.05225442 - time (sec): 273.76 - samples/sec: 453.18 - lr: 0.000108 - momentum: 0.000000
2023-10-12 00:24:09,428 epoch 4 - iter 1068/1786 - loss 0.05116186 - time (sec): 328.99 - samples/sec: 453.97 - lr: 0.000107 - momentum: 0.000000
2023-10-12 00:25:03,579 epoch 4 - iter 1246/1786 - loss 0.05109624 - time (sec): 383.14 - samples/sec: 451.90 - lr: 0.000105 - momentum: 0.000000
2023-10-12 00:25:57,486 epoch 4 - iter 1424/1786 - loss 0.05046609 - time (sec): 437.04 - samples/sec: 451.18 - lr: 0.000103 - momentum: 0.000000
2023-10-12 00:26:54,137 epoch 4 - iter 1602/1786 - loss 0.05099487 - time (sec): 493.70 - samples/sec: 453.53 - lr: 0.000102 - momentum: 0.000000
2023-10-12 00:27:48,105 epoch 4 - iter 1780/1786 - loss 0.05104060 - time (sec): 547.66 - samples/sec: 453.01 - lr: 0.000100 - momentum: 0.000000
2023-10-12 00:27:49,733 ----------------------------------------------------------------------------------------------------
2023-10-12 00:27:49,733 EPOCH 4 done: loss 0.0511 - lr: 0.000100
2023-10-12 00:28:11,834 DEV : loss 0.14586448669433594 - f1-score (micro avg) 0.8022
2023-10-12 00:28:11,868 saving best model
2023-10-12 00:28:22,595 ----------------------------------------------------------------------------------------------------
2023-10-12 00:29:16,940 epoch 5 - iter 178/1786 - loss 0.03100195 - time (sec): 54.34 - samples/sec: 451.46 - lr: 0.000098 - momentum: 0.000000
2023-10-12 00:30:10,437 epoch 5 - iter 356/1786 - loss 0.03243173 - time (sec): 107.84 - samples/sec: 454.11 - lr: 0.000097 - momentum: 0.000000
2023-10-12 00:31:04,353 epoch 5 - iter 534/1786 - loss 0.03471857 - time (sec): 161.75 - samples/sec: 459.07 - lr: 0.000095 - momentum: 0.000000
2023-10-12 00:31:57,749 epoch 5 - iter 712/1786 - loss 0.03271395 - time (sec): 215.15 - samples/sec: 454.35 - lr: 0.000093 - momentum: 0.000000
2023-10-12 00:32:54,440 epoch 5 - iter 890/1786 - loss 0.03439854 - time (sec): 271.84 - samples/sec: 447.74 - lr: 0.000092 - momentum: 0.000000
2023-10-12 00:33:50,980 epoch 5 - iter 1068/1786 - loss 0.03404286 - time (sec): 328.38 - samples/sec: 445.72 - lr: 0.000090 - momentum: 0.000000
2023-10-12 00:34:49,518 epoch 5 - iter 1246/1786 - loss 0.03481655 - time (sec): 386.92 - samples/sec: 448.76 - lr: 0.000088 - momentum: 0.000000
2023-10-12 00:35:44,748 epoch 5 - iter 1424/1786 - loss 0.03575715 - time (sec): 442.15 - samples/sec: 448.84 - lr: 0.000087 - momentum: 0.000000
2023-10-12 00:36:41,662 epoch 5 - iter 1602/1786 - loss 0.03690779 - time (sec): 499.06 - samples/sec: 447.33 - lr: 0.000085 - momentum: 0.000000
2023-10-12 00:37:37,592 epoch 5 - iter 1780/1786 - loss 0.03763514 - time (sec): 554.99 - samples/sec: 447.04 - lr: 0.000083 - momentum: 0.000000
2023-10-12 00:37:39,239 ----------------------------------------------------------------------------------------------------
2023-10-12 00:37:39,239 EPOCH 5 done: loss 0.0377 - lr: 0.000083
2023-10-12 00:38:02,466 DEV : loss 0.16297270357608795 - f1-score (micro avg) 0.8016
2023-10-12 00:38:02,499 ----------------------------------------------------------------------------------------------------
2023-10-12 00:38:57,768 epoch 6 - iter 178/1786 - loss 0.02668569 - time (sec): 55.27 - samples/sec: 464.94 - lr: 0.000082 - momentum: 0.000000
2023-10-12 00:39:51,559 epoch 6 - iter 356/1786 - loss 0.02601327 - time (sec): 109.06 - samples/sec: 456.49 - lr: 0.000080 - momentum: 0.000000
2023-10-12 00:40:48,681 epoch 6 - iter 534/1786 - loss 0.02650022 - time (sec): 166.18 - samples/sec: 464.21 - lr: 0.000078 - momentum: 0.000000
2023-10-12 00:41:43,107 epoch 6 - iter 712/1786 - loss 0.02802396 - time (sec): 220.61 - samples/sec: 459.29 - lr: 0.000077 - momentum: 0.000000
2023-10-12 00:42:38,563 epoch 6 - iter 890/1786 - loss 0.02839531 - time (sec): 276.06 - samples/sec: 461.31 - lr: 0.000075 - momentum: 0.000000
2023-10-12 00:43:36,355 epoch 6 - iter 1068/1786 - loss 0.02844378 - time (sec): 333.85 - samples/sec: 454.37 - lr: 0.000073 - momentum: 0.000000
2023-10-12 00:44:32,658 epoch 6 - iter 1246/1786 - loss 0.02807222 - time (sec): 390.16 - samples/sec: 450.56 - lr: 0.000072 - momentum: 0.000000
2023-10-12 00:45:29,116 epoch 6 - iter 1424/1786 - loss 0.02797952 - time (sec): 446.62 - samples/sec: 449.93 - lr: 0.000070 - momentum: 0.000000
2023-10-12 00:46:23,766 epoch 6 - iter 1602/1786 - loss 0.02796450 - time (sec): 501.27 - samples/sec: 447.34 - lr: 0.000068 - momentum: 0.000000
2023-10-12 00:47:17,742 epoch 6 - iter 1780/1786 - loss 0.02870438 - time (sec): 555.24 - samples/sec: 446.07 - lr: 0.000067 - momentum: 0.000000
2023-10-12 00:47:19,636 ----------------------------------------------------------------------------------------------------
2023-10-12 00:47:19,636 EPOCH 6 done: loss 0.0287 - lr: 0.000067
2023-10-12 00:47:42,108 DEV : loss 0.18033917248249054 - f1-score (micro avg) 0.7928
2023-10-12 00:47:42,140 ----------------------------------------------------------------------------------------------------
2023-10-12 00:48:37,819 epoch 7 - iter 178/1786 - loss 0.02839486 - time (sec): 55.68 - samples/sec: 432.28 - lr: 0.000065 - momentum: 0.000000
2023-10-12 00:49:33,964 epoch 7 - iter 356/1786 - loss 0.02271661 - time (sec): 111.82 - samples/sec: 445.36 - lr: 0.000063 - momentum: 0.000000
2023-10-12 00:50:28,746 epoch 7 - iter 534/1786 - loss 0.02274955 - time (sec): 166.60 - samples/sec: 441.96 - lr: 0.000062 - momentum: 0.000000
2023-10-12 00:51:23,278 epoch 7 - iter 712/1786 - loss 0.02044904 - time (sec): 221.14 - samples/sec: 449.84 - lr: 0.000060 - momentum: 0.000000
2023-10-12 00:52:15,708 epoch 7 - iter 890/1786 - loss 0.02037246 - time (sec): 273.57 - samples/sec: 455.10 - lr: 0.000058 - momentum: 0.000000
2023-10-12 00:53:08,412 epoch 7 - iter 1068/1786 - loss 0.01965538 - time (sec): 326.27 - samples/sec: 457.62 - lr: 0.000057 - momentum: 0.000000
2023-10-12 00:54:02,855 epoch 7 - iter 1246/1786 - loss 0.01962515 - time (sec): 380.71 - samples/sec: 456.00 - lr: 0.000055 - momentum: 0.000000
2023-10-12 00:54:57,358 epoch 7 - iter 1424/1786 - loss 0.02031428 - time (sec): 435.22 - samples/sec: 454.87 - lr: 0.000053 - momentum: 0.000000
2023-10-12 00:55:52,160 epoch 7 - iter 1602/1786 - loss 0.02032900 - time (sec): 490.02 - samples/sec: 455.60 - lr: 0.000052 - momentum: 0.000000
2023-10-12 00:56:45,131 epoch 7 - iter 1780/1786 - loss 0.02028515 - time (sec): 542.99 - samples/sec: 456.49 - lr: 0.000050 - momentum: 0.000000
2023-10-12 00:56:46,800 ----------------------------------------------------------------------------------------------------
2023-10-12 00:56:46,801 EPOCH 7 done: loss 0.0202 - lr: 0.000050
2023-10-12 00:57:08,705 DEV : loss 0.19071684777736664 - f1-score (micro avg) 0.7876
2023-10-12 00:57:08,738 ----------------------------------------------------------------------------------------------------
2023-10-12 00:58:03,715 epoch 8 - iter 178/1786 - loss 0.01789598 - time (sec): 54.97 - samples/sec: 455.59 - lr: 0.000048 - momentum: 0.000000
2023-10-12 00:58:59,441 epoch 8 - iter 356/1786 - loss 0.01800163 - time (sec): 110.70 - samples/sec: 454.42 - lr: 0.000047 - momentum: 0.000000
2023-10-12 00:59:55,590 epoch 8 - iter 534/1786 - loss 0.01707568 - time (sec): 166.85 - samples/sec: 450.43 - lr: 0.000045 - momentum: 0.000000
2023-10-12 01:00:50,158 epoch 8 - iter 712/1786 - loss 0.01690096 - time (sec): 221.42 - samples/sec: 444.03 - lr: 0.000043 - momentum: 0.000000
2023-10-12 01:01:42,049 epoch 8 - iter 890/1786 - loss 0.01620502 - time (sec): 273.31 - samples/sec: 445.86 - lr: 0.000042 - momentum: 0.000000
2023-10-12 01:02:35,862 epoch 8 - iter 1068/1786 - loss 0.01592801 - time (sec): 327.12 - samples/sec: 452.71 - lr: 0.000040 - momentum: 0.000000
2023-10-12 01:03:29,081 epoch 8 - iter 1246/1786 - loss 0.01640440 - time (sec): 380.34 - samples/sec: 449.84 - lr: 0.000038 - momentum: 0.000000
2023-10-12 01:04:24,552 epoch 8 - iter 1424/1786 - loss 0.01591161 - time (sec): 435.81 - samples/sec: 452.42 - lr: 0.000037 - momentum: 0.000000
2023-10-12 01:05:20,992 epoch 8 - iter 1602/1786 - loss 0.01561111 - time (sec): 492.25 - samples/sec: 453.96 - lr: 0.000035 - momentum: 0.000000
2023-10-12 01:06:14,936 epoch 8 - iter 1780/1786 - loss 0.01528229 - time (sec): 546.20 - samples/sec: 453.87 - lr: 0.000033 - momentum: 0.000000
2023-10-12 01:06:16,726 ----------------------------------------------------------------------------------------------------
2023-10-12 01:06:16,726 EPOCH 8 done: loss 0.0152 - lr: 0.000033
2023-10-12 01:06:38,521 DEV : loss 0.2063598781824112 - f1-score (micro avg) 0.7803
2023-10-12 01:06:38,552 ----------------------------------------------------------------------------------------------------
2023-10-12 01:07:33,634 epoch 9 - iter 178/1786 - loss 0.01097164 - time (sec): 55.08 - samples/sec: 470.62 - lr: 0.000032 - momentum: 0.000000
2023-10-12 01:08:26,848 epoch 9 - iter 356/1786 - loss 0.00871512 - time (sec): 108.29 - samples/sec: 461.16 - lr: 0.000030 - momentum: 0.000000
2023-10-12 01:09:20,346 epoch 9 - iter 534/1786 - loss 0.01138957 - time (sec): 161.79 - samples/sec: 459.33 - lr: 0.000028 - momentum: 0.000000
2023-10-12 01:10:14,450 epoch 9 - iter 712/1786 - loss 0.01055240 - time (sec): 215.90 - samples/sec: 455.40 - lr: 0.000027 - momentum: 0.000000
2023-10-12 01:11:10,367 epoch 9 - iter 890/1786 - loss 0.00994944 - time (sec): 271.81 - samples/sec: 451.87 - lr: 0.000025 - momentum: 0.000000
2023-10-12 01:12:06,227 epoch 9 - iter 1068/1786 - loss 0.00965138 - time (sec): 327.67 - samples/sec: 455.01 - lr: 0.000023 - momentum: 0.000000
2023-10-12 01:13:00,155 epoch 9 - iter 1246/1786 - loss 0.01059332 - time (sec): 381.60 - samples/sec: 459.44 - lr: 0.000022 - momentum: 0.000000
2023-10-12 01:13:53,715 epoch 9 - iter 1424/1786 - loss 0.01083122 - time (sec): 435.16 - samples/sec: 460.45 - lr: 0.000020 - momentum: 0.000000
2023-10-12 01:14:47,729 epoch 9 - iter 1602/1786 - loss 0.01115893 - time (sec): 489.17 - samples/sec: 458.87 - lr: 0.000018 - momentum: 0.000000
2023-10-12 01:15:40,190 epoch 9 - iter 1780/1786 - loss 0.01106046 - time (sec): 541.64 - samples/sec: 458.00 - lr: 0.000017 - momentum: 0.000000
2023-10-12 01:15:41,822 ----------------------------------------------------------------------------------------------------
2023-10-12 01:15:41,822 EPOCH 9 done: loss 0.0110 - lr: 0.000017
2023-10-12 01:16:03,451 DEV : loss 0.21163716912269592 - f1-score (micro avg) 0.791
2023-10-12 01:16:03,484 ----------------------------------------------------------------------------------------------------
2023-10-12 01:16:56,422 epoch 10 - iter 178/1786 - loss 0.00564481 - time (sec): 52.94 - samples/sec: 476.78 - lr: 0.000015 - momentum: 0.000000
2023-10-12 01:17:50,049 epoch 10 - iter 356/1786 - loss 0.00861168 - time (sec): 106.56 - samples/sec: 473.86 - lr: 0.000013 - momentum: 0.000000
2023-10-12 01:18:41,937 epoch 10 - iter 534/1786 - loss 0.00834080 - time (sec): 158.45 - samples/sec: 478.79 - lr: 0.000012 - momentum: 0.000000
2023-10-12 01:19:35,823 epoch 10 - iter 712/1786 - loss 0.00920845 - time (sec): 212.34 - samples/sec: 476.27 - lr: 0.000010 - momentum: 0.000000
2023-10-12 01:20:28,835 epoch 10 - iter 890/1786 - loss 0.00915627 - time (sec): 265.35 - samples/sec: 476.63 - lr: 0.000008 - momentum: 0.000000
2023-10-12 01:21:19,749 epoch 10 - iter 1068/1786 - loss 0.00855923 - time (sec): 316.26 - samples/sec: 475.74 - lr: 0.000007 - momentum: 0.000000
2023-10-12 01:22:11,915 epoch 10 - iter 1246/1786 - loss 0.00883037 - time (sec): 368.43 - samples/sec: 478.28 - lr: 0.000005 - momentum: 0.000000
2023-10-12 01:23:02,937 epoch 10 - iter 1424/1786 - loss 0.00832188 - time (sec): 419.45 - samples/sec: 477.38 - lr: 0.000003 - momentum: 0.000000
2023-10-12 01:23:54,187 epoch 10 - iter 1602/1786 - loss 0.00826011 - time (sec): 470.70 - samples/sec: 476.61 - lr: 0.000002 - momentum: 0.000000
2023-10-12 01:24:45,370 epoch 10 - iter 1780/1786 - loss 0.00851589 - time (sec): 521.88 - samples/sec: 475.55 - lr: 0.000000 - momentum: 0.000000
2023-10-12 01:24:46,804 ----------------------------------------------------------------------------------------------------
2023-10-12 01:24:46,804 EPOCH 10 done: loss 0.0085 - lr: 0.000000
2023-10-12 01:25:08,325 DEV : loss 0.20861276984214783 - f1-score (micro avg) 0.7838
2023-10-12 01:25:09,217 ----------------------------------------------------------------------------------------------------
2023-10-12 01:25:09,219 Loading model from best epoch ...
2023-10-12 01:25:13,155 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-12 01:26:23,943
Results:
- F-score (micro) 0.6972
- F-score (macro) 0.5865
- Accuracy 0.5472
By class:
              precision    recall  f1-score   support

         LOC     0.7162    0.6959    0.7059      1095
         PER     0.7693    0.7678    0.7685      1012
         ORG     0.5026    0.5490    0.5248       357
   HumanProd     0.2615    0.5152    0.3469        33

   micro avg     0.6928    0.7016    0.6972      2497
   macro avg     0.5624    0.6320    0.5865      2497
weighted avg     0.7012    0.7016    0.7006      2497
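A quick arithmetic check on the report above: the micro-averaged F1 in the "Results" block (0.6972) is simply the harmonic mean of the micro precision and recall printed in the table.

# Recompute the micro F1 from the micro precision/recall printed in the table.
precision, recall = 0.6928, 0.7016
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.6972, matching "F-score (micro)" above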
2023-10-12 01:26:23,943 ----------------------------------------------------------------------------------------------------
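For completeness, a minimal inference sketch for the checkpoint selected above: best-model.pt is saved under the training base path logged earlier and loads with Flair's standard API. The local path and the example sentence are illustrative assumptions.

# Minimal inference sketch; the path comes from the "Model training base path"
# line above, and the example sentence is illustrative only.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-"
    "bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Le journal Le Temps est publié à Paris .")
tagger.predict(sentence)

# Print predicted PER/LOC/ORG/HumanProd spans (BIOES tags are decoded to spans).
for span in sentence.get_spans("ner"):
    print(span)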