2023-10-13 13:26:01,621 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,624 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 13:26:01,624 ----------------------------------------------------------------------------------------------------
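[Annotation] The printout above shows a Flair SequenceTagger with a ByT5 (hmByT5-small, d_model=1472) encoder, first-subtoken pooling over the last layer, a linear head onto 13 BIOES tags, and no CRF. The following is a minimal sketch of how a comparable tagger could be assembled in Flair; it uses TransformerWordEmbeddings as a stand-in for the custom ByT5Embeddings wrapper shown in the log, and the model identifier is inferred from the base path further down, so treat the exact arguments as assumptions rather than the original training script.

```python
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# The 13 tags the final model predicts (BIOES scheme over LOC / BUILDING / STREET),
# taken verbatim from the "SequenceTagger predicts" line near the end of this log.
tag_dictionary = Dictionary(add_unk=False)
for tag in [
    "O",
    "S-LOC", "B-LOC", "E-LOC", "I-LOC",
    "S-BUILDING", "B-BUILDING", "E-BUILDING", "I-BUILDING",
    "S-STREET", "B-STREET", "E-STREET", "I-STREET",
]:
    tag_dictionary.add_item(tag)

# ByT5 encoder; TransformerWordEmbeddings stands in for the ByT5Embeddings class in the
# printout, and the hub id is inferred from the run name (assumption).
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",               # last layer only ("layers-1" in the run name)
    subtoken_pooling="first",  # "poolingfirst" in the run name
    fine_tune=True,
)

# Linear head (1472 -> 13) with locked dropout and no CRF, matching the printout above.
tagger = SequenceTagger(
    hidden_size=256,           # only used when use_rnn=True; kept as the Flair default
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)
```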
2023-10-13 13:26:01,625 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 13:26:01,625 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,625 Train: 6183 sentences
2023-10-13 13:26:01,625 (train_with_dev=False, train_with_test=False)
2023-10-13 13:26:01,625 ----------------------------------------------------------------------------------------------------
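[Annotation] The corpus line above points at Flair's HIPE-2022 loader (topres19th, English, with document separators). A hedged sketch of loading that split follows; the constructor argument names are assumptions and may differ between Flair versions.

```python
from flair.datasets import NER_HIPE_2022

# topres19th (English) split of HIPE-2022; argument names are assumptions, and the
# document-separator option mirrors the "with_doc_seperator" path in the log.
corpus = NER_HIPE_2022(
    dataset_name="topres19th",
    language="en",
    add_document_separator=True,
)
print(corpus)  # expected: 6183 train + 680 dev + 2113 test sentences

# Label dictionary for the "ner" layer; equivalent to the inline tag dictionary
# built by hand in the tagger sketch above.
label_dictionary = corpus.make_label_dictionary(label_type="ner")
```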
2023-10-13 13:26:01,625 Training Params:
2023-10-13 13:26:01,625 - learning_rate: "0.00016"
2023-10-13 13:26:01,625 - mini_batch_size: "8"
2023-10-13 13:26:01,625 - max_epochs: "10"
2023-10-13 13:26:01,625 - shuffle: "True"
2023-10-13 13:26:01,625 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Plugins:
2023-10-13 13:26:01,626 - TensorboardLogger
2023-10-13 13:26:01,626 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:26:01,626 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:26:01,626 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:26:01,626 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Computation:
2023-10-13 13:26:01,626 - compute on device: cuda:0
2023-10-13 13:26:01,626 - embedding storage: none
2023-10-13 13:26:01,626 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2"
2023-10-13 13:26:01,627 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,627 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,627 Logging anything other than scalars to TensorBoard is currently not supported.
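[Annotation] The parameter block above (learning rate 0.00016, mini-batch size 8, 10 epochs, shuffling, linear warmup over 10% of steps, best model selected by dev micro F1) corresponds to Flair's fine-tuning entry point. A hedged sketch follows, assuming the `tagger` and `corpus` objects from the earlier sketches and that `fine_tune` accepts a `warmup_fraction` argument in this Flair version.

```python
from flair.trainers import ModelTrainer

# `tagger` and `corpus` as assembled in the earlier sketches (assumption).
trainer = ModelTrainer(tagger, corpus)

# Hyperparameters copied from the "Training Params" block above; warmup_fraction mirrors
# the LinearScheduler plugin line and is assumed to be a fine_tune keyword argument.
trainer.fine_tune(
    "hmbench-topres19th/en-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,
)
# The log also lists a TensorboardLogger plugin; attaching it is omitted in this sketch.
```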
2023-10-13 13:26:45,818 epoch 1 - iter 77/773 - loss 2.57062010 - time (sec): 44.19 - samples/sec: 279.83 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:27:29,191 epoch 1 - iter 154/773 - loss 2.53541742 - time (sec): 87.56 - samples/sec: 273.18 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:28:12,935 epoch 1 - iter 231/773 - loss 2.36209147 - time (sec): 131.31 - samples/sec: 279.40 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:28:54,426 epoch 1 - iter 308/773 - loss 2.11589559 - time (sec): 172.80 - samples/sec: 291.48 - lr: 0.000064 - momentum: 0.000000
2023-10-13 13:29:35,243 epoch 1 - iter 385/773 - loss 1.87498350 - time (sec): 213.61 - samples/sec: 294.41 - lr: 0.000079 - momentum: 0.000000
2023-10-13 13:30:15,300 epoch 1 - iter 462/773 - loss 1.66074017 - time (sec): 253.67 - samples/sec: 293.66 - lr: 0.000095 - momentum: 0.000000
2023-10-13 13:30:55,241 epoch 1 - iter 539/773 - loss 1.47343454 - time (sec): 293.61 - samples/sec: 293.37 - lr: 0.000111 - momentum: 0.000000
2023-10-13 13:31:36,180 epoch 1 - iter 616/773 - loss 1.31856275 - time (sec): 334.55 - samples/sec: 293.10 - lr: 0.000127 - momentum: 0.000000
2023-10-13 13:32:18,222 epoch 1 - iter 693/773 - loss 1.18884911 - time (sec): 376.59 - samples/sec: 294.74 - lr: 0.000143 - momentum: 0.000000
2023-10-13 13:32:58,624 epoch 1 - iter 770/773 - loss 1.08342681 - time (sec): 416.99 - samples/sec: 297.00 - lr: 0.000159 - momentum: 0.000000
2023-10-13 13:33:00,095 ----------------------------------------------------------------------------------------------------
2023-10-13 13:33:00,095 EPOCH 1 done: loss 1.0799 - lr: 0.000159
2023-10-13 13:33:17,290 DEV : loss 0.09480314701795578 - f1-score (micro avg) 0.5298
2023-10-13 13:33:17,321 saving best model
2023-10-13 13:33:18,262 ----------------------------------------------------------------------------------------------------
2023-10-13 13:34:00,643 epoch 2 - iter 77/773 - loss 0.13616110 - time (sec): 42.38 - samples/sec: 285.85 - lr: 0.000158 - momentum: 0.000000
2023-10-13 13:34:40,741 epoch 2 - iter 154/773 - loss 0.13603970 - time (sec): 82.48 - samples/sec: 302.30 - lr: 0.000156 - momentum: 0.000000
2023-10-13 13:35:20,081 epoch 2 - iter 231/773 - loss 0.12633571 - time (sec): 121.82 - samples/sec: 305.57 - lr: 0.000155 - momentum: 0.000000
2023-10-13 13:36:00,308 epoch 2 - iter 308/773 - loss 0.11757551 - time (sec): 162.04 - samples/sec: 308.22 - lr: 0.000153 - momentum: 0.000000
2023-10-13 13:36:40,604 epoch 2 - iter 385/773 - loss 0.11330549 - time (sec): 202.34 - samples/sec: 304.24 - lr: 0.000151 - momentum: 0.000000
2023-10-13 13:37:22,329 epoch 2 - iter 462/773 - loss 0.10747945 - time (sec): 244.07 - samples/sec: 305.52 - lr: 0.000149 - momentum: 0.000000
2023-10-13 13:38:02,731 epoch 2 - iter 539/773 - loss 0.10730665 - time (sec): 284.47 - samples/sec: 307.32 - lr: 0.000148 - momentum: 0.000000
2023-10-13 13:38:43,718 epoch 2 - iter 616/773 - loss 0.10535188 - time (sec): 325.45 - samples/sec: 305.88 - lr: 0.000146 - momentum: 0.000000
2023-10-13 13:39:28,058 epoch 2 - iter 693/773 - loss 0.10308060 - time (sec): 369.79 - samples/sec: 302.56 - lr: 0.000144 - momentum: 0.000000
2023-10-13 13:40:10,324 epoch 2 - iter 770/773 - loss 0.10039870 - time (sec): 412.06 - samples/sec: 300.66 - lr: 0.000142 - momentum: 0.000000
2023-10-13 13:40:11,973 ----------------------------------------------------------------------------------------------------
2023-10-13 13:40:11,973 EPOCH 2 done: loss 0.1002 - lr: 0.000142
2023-10-13 13:40:31,298 DEV : loss 0.057838067412376404 - f1-score (micro avg) 0.7813
2023-10-13 13:40:31,328 saving best model
2023-10-13 13:40:34,113 ----------------------------------------------------------------------------------------------------
2023-10-13 13:41:16,472 epoch 3 - iter 77/773 - loss 0.06633658 - time (sec): 42.36 - samples/sec: 301.00 - lr: 0.000140 - momentum: 0.000000
2023-10-13 13:41:58,905 epoch 3 - iter 154/773 - loss 0.06277207 - time (sec): 84.79 - samples/sec: 297.67 - lr: 0.000139 - momentum: 0.000000
2023-10-13 13:42:40,672 epoch 3 - iter 231/773 - loss 0.06457756 - time (sec): 126.55 - samples/sec: 291.46 - lr: 0.000137 - momentum: 0.000000
2023-10-13 13:43:23,276 epoch 3 - iter 308/773 - loss 0.06768297 - time (sec): 169.16 - samples/sec: 295.98 - lr: 0.000135 - momentum: 0.000000
2023-10-13 13:44:02,782 epoch 3 - iter 385/773 - loss 0.06578215 - time (sec): 208.66 - samples/sec: 297.09 - lr: 0.000133 - momentum: 0.000000
2023-10-13 13:44:44,904 epoch 3 - iter 462/773 - loss 0.06565666 - time (sec): 250.79 - samples/sec: 296.04 - lr: 0.000132 - momentum: 0.000000
2023-10-13 13:45:27,591 epoch 3 - iter 539/773 - loss 0.06483917 - time (sec): 293.47 - samples/sec: 296.59 - lr: 0.000130 - momentum: 0.000000
2023-10-13 13:46:07,958 epoch 3 - iter 616/773 - loss 0.06239288 - time (sec): 333.84 - samples/sec: 299.57 - lr: 0.000128 - momentum: 0.000000
2023-10-13 13:46:48,009 epoch 3 - iter 693/773 - loss 0.06147020 - time (sec): 373.89 - samples/sec: 300.51 - lr: 0.000126 - momentum: 0.000000
2023-10-13 13:47:27,430 epoch 3 - iter 770/773 - loss 0.06074472 - time (sec): 413.31 - samples/sec: 299.26 - lr: 0.000125 - momentum: 0.000000
2023-10-13 13:47:29,015 ----------------------------------------------------------------------------------------------------
2023-10-13 13:47:29,015 EPOCH 3 done: loss 0.0608 - lr: 0.000125
2023-10-13 13:47:46,060 DEV : loss 0.04795033112168312 - f1-score (micro avg) 0.786
2023-10-13 13:47:46,091 saving best model
2023-10-13 13:47:48,760 ----------------------------------------------------------------------------------------------------
2023-10-13 13:48:31,247 epoch 4 - iter 77/773 - loss 0.04570769 - time (sec): 42.48 - samples/sec: 266.33 - lr: 0.000123 - momentum: 0.000000
2023-10-13 13:49:11,886 epoch 4 - iter 154/773 - loss 0.04142650 - time (sec): 83.12 - samples/sec: 293.56 - lr: 0.000121 - momentum: 0.000000
2023-10-13 13:49:50,782 epoch 4 - iter 231/773 - loss 0.04222222 - time (sec): 122.02 - samples/sec: 295.97 - lr: 0.000119 - momentum: 0.000000
2023-10-13 13:50:31,804 epoch 4 - iter 308/773 - loss 0.04243309 - time (sec): 163.04 - samples/sec: 306.86 - lr: 0.000117 - momentum: 0.000000
2023-10-13 13:51:11,088 epoch 4 - iter 385/773 - loss 0.03983764 - time (sec): 202.32 - samples/sec: 306.16 - lr: 0.000116 - momentum: 0.000000
2023-10-13 13:51:51,434 epoch 4 - iter 462/773 - loss 0.03841069 - time (sec): 242.67 - samples/sec: 307.84 - lr: 0.000114 - momentum: 0.000000
2023-10-13 13:52:31,480 epoch 4 - iter 539/773 - loss 0.03825856 - time (sec): 282.72 - samples/sec: 308.09 - lr: 0.000112 - momentum: 0.000000
2023-10-13 13:53:11,592 epoch 4 - iter 616/773 - loss 0.03669827 - time (sec): 322.83 - samples/sec: 309.31 - lr: 0.000110 - momentum: 0.000000
2023-10-13 13:53:51,388 epoch 4 - iter 693/773 - loss 0.03804052 - time (sec): 362.62 - samples/sec: 308.06 - lr: 0.000109 - momentum: 0.000000
2023-10-13 13:54:31,531 epoch 4 - iter 770/773 - loss 0.03792700 - time (sec): 402.77 - samples/sec: 307.42 - lr: 0.000107 - momentum: 0.000000
2023-10-13 13:54:33,017 ----------------------------------------------------------------------------------------------------
2023-10-13 13:54:33,018 EPOCH 4 done: loss 0.0378 - lr: 0.000107
2023-10-13 13:54:49,877 DEV : loss 0.06375983357429504 - f1-score (micro avg) 0.8
2023-10-13 13:54:49,910 saving best model
2023-10-13 13:54:50,939 ----------------------------------------------------------------------------------------------------
2023-10-13 13:55:30,979 epoch 5 - iter 77/773 - loss 0.02552778 - time (sec): 40.04 - samples/sec: 305.94 - lr: 0.000105 - momentum: 0.000000
2023-10-13 13:56:14,288 epoch 5 - iter 154/773 - loss 0.02578947 - time (sec): 83.35 - samples/sec: 292.86 - lr: 0.000103 - momentum: 0.000000
2023-10-13 13:56:55,531 epoch 5 - iter 231/773 - loss 0.02476470 - time (sec): 124.59 - samples/sec: 303.09 - lr: 0.000101 - momentum: 0.000000
2023-10-13 13:57:36,632 epoch 5 - iter 308/773 - loss 0.02496423 - time (sec): 165.69 - samples/sec: 305.85 - lr: 0.000100 - momentum: 0.000000
2023-10-13 13:58:16,866 epoch 5 - iter 385/773 - loss 0.02406075 - time (sec): 205.92 - samples/sec: 302.56 - lr: 0.000098 - momentum: 0.000000
2023-10-13 13:58:57,095 epoch 5 - iter 462/773 - loss 0.02389150 - time (sec): 246.15 - samples/sec: 303.26 - lr: 0.000096 - momentum: 0.000000
2023-10-13 13:59:36,789 epoch 5 - iter 539/773 - loss 0.02394706 - time (sec): 285.85 - samples/sec: 302.28 - lr: 0.000094 - momentum: 0.000000
2023-10-13 14:00:17,171 epoch 5 - iter 616/773 - loss 0.02525134 - time (sec): 326.23 - samples/sec: 305.03 - lr: 0.000093 - momentum: 0.000000
2023-10-13 14:00:56,105 epoch 5 - iter 693/773 - loss 0.02473743 - time (sec): 365.16 - samples/sec: 305.88 - lr: 0.000091 - momentum: 0.000000
2023-10-13 14:01:35,740 epoch 5 - iter 770/773 - loss 0.02498417 - time (sec): 404.80 - samples/sec: 305.61 - lr: 0.000089 - momentum: 0.000000
2023-10-13 14:01:37,312 ----------------------------------------------------------------------------------------------------
2023-10-13 14:01:37,312 EPOCH 5 done: loss 0.0251 - lr: 0.000089
2023-10-13 14:01:54,361 DEV : loss 0.06829117983579636 - f1-score (micro avg) 0.818
2023-10-13 14:01:54,402 saving best model
2023-10-13 14:01:57,066 ----------------------------------------------------------------------------------------------------
2023-10-13 14:02:37,096 epoch 6 - iter 77/773 - loss 0.01718220 - time (sec): 40.02 - samples/sec: 279.69 - lr: 0.000087 - momentum: 0.000000
2023-10-13 14:03:16,814 epoch 6 - iter 154/773 - loss 0.01550972 - time (sec): 79.74 - samples/sec: 303.25 - lr: 0.000085 - momentum: 0.000000
2023-10-13 14:03:56,340 epoch 6 - iter 231/773 - loss 0.01618194 - time (sec): 119.27 - samples/sec: 304.29 - lr: 0.000084 - momentum: 0.000000
2023-10-13 14:04:35,237 epoch 6 - iter 308/773 - loss 0.01557615 - time (sec): 158.17 - samples/sec: 306.65 - lr: 0.000082 - momentum: 0.000000
2023-10-13 14:05:14,712 epoch 6 - iter 385/773 - loss 0.01677557 - time (sec): 197.64 - samples/sec: 306.89 - lr: 0.000080 - momentum: 0.000000
2023-10-13 14:05:55,045 epoch 6 - iter 462/773 - loss 0.01629595 - time (sec): 237.97 - samples/sec: 309.87 - lr: 0.000078 - momentum: 0.000000
2023-10-13 14:06:34,474 epoch 6 - iter 539/773 - loss 0.01552389 - time (sec): 277.40 - samples/sec: 310.32 - lr: 0.000077 - momentum: 0.000000
2023-10-13 14:07:15,189 epoch 6 - iter 616/773 - loss 0.01710578 - time (sec): 318.12 - samples/sec: 311.28 - lr: 0.000075 - momentum: 0.000000
2023-10-13 14:07:56,193 epoch 6 - iter 693/773 - loss 0.01669410 - time (sec): 359.12 - samples/sec: 310.20 - lr: 0.000073 - momentum: 0.000000
2023-10-13 14:08:37,693 epoch 6 - iter 770/773 - loss 0.01618705 - time (sec): 400.62 - samples/sec: 308.81 - lr: 0.000071 - momentum: 0.000000
2023-10-13 14:08:39,327 ----------------------------------------------------------------------------------------------------
2023-10-13 14:08:39,327 EPOCH 6 done: loss 0.0162 - lr: 0.000071
2023-10-13 14:08:57,695 DEV : loss 0.08230733126401901 - f1-score (micro avg) 0.8056
2023-10-13 14:08:57,725 ----------------------------------------------------------------------------------------------------
2023-10-13 14:09:38,784 epoch 7 - iter 77/773 - loss 0.01104098 - time (sec): 41.06 - samples/sec: 301.10 - lr: 0.000069 - momentum: 0.000000
2023-10-13 14:10:21,690 epoch 7 - iter 154/773 - loss 0.01084070 - time (sec): 83.96 - samples/sec: 290.25 - lr: 0.000068 - momentum: 0.000000
2023-10-13 14:11:03,996 epoch 7 - iter 231/773 - loss 0.00993163 - time (sec): 126.27 - samples/sec: 289.09 - lr: 0.000066 - momentum: 0.000000
2023-10-13 14:11:45,648 epoch 7 - iter 308/773 - loss 0.00931498 - time (sec): 167.92 - samples/sec: 297.79 - lr: 0.000064 - momentum: 0.000000
2023-10-13 14:12:25,998 epoch 7 - iter 385/773 - loss 0.00990930 - time (sec): 208.27 - samples/sec: 300.74 - lr: 0.000062 - momentum: 0.000000
2023-10-13 14:13:08,002 epoch 7 - iter 462/773 - loss 0.01091458 - time (sec): 250.27 - samples/sec: 297.45 - lr: 0.000061 - momentum: 0.000000
2023-10-13 14:13:50,331 epoch 7 - iter 539/773 - loss 0.01055709 - time (sec): 292.60 - samples/sec: 295.05 - lr: 0.000059 - momentum: 0.000000
2023-10-13 14:14:31,155 epoch 7 - iter 616/773 - loss 0.01103797 - time (sec): 333.43 - samples/sec: 296.92 - lr: 0.000057 - momentum: 0.000000
2023-10-13 14:15:12,601 epoch 7 - iter 693/773 - loss 0.01095294 - time (sec): 374.87 - samples/sec: 297.76 - lr: 0.000055 - momentum: 0.000000
2023-10-13 14:15:57,007 epoch 7 - iter 770/773 - loss 0.01084300 - time (sec): 419.28 - samples/sec: 295.63 - lr: 0.000054 - momentum: 0.000000
2023-10-13 14:15:58,471 ----------------------------------------------------------------------------------------------------
2023-10-13 14:15:58,471 EPOCH 7 done: loss 0.0112 - lr: 0.000054
2023-10-13 14:16:15,743 DEV : loss 0.08712616562843323 - f1-score (micro avg) 0.8209
2023-10-13 14:16:15,772 saving best model
2023-10-13 14:16:18,511 ----------------------------------------------------------------------------------------------------
2023-10-13 14:16:58,837 epoch 8 - iter 77/773 - loss 0.00793344 - time (sec): 40.32 - samples/sec: 299.32 - lr: 0.000052 - momentum: 0.000000
2023-10-13 14:17:41,619 epoch 8 - iter 154/773 - loss 0.00868923 - time (sec): 83.10 - samples/sec: 302.77 - lr: 0.000050 - momentum: 0.000000
2023-10-13 14:18:22,789 epoch 8 - iter 231/773 - loss 0.00820784 - time (sec): 124.27 - samples/sec: 299.14 - lr: 0.000048 - momentum: 0.000000
2023-10-13 14:19:05,552 epoch 8 - iter 308/773 - loss 0.00790837 - time (sec): 167.04 - samples/sec: 291.71 - lr: 0.000046 - momentum: 0.000000
2023-10-13 14:19:44,332 epoch 8 - iter 385/773 - loss 0.00769877 - time (sec): 205.82 - samples/sec: 291.56 - lr: 0.000045 - momentum: 0.000000
2023-10-13 14:20:24,710 epoch 8 - iter 462/773 - loss 0.00752209 - time (sec): 246.19 - samples/sec: 297.01 - lr: 0.000043 - momentum: 0.000000
2023-10-13 14:21:09,307 epoch 8 - iter 539/773 - loss 0.00819953 - time (sec): 290.79 - samples/sec: 297.36 - lr: 0.000041 - momentum: 0.000000
2023-10-13 14:21:51,152 epoch 8 - iter 616/773 - loss 0.00792088 - time (sec): 332.64 - samples/sec: 298.68 - lr: 0.000039 - momentum: 0.000000
2023-10-13 14:22:31,008 epoch 8 - iter 693/773 - loss 0.00757495 - time (sec): 372.49 - samples/sec: 299.83 - lr: 0.000038 - momentum: 0.000000
2023-10-13 14:23:10,962 epoch 8 - iter 770/773 - loss 0.00734145 - time (sec): 412.45 - samples/sec: 299.81 - lr: 0.000036 - momentum: 0.000000
2023-10-13 14:23:12,536 ----------------------------------------------------------------------------------------------------
2023-10-13 14:23:12,537 EPOCH 8 done: loss 0.0073 - lr: 0.000036
2023-10-13 14:23:29,813 DEV : loss 0.0935787558555603 - f1-score (micro avg) 0.8129
2023-10-13 14:23:29,844 ----------------------------------------------------------------------------------------------------
2023-10-13 14:24:10,868 epoch 9 - iter 77/773 - loss 0.00369340 - time (sec): 41.02 - samples/sec: 304.99 - lr: 0.000034 - momentum: 0.000000
2023-10-13 14:24:52,522 epoch 9 - iter 154/773 - loss 0.00441737 - time (sec): 82.68 - samples/sec: 306.66 - lr: 0.000032 - momentum: 0.000000
2023-10-13 14:25:34,148 epoch 9 - iter 231/773 - loss 0.00418139 - time (sec): 124.30 - samples/sec: 306.22 - lr: 0.000030 - momentum: 0.000000
2023-10-13 14:26:18,115 epoch 9 - iter 308/773 - loss 0.00465322 - time (sec): 168.27 - samples/sec: 297.99 - lr: 0.000029 - momentum: 0.000000
2023-10-13 14:26:58,721 epoch 9 - iter 385/773 - loss 0.00496921 - time (sec): 208.87 - samples/sec: 301.32 - lr: 0.000027 - momentum: 0.000000
2023-10-13 14:27:41,960 epoch 9 - iter 462/773 - loss 0.00516472 - time (sec): 252.11 - samples/sec: 298.31 - lr: 0.000025 - momentum: 0.000000
2023-10-13 14:28:24,233 epoch 9 - iter 539/773 - loss 0.00499233 - time (sec): 294.39 - samples/sec: 294.84 - lr: 0.000023 - momentum: 0.000000
2023-10-13 14:29:05,825 epoch 9 - iter 616/773 - loss 0.00572259 - time (sec): 335.98 - samples/sec: 296.74 - lr: 0.000022 - momentum: 0.000000
2023-10-13 14:29:51,119 epoch 9 - iter 693/773 - loss 0.00573440 - time (sec): 381.27 - samples/sec: 293.32 - lr: 0.000020 - momentum: 0.000000
2023-10-13 14:30:33,427 epoch 9 - iter 770/773 - loss 0.00567544 - time (sec): 423.58 - samples/sec: 292.77 - lr: 0.000018 - momentum: 0.000000
2023-10-13 14:30:34,870 ----------------------------------------------------------------------------------------------------
2023-10-13 14:30:34,871 EPOCH 9 done: loss 0.0057 - lr: 0.000018
2023-10-13 14:30:52,483 DEV : loss 0.09646561741828918 - f1-score (micro avg) 0.816
2023-10-13 14:30:52,518 ----------------------------------------------------------------------------------------------------
2023-10-13 14:31:38,164 epoch 10 - iter 77/773 - loss 0.00271756 - time (sec): 45.64 - samples/sec: 269.06 - lr: 0.000016 - momentum: 0.000000
2023-10-13 14:32:20,990 epoch 10 - iter 154/773 - loss 0.00479762 - time (sec): 88.47 - samples/sec: 284.55 - lr: 0.000014 - momentum: 0.000000
2023-10-13 14:33:05,141 epoch 10 - iter 231/773 - loss 0.00480447 - time (sec): 132.62 - samples/sec: 282.47 - lr: 0.000013 - momentum: 0.000000
2023-10-13 14:33:48,679 epoch 10 - iter 308/773 - loss 0.00412616 - time (sec): 176.16 - samples/sec: 282.82 - lr: 0.000011 - momentum: 0.000000
2023-10-13 14:34:28,762 epoch 10 - iter 385/773 - loss 0.00410371 - time (sec): 216.24 - samples/sec: 287.37 - lr: 0.000009 - momentum: 0.000000
2023-10-13 14:35:08,671 epoch 10 - iter 462/773 - loss 0.00414590 - time (sec): 256.15 - samples/sec: 288.15 - lr: 0.000007 - momentum: 0.000000
2023-10-13 14:35:47,002 epoch 10 - iter 539/773 - loss 0.00433560 - time (sec): 294.48 - samples/sec: 292.35 - lr: 0.000006 - momentum: 0.000000
2023-10-13 14:36:24,903 epoch 10 - iter 616/773 - loss 0.00453624 - time (sec): 332.38 - samples/sec: 294.15 - lr: 0.000004 - momentum: 0.000000
2023-10-13 14:37:04,210 epoch 10 - iter 693/773 - loss 0.00469727 - time (sec): 371.69 - samples/sec: 299.05 - lr: 0.000002 - momentum: 0.000000
2023-10-13 14:37:44,090 epoch 10 - iter 770/773 - loss 0.00452566 - time (sec): 411.57 - samples/sec: 300.46 - lr: 0.000000 - momentum: 0.000000
2023-10-13 14:37:45,747 ----------------------------------------------------------------------------------------------------
2023-10-13 14:37:45,747 EPOCH 10 done: loss 0.0045 - lr: 0.000000
2023-10-13 14:38:02,933 DEV : loss 0.09914136677980423 - f1-score (micro avg) 0.8129
2023-10-13 14:38:03,928 ----------------------------------------------------------------------------------------------------
2023-10-13 14:38:03,930 Loading model from best epoch ...
2023-10-13 14:38:08,507 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-13 14:39:03,429
Results:
- F-score (micro) 0.8076
- F-score (macro) 0.725
- Accuracy 0.6957
By class:
              precision    recall  f1-score   support

         LOC     0.8396    0.8742    0.8566       946
    BUILDING     0.5746    0.5622    0.5683       185
      STREET     0.7031    0.8036    0.7500        56

   micro avg     0.7935    0.8222    0.8076      1187
   macro avg     0.7058    0.7466    0.7250      1187
weighted avg     0.7919    0.8222    0.8066      1187
2023-10-13 14:39:03,429 ----------------------------------------------------------------------------------------------------
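[Annotation] For completeness, a minimal usage sketch for the checkpoint selected above (best-model.pt under the base path from the log). The example sentence is invented for illustration only.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint selected by dev micro F1 (best-model.pt under the run's base path).
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2/"
    "best-model.pt"
)

# Tag a sentence and print the predicted LOC / BUILDING / STREET spans.
sentence = Sentence("The new station was built on Oxford Street in London .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```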