2023-10-13 13:26:01,621 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,624 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 13:26:01,624 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,625 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
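The corpus above can be obtained through Flair's built-in HIPE-2022 support. A minimal sketch, assuming the NER_HIPE_2022 loader with dataset_name/language arguments (keyword names may differ between Flair versions; the document-separator option suggested by the "with_doc_seperator" cache path is not shown):

```python
from flair.datasets import NER_HIPE_2022

# Hypothetical reconstruction of the corpus used in this log.
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dict = corpus.make_label_dictionary(label_type="ner")

print(corpus)      # expected: 6183 train + 680 dev + 2113 test sentences
print(label_dict)  # span labels LOC, BUILDING, STREET (expanded to BIOES tags by the tagger)
```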
2023-10-13 13:26:01,625 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,625 Train: 6183 sentences
2023-10-13 13:26:01,625 (train_with_dev=False, train_with_test=False)
2023-10-13 13:26:01,625 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,625 Training Params:
2023-10-13 13:26:01,625 - learning_rate: "0.00016"
2023-10-13 13:26:01,625 - mini_batch_size: "8"
2023-10-13 13:26:01,625 - max_epochs: "10"
2023-10-13 13:26:01,625 - shuffle: "True"
2023-10-13 13:26:01,625 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Plugins:
2023-10-13 13:26:01,626 - TensorboardLogger
2023-10-13 13:26:01,626 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:26:01,626 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:26:01,626 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:26:01,626 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Computation:
2023-10-13 13:26:01,626 - compute on device: cuda:0
2023-10-13 13:26:01,626 - embedding storage: none
2023-10-13 13:26:01,626 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,626 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2"
2023-10-13 13:26:01,627 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,627 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:01,627 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 13:26:45,818 epoch 1 - iter 77/773 - loss 2.57062010 - time (sec): 44.19 - samples/sec: 279.83 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:27:29,191 epoch 1 - iter 154/773 - loss 2.53541742 - time (sec): 87.56 - samples/sec: 273.18 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:28:12,935 epoch 1 - iter 231/773 - loss 2.36209147 - time (sec): 131.31 - samples/sec: 279.40 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:28:54,426 epoch 1 - iter 308/773 - loss 2.11589559 - time (sec): 172.80 - samples/sec: 291.48 - lr: 0.000064 - momentum: 0.000000
2023-10-13 13:29:35,243 epoch 1 - iter 385/773 - loss 1.87498350 - time (sec): 213.61 - samples/sec: 294.41 - lr: 0.000079 - momentum: 0.000000
2023-10-13 13:30:15,300 epoch 1 - iter 462/773 - loss 1.66074017 - time (sec): 253.67 - samples/sec: 293.66 - lr: 0.000095 - momentum: 0.000000
2023-10-13 13:30:55,241 epoch 1 - iter 539/773 - loss 1.47343454 - time (sec): 293.61 - samples/sec: 293.37 - lr: 0.000111 - momentum: 0.000000
2023-10-13 13:31:36,180 epoch 1 - iter 616/773 - loss 1.31856275 - time (sec): 334.55 - samples/sec: 293.10 - lr: 0.000127 - momentum: 0.000000
2023-10-13 13:32:18,222 epoch 1 - iter 693/773 - loss 1.18884911 - time (sec): 376.59 - samples/sec: 294.74 - lr: 0.000143 - momentum: 0.000000
2023-10-13 13:32:58,624 epoch 1 - iter 770/773 - loss 1.08342681 - time (sec): 416.99 - samples/sec: 297.00 - lr: 0.000159 - momentum: 0.000000
2023-10-13 13:33:00,095 ----------------------------------------------------------------------------------------------------
2023-10-13 13:33:00,095 EPOCH 1 done: loss 1.0799 - lr: 0.000159
2023-10-13 13:33:17,290 DEV : loss 0.09480314701795578 - f1-score (micro avg) 0.5298
2023-10-13 13:33:17,321 saving best model
2023-10-13 13:33:18,262 ----------------------------------------------------------------------------------------------------
2023-10-13 13:34:00,643 epoch 2 - iter 77/773 - loss 0.13616110 - time (sec): 42.38 - samples/sec: 285.85 - lr: 0.000158 - momentum: 0.000000
2023-10-13 13:34:40,741 epoch 2 - iter 154/773 - loss 0.13603970 - time (sec): 82.48 - samples/sec: 302.30 - lr: 0.000156 - momentum: 0.000000
2023-10-13 13:35:20,081 epoch 2 - iter 231/773 - loss 0.12633571 - time (sec): 121.82 - samples/sec: 305.57 - lr: 0.000155 - momentum: 0.000000
2023-10-13 13:36:00,308 epoch 2 - iter 308/773 - loss 0.11757551 - time (sec): 162.04 - samples/sec: 308.22 - lr: 0.000153 - momentum: 0.000000
2023-10-13 13:36:40,604 epoch 2 - iter 385/773 - loss 0.11330549 - time (sec): 202.34 - samples/sec: 304.24 - lr: 0.000151 - momentum: 0.000000
2023-10-13 13:37:22,329 epoch 2 - iter 462/773 - loss 0.10747945 - time (sec): 244.07 - samples/sec: 305.52 - lr: 0.000149 - momentum: 0.000000
2023-10-13 13:38:02,731 epoch 2 - iter 539/773 - loss 0.10730665 - time (sec): 284.47 - samples/sec: 307.32 - lr: 0.000148 - momentum: 0.000000
2023-10-13 13:38:43,718 epoch 2 - iter 616/773 - loss 0.10535188 - time (sec): 325.45 - samples/sec: 305.88 - lr: 0.000146 - momentum: 0.000000
2023-10-13 13:39:28,058 epoch 2 - iter 693/773 - loss 0.10308060 - time (sec): 369.79 - samples/sec: 302.56 - lr: 0.000144 - momentum: 0.000000
2023-10-13 13:40:10,324 epoch 2 - iter 770/773 - loss 0.10039870 - time (sec): 412.06 - samples/sec: 300.66 - lr: 0.000142 - momentum: 0.000000
2023-10-13 13:40:11,973 ----------------------------------------------------------------------------------------------------
2023-10-13 13:40:11,973 EPOCH 2 done: loss 0.1002 - lr: 0.000142
2023-10-13 13:40:31,298 DEV : loss 0.057838067412376404 - f1-score (micro avg) 0.7813
2023-10-13 13:40:31,328 saving best model
2023-10-13 13:40:34,113 ----------------------------------------------------------------------------------------------------
2023-10-13 13:41:16,472 epoch 3 - iter 77/773 - loss 0.06633658 - time (sec): 42.36 - samples/sec: 301.00 - lr: 0.000140 - momentum: 0.000000
2023-10-13 13:41:58,905 epoch 3 - iter 154/773 - loss 0.06277207 - time (sec): 84.79 - samples/sec: 297.67 - lr: 0.000139 - momentum: 0.000000
2023-10-13 13:42:40,672 epoch 3 - iter 231/773 - loss 0.06457756 - time (sec): 126.55 - samples/sec: 291.46 - lr: 0.000137 - momentum: 0.000000
2023-10-13 13:43:23,276 epoch 3 - iter 308/773 - loss 0.06768297 - time (sec): 169.16 - samples/sec: 295.98 - lr: 0.000135 - momentum: 0.000000
2023-10-13 13:44:02,782 epoch 3 - iter 385/773 - loss 0.06578215 - time (sec): 208.66 - samples/sec: 297.09 - lr: 0.000133 - momentum: 0.000000
2023-10-13 13:44:44,904 epoch 3 - iter 462/773 - loss 0.06565666 - time (sec): 250.79 - samples/sec: 296.04 - lr: 0.000132 - momentum: 0.000000
2023-10-13 13:45:27,591 epoch 3 - iter 539/773 - loss 0.06483917 - time (sec): 293.47 - samples/sec: 296.59 - lr: 0.000130 - momentum: 0.000000
2023-10-13 13:46:07,958 epoch 3 - iter 616/773 - loss 0.06239288 - time (sec): 333.84 - samples/sec: 299.57 - lr: 0.000128 - momentum: 0.000000
2023-10-13 13:46:48,009 epoch 3 - iter 693/773 - loss 0.06147020 - time (sec): 373.89 - samples/sec: 300.51 - lr: 0.000126 - momentum: 0.000000
2023-10-13 13:47:27,430 epoch 3 - iter 770/773 - loss 0.06074472 - time (sec): 413.31 - samples/sec: 299.26 - lr: 0.000125 - momentum: 0.000000
2023-10-13 13:47:29,015 ----------------------------------------------------------------------------------------------------
2023-10-13 13:47:29,015 EPOCH 3 done: loss 0.0608 - lr: 0.000125
2023-10-13 13:47:46,060 DEV : loss 0.04795033112168312 - f1-score (micro avg) 0.786
2023-10-13 13:47:46,091 saving best model
2023-10-13 13:47:48,760 ----------------------------------------------------------------------------------------------------
2023-10-13 13:48:31,247 epoch 4 - iter 77/773 - loss 0.04570769 - time (sec): 42.48 - samples/sec: 266.33 - lr: 0.000123 - momentum: 0.000000
2023-10-13 13:49:11,886 epoch 4 - iter 154/773 - loss 0.04142650 - time (sec): 83.12 - samples/sec: 293.56 - lr: 0.000121 - momentum: 0.000000
2023-10-13 13:49:50,782 epoch 4 - iter 231/773 - loss 0.04222222 - time (sec): 122.02 - samples/sec: 295.97 - lr: 0.000119 - momentum: 0.000000
2023-10-13 13:50:31,804 epoch 4 - iter 308/773 - loss 0.04243309 - time (sec): 163.04 - samples/sec: 306.86 - lr: 0.000117 - momentum: 0.000000
2023-10-13 13:51:11,088 epoch 4 - iter 385/773 - loss 0.03983764 - time (sec): 202.32 - samples/sec: 306.16 - lr: 0.000116 - momentum: 0.000000
2023-10-13 13:51:51,434 epoch 4 - iter 462/773 - loss 0.03841069 - time (sec): 242.67 - samples/sec: 307.84 - lr: 0.000114 - momentum: 0.000000
2023-10-13 13:52:31,480 epoch 4 - iter 539/773 - loss 0.03825856 - time (sec): 282.72 - samples/sec: 308.09 - lr: 0.000112 - momentum: 0.000000
2023-10-13 13:53:11,592 epoch 4 - iter 616/773 - loss 0.03669827 - time (sec): 322.83 - samples/sec: 309.31 - lr: 0.000110 - momentum: 0.000000
2023-10-13 13:53:51,388 epoch 4 - iter 693/773 - loss 0.03804052 - time (sec): 362.62 - samples/sec: 308.06 - lr: 0.000109 - momentum: 0.000000
2023-10-13 13:54:31,531 epoch 4 - iter 770/773 - loss 0.03792700 - time (sec): 402.77 - samples/sec: 307.42 - lr: 0.000107 - momentum: 0.000000
2023-10-13 13:54:33,017 ----------------------------------------------------------------------------------------------------
2023-10-13 13:54:33,018 EPOCH 4 done: loss 0.0378 - lr: 0.000107
2023-10-13 13:54:49,877 DEV : loss 0.06375983357429504 - f1-score (micro avg) 0.8
2023-10-13 13:54:49,910 saving best model
2023-10-13 13:54:50,939 ----------------------------------------------------------------------------------------------------
2023-10-13 13:55:30,979 epoch 5 - iter 77/773 - loss 0.02552778 - time (sec): 40.04 - samples/sec: 305.94 - lr: 0.000105 - momentum: 0.000000
2023-10-13 13:56:14,288 epoch 5 - iter 154/773 - loss 0.02578947 - time (sec): 83.35 - samples/sec: 292.86 - lr: 0.000103 - momentum: 0.000000
2023-10-13 13:56:55,531 epoch 5 - iter 231/773 - loss 0.02476470 - time (sec): 124.59 - samples/sec: 303.09 - lr: 0.000101 - momentum: 0.000000
2023-10-13 13:57:36,632 epoch 5 - iter 308/773 - loss 0.02496423 - time (sec): 165.69 - samples/sec: 305.85 - lr: 0.000100 - momentum: 0.000000
2023-10-13 13:58:16,866 epoch 5 - iter 385/773 - loss 0.02406075 - time (sec): 205.92 - samples/sec: 302.56 - lr: 0.000098 - momentum: 0.000000
2023-10-13 13:58:57,095 epoch 5 - iter 462/773 - loss 0.02389150 - time (sec): 246.15 - samples/sec: 303.26 - lr: 0.000096 - momentum: 0.000000
2023-10-13 13:59:36,789 epoch 5 - iter 539/773 - loss 0.02394706 - time (sec): 285.85 - samples/sec: 302.28 - lr: 0.000094 - momentum: 0.000000
2023-10-13 14:00:17,171 epoch 5 - iter 616/773 - loss 0.02525134 - time (sec): 326.23 - samples/sec: 305.03 - lr: 0.000093 - momentum: 0.000000
2023-10-13 14:00:56,105 epoch 5 - iter 693/773 - loss 0.02473743 - time (sec): 365.16 - samples/sec: 305.88 - lr: 0.000091 - momentum: 0.000000
2023-10-13 14:01:35,740 epoch 5 - iter 770/773 - loss 0.02498417 - time (sec): 404.80 - samples/sec: 305.61 - lr: 0.000089 - momentum: 0.000000
2023-10-13 14:01:37,312 ----------------------------------------------------------------------------------------------------
2023-10-13 14:01:37,312 EPOCH 5 done: loss 0.0251 - lr: 0.000089
2023-10-13 14:01:54,361 DEV : loss 0.06829117983579636 - f1-score (micro avg) 0.818
2023-10-13 14:01:54,402 saving best model
2023-10-13 14:01:57,066 ----------------------------------------------------------------------------------------------------
2023-10-13 14:02:37,096 epoch 6 - iter 77/773 - loss 0.01718220 - time (sec): 40.02 - samples/sec: 279.69 - lr: 0.000087 - momentum: 0.000000
2023-10-13 14:03:16,814 epoch 6 - iter 154/773 - loss 0.01550972 - time (sec): 79.74 - samples/sec: 303.25 - lr: 0.000085 - momentum: 0.000000
2023-10-13 14:03:56,340 epoch 6 - iter 231/773 - loss 0.01618194 - time (sec): 119.27 - samples/sec: 304.29 - lr: 0.000084 - momentum: 0.000000
2023-10-13 14:04:35,237 epoch 6 - iter 308/773 - loss 0.01557615 - time (sec): 158.17 - samples/sec: 306.65 - lr: 0.000082 - momentum: 0.000000
2023-10-13 14:05:14,712 epoch 6 - iter 385/773 - loss 0.01677557 - time (sec): 197.64 - samples/sec: 306.89 - lr: 0.000080 - momentum: 0.000000
2023-10-13 14:05:55,045 epoch 6 - iter 462/773 - loss 0.01629595 - time (sec): 237.97 - samples/sec: 309.87 - lr: 0.000078 - momentum: 0.000000
2023-10-13 14:06:34,474 epoch 6 - iter 539/773 - loss 0.01552389 - time (sec): 277.40 - samples/sec: 310.32 - lr: 0.000077 - momentum: 0.000000
2023-10-13 14:07:15,189 epoch 6 - iter 616/773 - loss 0.01710578 - time (sec): 318.12 - samples/sec: 311.28 - lr: 0.000075 - momentum: 0.000000
2023-10-13 14:07:56,193 epoch 6 - iter 693/773 - loss 0.01669410 - time (sec): 359.12 - samples/sec: 310.20 - lr: 0.000073 - momentum: 0.000000
2023-10-13 14:08:37,693 epoch 6 - iter 770/773 - loss 0.01618705 - time (sec): 400.62 - samples/sec: 308.81 - lr: 0.000071 - momentum: 0.000000
2023-10-13 14:08:39,327 ----------------------------------------------------------------------------------------------------
2023-10-13 14:08:39,327 EPOCH 6 done: loss 0.0162 - lr: 0.000071
2023-10-13 14:08:57,695 DEV : loss 0.08230733126401901 - f1-score (micro avg) 0.8056
2023-10-13 14:08:57,725 ----------------------------------------------------------------------------------------------------
2023-10-13 14:09:38,784 epoch 7 - iter 77/773 - loss 0.01104098 - time (sec): 41.06 - samples/sec: 301.10 - lr: 0.000069 - momentum: 0.000000
2023-10-13 14:10:21,690 epoch 7 - iter 154/773 - loss 0.01084070 - time (sec): 83.96 - samples/sec: 290.25 - lr: 0.000068 - momentum: 0.000000
2023-10-13 14:11:03,996 epoch 7 - iter 231/773 - loss 0.00993163 - time (sec): 126.27 - samples/sec: 289.09 - lr: 0.000066 - momentum: 0.000000
2023-10-13 14:11:45,648 epoch 7 - iter 308/773 - loss 0.00931498 - time (sec): 167.92 - samples/sec: 297.79 - lr: 0.000064 - momentum: 0.000000
2023-10-13 14:12:25,998 epoch 7 - iter 385/773 - loss 0.00990930 - time (sec): 208.27 - samples/sec: 300.74 - lr: 0.000062 - momentum: 0.000000
2023-10-13 14:13:08,002 epoch 7 - iter 462/773 - loss 0.01091458 - time (sec): 250.27 - samples/sec: 297.45 - lr: 0.000061 - momentum: 0.000000
2023-10-13 14:13:50,331 epoch 7 - iter 539/773 - loss 0.01055709 - time (sec): 292.60 - samples/sec: 295.05 - lr: 0.000059 - momentum: 0.000000
2023-10-13 14:14:31,155 epoch 7 - iter 616/773 - loss 0.01103797 - time (sec): 333.43 - samples/sec: 296.92 - lr: 0.000057 - momentum: 0.000000
2023-10-13 14:15:12,601 epoch 7 - iter 693/773 - loss 0.01095294 - time (sec): 374.87 - samples/sec: 297.76 - lr: 0.000055 - momentum: 0.000000
2023-10-13 14:15:57,007 epoch 7 - iter 770/773 - loss 0.01084300 - time (sec): 419.28 - samples/sec: 295.63 - lr: 0.000054 - momentum: 0.000000
2023-10-13 14:15:58,471 ----------------------------------------------------------------------------------------------------
2023-10-13 14:15:58,471 EPOCH 7 done: loss 0.0112 - lr: 0.000054
2023-10-13 14:16:15,743 DEV : loss 0.08712616562843323 - f1-score (micro avg) 0.8209
2023-10-13 14:16:15,772 saving best model
2023-10-13 14:16:18,511 ----------------------------------------------------------------------------------------------------
2023-10-13 14:16:58,837 epoch 8 - iter 77/773 - loss 0.00793344 - time (sec): 40.32 - samples/sec: 299.32 - lr: 0.000052 - momentum: 0.000000
2023-10-13 14:17:41,619 epoch 8 - iter 154/773 - loss 0.00868923 - time (sec): 83.10 - samples/sec: 302.77 - lr: 0.000050 - momentum: 0.000000
2023-10-13 14:18:22,789 epoch 8 - iter 231/773 - loss 0.00820784 - time (sec): 124.27 - samples/sec: 299.14 - lr: 0.000048 - momentum: 0.000000
2023-10-13 14:19:05,552 epoch 8 - iter 308/773 - loss 0.00790837 - time (sec): 167.04 - samples/sec: 291.71 - lr: 0.000046 - momentum: 0.000000
2023-10-13 14:19:44,332 epoch 8 - iter 385/773 - loss 0.00769877 - time (sec): 205.82 - samples/sec: 291.56 - lr: 0.000045 - momentum: 0.000000
2023-10-13 14:20:24,710 epoch 8 - iter 462/773 - loss 0.00752209 - time (sec): 246.19 - samples/sec: 297.01 - lr: 0.000043 - momentum: 0.000000
2023-10-13 14:21:09,307 epoch 8 - iter 539/773 - loss 0.00819953 - time (sec): 290.79 - samples/sec: 297.36 - lr: 0.000041 - momentum: 0.000000
2023-10-13 14:21:51,152 epoch 8 - iter 616/773 - loss 0.00792088 - time (sec): 332.64 - samples/sec: 298.68 - lr: 0.000039 - momentum: 0.000000
2023-10-13 14:22:31,008 epoch 8 - iter 693/773 - loss 0.00757495 - time (sec): 372.49 - samples/sec: 299.83 - lr: 0.000038 - momentum: 0.000000
2023-10-13 14:23:10,962 epoch 8 - iter 770/773 - loss 0.00734145 - time (sec): 412.45 - samples/sec: 299.81 - lr: 0.000036 - momentum: 0.000000
2023-10-13 14:23:12,536 ----------------------------------------------------------------------------------------------------
2023-10-13 14:23:12,537 EPOCH 8 done: loss 0.0073 - lr: 0.000036
2023-10-13 14:23:29,813 DEV : loss 0.0935787558555603 - f1-score (micro avg) 0.8129
2023-10-13 14:23:29,844 ----------------------------------------------------------------------------------------------------
2023-10-13 14:24:10,868 epoch 9 - iter 77/773 - loss 0.00369340 - time (sec): 41.02 - samples/sec: 304.99 - lr: 0.000034 - momentum: 0.000000
2023-10-13 14:24:52,522 epoch 9 - iter 154/773 - loss 0.00441737 - time (sec): 82.68 - samples/sec: 306.66 - lr: 0.000032 - momentum: 0.000000
2023-10-13 14:25:34,148 epoch 9 - iter 231/773 - loss 0.00418139 - time (sec): 124.30 - samples/sec: 306.22 - lr: 0.000030 - momentum: 0.000000
2023-10-13 14:26:18,115 epoch 9 - iter 308/773 - loss 0.00465322 - time (sec): 168.27 - samples/sec: 297.99 - lr: 0.000029 - momentum: 0.000000
2023-10-13 14:26:58,721 epoch 9 - iter 385/773 - loss 0.00496921 - time (sec): 208.87 - samples/sec: 301.32 - lr: 0.000027 - momentum: 0.000000
2023-10-13 14:27:41,960 epoch 9 - iter 462/773 - loss 0.00516472 - time (sec): 252.11 - samples/sec: 298.31 - lr: 0.000025 - momentum: 0.000000
2023-10-13 14:28:24,233 epoch 9 - iter 539/773 - loss 0.00499233 - time (sec): 294.39 - samples/sec: 294.84 - lr: 0.000023 - momentum: 0.000000
2023-10-13 14:29:05,825 epoch 9 - iter 616/773 - loss 0.00572259 - time (sec): 335.98 - samples/sec: 296.74 - lr: 0.000022 - momentum: 0.000000
2023-10-13 14:29:51,119 epoch 9 - iter 693/773 - loss 0.00573440 - time (sec): 381.27 - samples/sec: 293.32 - lr: 0.000020 - momentum: 0.000000
2023-10-13 14:30:33,427 epoch 9 - iter 770/773 - loss 0.00567544 - time (sec): 423.58 - samples/sec: 292.77 - lr: 0.000018 - momentum: 0.000000
2023-10-13 14:30:34,870 ----------------------------------------------------------------------------------------------------
2023-10-13 14:30:34,871 EPOCH 9 done: loss 0.0057 - lr: 0.000018
2023-10-13 14:30:52,483 DEV : loss 0.09646561741828918 - f1-score (micro avg) 0.816
2023-10-13 14:30:52,518 ----------------------------------------------------------------------------------------------------
2023-10-13 14:31:38,164 epoch 10 - iter 77/773 - loss 0.00271756 - time (sec): 45.64 - samples/sec: 269.06 - lr: 0.000016 - momentum: 0.000000
2023-10-13 14:32:20,990 epoch 10 - iter 154/773 - loss 0.00479762 - time (sec): 88.47 - samples/sec: 284.55 - lr: 0.000014 - momentum: 0.000000
2023-10-13 14:33:05,141 epoch 10 - iter 231/773 - loss 0.00480447 - time (sec): 132.62 - samples/sec: 282.47 - lr: 0.000013 - momentum: 0.000000
2023-10-13 14:33:48,679 epoch 10 - iter 308/773 - loss 0.00412616 - time (sec): 176.16 - samples/sec: 282.82 - lr: 0.000011 - momentum: 0.000000
2023-10-13 14:34:28,762 epoch 10 - iter 385/773 - loss 0.00410371 - time (sec): 216.24 - samples/sec: 287.37 - lr: 0.000009 - momentum: 0.000000
2023-10-13 14:35:08,671 epoch 10 - iter 462/773 - loss 0.00414590 - time (sec): 256.15 - samples/sec: 288.15 - lr: 0.000007 - momentum: 0.000000
2023-10-13 14:35:47,002 epoch 10 - iter 539/773 - loss 0.00433560 - time (sec): 294.48 - samples/sec: 292.35 - lr: 0.000006 - momentum: 0.000000
2023-10-13 14:36:24,903 epoch 10 - iter 616/773 - loss 0.00453624 - time (sec): 332.38 - samples/sec: 294.15 - lr: 0.000004 - momentum: 0.000000
2023-10-13 14:37:04,210 epoch 10 - iter 693/773 - loss 0.00469727 - time (sec): 371.69 - samples/sec: 299.05 - lr: 0.000002 - momentum: 0.000000
2023-10-13 14:37:44,090 epoch 10 - iter 770/773 - loss 0.00452566 - time (sec): 411.57 - samples/sec: 300.46 - lr: 0.000000 - momentum: 0.000000
2023-10-13 14:37:45,747 ----------------------------------------------------------------------------------------------------
2023-10-13 14:37:45,747 EPOCH 10 done: loss 0.0045 - lr: 0.000000
2023-10-13 14:38:02,933 DEV : loss 0.09914136677980423 - f1-score (micro avg) 0.8129
2023-10-13 14:38:03,928 ----------------------------------------------------------------------------------------------------
2023-10-13 14:38:03,930 Loading model from best epoch ...
2023-10-13 14:38:08,507 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-13 14:39:03,429 
Results:
- F-score (micro) 0.8076
- F-score (macro) 0.725
- Accuracy 0.6957

By class:
              precision    recall  f1-score   support

         LOC     0.8396    0.8742    0.8566       946
    BUILDING     0.5746    0.5622    0.5683       185
      STREET     0.7031    0.8036    0.7500        56

   micro avg     0.7935    0.8222    0.8076      1187
   macro avg     0.7058    0.7466    0.7250      1187
weighted avg     0.7919    0.8222    0.8066      1187
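The micro-avg F1 row can be sanity-checked from its own precision and recall via the harmonic mean; a tiny check using the values reported above:

```python
# F1 is the harmonic mean of precision and recall (micro avg row above).
p, r = 0.7935, 0.8222
f1 = 2 * p * r / (p + r)
print(round(f1, 4))  # 0.8076, matching the table and the "F-score (micro)" line
```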
2023-10-13 14:39:03,429 ----------------------------------------------------------------------------------------------------
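The best-model.pt checkpoint evaluated above can be reused for inference with the standard Flair API. A brief sketch, assuming the label type is "ner" (the example sentence is made up):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint saved under the training base path reported earlier in this log.
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("He lived on Baker Street in London.")  # made-up example
tagger.predict(sentence)

# Print predicted LOC / BUILDING / STREET spans with their confidence scores.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```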