2023-10-13 09:32:31,304 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,306 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 09:32:31,306 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,307 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 09:32:31,307 ----------------------------------------------------------------------------------------------------
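For reference, a corpus with this composition can be loaded through Flair's HIPE-2022 support. The sketch below is an assumption based on the cache path shown above (ner_hipe_2022/v2.1/topres19th/en), not a record of the exact loader call used for this run; the NER_HIPE_2022 constructor arguments in particular are assumed.

```python
# Hedged sketch: load the HIPE-2022 TopRes19th English NER corpus that the cache
# path above points to. The NER_HIPE_2022 loader and its dataset_name/language
# arguments are assumptions about Flair's dataset API; a ColumnCorpus over the
# same files would serve equally well.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
print(corpus)  # expected: 6183 train + 680 dev + 2113 test sentences

# 13-tag BIOES label space: O plus S/B/E/I variants of LOC, BUILDING, STREET
label_dict = corpus.make_label_dictionary(label_type="ner")
```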
2023-10-13 09:32:31,307 Train: 6183 sentences
2023-10-13 09:32:31,307 (train_with_dev=False, train_with_test=False)
2023-10-13 09:32:31,307 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,307 Training Params:
2023-10-13 09:32:31,307 - learning_rate: "0.00015"
2023-10-13 09:32:31,307 - mini_batch_size: "4"
2023-10-13 09:32:31,307 - max_epochs: "10"
2023-10-13 09:32:31,307 - shuffle: "True"
2023-10-13 09:32:31,308 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,308 Plugins:
2023-10-13 09:32:31,308 - TensorboardLogger
2023-10-13 09:32:31,308 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:32:31,308 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,308 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:32:31,308 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:32:31,308 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,308 Computation:
2023-10-13 09:32:31,308 - compute on device: cuda:0
2023-10-13 09:32:31,308 - embedding storage: none
2023-10-13 09:32:31,308 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,308 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
2023-10-13 09:32:31,309 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:31,309 ----------------------------------------------------------------------------------------------------
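The parameters above (learning rate 0.00015, mini-batch size 4, 10 epochs, linear warmup over 10% of the steps, last-layer first-subtoken pooling, no CRF) fit together into a training run roughly as sketched below. This is not the exact script behind this log: Flair's generic TransformerWordEmbeddings stands in for the ByT5Embeddings wrapper from the model dump, and the checkpoint name is inferred from the base path.

```python
# Hedged sketch of a comparable fine-tuning run, assembled from the logged
# hyperparameters; names and arguments not present in the log are assumptions.
from flair.datasets import NER_HIPE_2022  # assumed loader, see earlier sketch
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # inferred from base path
    layers="-1",               # "layers-1" in the base path: last encoder layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,             # unused without an RNN; kept for the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,               # "crfFalse": plain linear head + CrossEntropyLoss
    use_rnn=False,
    reproject_embeddings=False,  # keeps the single 1472 -> 13 linear layer from the dump
)

base_path = (
    "hmbench-topres19th/en-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    base_path,
    learning_rate=0.00015,
    mini_batch_size=4,
    max_epochs=10,
    # recent Flair versions default to a linear schedule with warmup_fraction=0.1,
    # matching the LinearScheduler plugin listed above
)
```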
2023-10-13 09:32:31,309 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 09:33:15,220 epoch 1 - iter 154/1546 - loss 2.58089173 - time (sec): 43.91 - samples/sec: 283.06 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:33:57,641 epoch 1 - iter 308/1546 - loss 2.49475826 - time (sec): 86.33 - samples/sec: 283.99 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:34:40,223 epoch 1 - iter 462/1546 - loss 2.24605833 - time (sec): 128.91 - samples/sec: 280.80 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:35:23,021 epoch 1 - iter 616/1546 - loss 1.91937604 - time (sec): 171.71 - samples/sec: 288.08 - lr: 0.000060 - momentum: 0.000000
2023-10-13 09:36:05,203 epoch 1 - iter 770/1546 - loss 1.63450044 - time (sec): 213.89 - samples/sec: 287.57 - lr: 0.000075 - momentum: 0.000000
2023-10-13 09:36:48,056 epoch 1 - iter 924/1546 - loss 1.40442989 - time (sec): 256.74 - samples/sec: 286.70 - lr: 0.000090 - momentum: 0.000000
2023-10-13 09:37:31,857 epoch 1 - iter 1078/1546 - loss 1.23099071 - time (sec): 300.55 - samples/sec: 287.02 - lr: 0.000104 - momentum: 0.000000
2023-10-13 09:38:14,433 epoch 1 - iter 1232/1546 - loss 1.10986676 - time (sec): 343.12 - samples/sec: 284.50 - lr: 0.000119 - momentum: 0.000000
2023-10-13 09:38:58,506 epoch 1 - iter 1386/1546 - loss 0.99545808 - time (sec): 387.20 - samples/sec: 287.13 - lr: 0.000134 - momentum: 0.000000
2023-10-13 09:39:41,758 epoch 1 - iter 1540/1546 - loss 0.90547295 - time (sec): 430.45 - samples/sec: 287.74 - lr: 0.000149 - momentum: 0.000000
2023-10-13 09:39:43,282 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:43,282 EPOCH 1 done: loss 0.9029 - lr: 0.000149
2023-10-13 09:39:59,716 DEV : loss 0.08553236722946167 - f1-score (micro avg) 0.5611
2023-10-13 09:39:59,746 saving best model
2023-10-13 09:40:00,659 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:42,617 epoch 2 - iter 154/1546 - loss 0.12200903 - time (sec): 41.96 - samples/sec: 278.77 - lr: 0.000148 - momentum: 0.000000
2023-10-13 09:41:25,024 epoch 2 - iter 308/1546 - loss 0.11893143 - time (sec): 84.36 - samples/sec: 279.44 - lr: 0.000147 - momentum: 0.000000
2023-10-13 09:42:09,579 epoch 2 - iter 462/1546 - loss 0.11299889 - time (sec): 128.92 - samples/sec: 283.78 - lr: 0.000145 - momentum: 0.000000
2023-10-13 09:42:52,516 epoch 2 - iter 616/1546 - loss 0.10688952 - time (sec): 171.85 - samples/sec: 287.48 - lr: 0.000143 - momentum: 0.000000
2023-10-13 09:43:35,911 epoch 2 - iter 770/1546 - loss 0.10165840 - time (sec): 215.25 - samples/sec: 288.56 - lr: 0.000142 - momentum: 0.000000
2023-10-13 09:44:18,065 epoch 2 - iter 924/1546 - loss 0.09938803 - time (sec): 257.40 - samples/sec: 285.76 - lr: 0.000140 - momentum: 0.000000
2023-10-13 09:45:01,291 epoch 2 - iter 1078/1546 - loss 0.09941699 - time (sec): 300.63 - samples/sec: 284.82 - lr: 0.000138 - momentum: 0.000000
2023-10-13 09:45:44,201 epoch 2 - iter 1232/1546 - loss 0.09981071 - time (sec): 343.54 - samples/sec: 284.64 - lr: 0.000137 - momentum: 0.000000
2023-10-13 09:46:28,346 epoch 2 - iter 1386/1546 - loss 0.09642655 - time (sec): 387.69 - samples/sec: 287.09 - lr: 0.000135 - momentum: 0.000000
2023-10-13 09:47:11,211 epoch 2 - iter 1540/1546 - loss 0.09319186 - time (sec): 430.55 - samples/sec: 287.59 - lr: 0.000133 - momentum: 0.000000
2023-10-13 09:47:12,875 ----------------------------------------------------------------------------------------------------
2023-10-13 09:47:12,876 EPOCH 2 done: loss 0.0930 - lr: 0.000133
2023-10-13 09:47:30,832 DEV : loss 0.05797132849693298 - f1-score (micro avg) 0.7951
2023-10-13 09:47:30,866 saving best model
2023-10-13 09:47:33,588 ----------------------------------------------------------------------------------------------------
2023-10-13 09:48:17,153 epoch 3 - iter 154/1546 - loss 0.07139060 - time (sec): 43.56 - samples/sec: 279.60 - lr: 0.000132 - momentum: 0.000000
2023-10-13 09:49:00,659 epoch 3 - iter 308/1546 - loss 0.06425129 - time (sec): 87.06 - samples/sec: 279.79 - lr: 0.000130 - momentum: 0.000000
2023-10-13 09:49:43,640 epoch 3 - iter 462/1546 - loss 0.05602698 - time (sec): 130.05 - samples/sec: 287.03 - lr: 0.000128 - momentum: 0.000000
2023-10-13 09:50:25,253 epoch 3 - iter 616/1546 - loss 0.05576436 - time (sec): 171.66 - samples/sec: 281.80 - lr: 0.000127 - momentum: 0.000000
2023-10-13 09:51:08,706 epoch 3 - iter 770/1546 - loss 0.05520624 - time (sec): 215.11 - samples/sec: 282.93 - lr: 0.000125 - momentum: 0.000000
2023-10-13 09:51:51,867 epoch 3 - iter 924/1546 - loss 0.05690583 - time (sec): 258.27 - samples/sec: 284.51 - lr: 0.000123 - momentum: 0.000000
2023-10-13 09:52:34,906 epoch 3 - iter 1078/1546 - loss 0.05578666 - time (sec): 301.31 - samples/sec: 286.31 - lr: 0.000122 - momentum: 0.000000
2023-10-13 09:53:18,728 epoch 3 - iter 1232/1546 - loss 0.05491332 - time (sec): 345.13 - samples/sec: 286.89 - lr: 0.000120 - momentum: 0.000000
2023-10-13 09:54:02,189 epoch 3 - iter 1386/1546 - loss 0.05522112 - time (sec): 388.59 - samples/sec: 288.15 - lr: 0.000118 - momentum: 0.000000
2023-10-13 09:54:44,026 epoch 3 - iter 1540/1546 - loss 0.05495390 - time (sec): 430.43 - samples/sec: 287.77 - lr: 0.000117 - momentum: 0.000000
2023-10-13 09:54:45,676 ----------------------------------------------------------------------------------------------------
2023-10-13 09:54:45,677 EPOCH 3 done: loss 0.0548 - lr: 0.000117
2023-10-13 09:55:03,127 DEV : loss 0.059545643627643585 - f1-score (micro avg) 0.7921
2023-10-13 09:55:03,161 ----------------------------------------------------------------------------------------------------
2023-10-13 09:55:47,609 epoch 4 - iter 154/1546 - loss 0.03703647 - time (sec): 44.45 - samples/sec: 284.50 - lr: 0.000115 - momentum: 0.000000
2023-10-13 09:56:30,722 epoch 4 - iter 308/1546 - loss 0.03106991 - time (sec): 87.56 - samples/sec: 283.25 - lr: 0.000113 - momentum: 0.000000
2023-10-13 09:57:14,974 epoch 4 - iter 462/1546 - loss 0.02898605 - time (sec): 131.81 - samples/sec: 285.23 - lr: 0.000112 - momentum: 0.000000
2023-10-13 09:57:58,608 epoch 4 - iter 616/1546 - loss 0.03074237 - time (sec): 175.44 - samples/sec: 276.89 - lr: 0.000110 - momentum: 0.000000
2023-10-13 09:58:43,273 epoch 4 - iter 770/1546 - loss 0.03301947 - time (sec): 220.11 - samples/sec: 277.51 - lr: 0.000108 - momentum: 0.000000
2023-10-13 09:59:30,913 epoch 4 - iter 924/1546 - loss 0.03399197 - time (sec): 267.75 - samples/sec: 275.87 - lr: 0.000107 - momentum: 0.000000
2023-10-13 10:00:17,635 epoch 4 - iter 1078/1546 - loss 0.03359479 - time (sec): 314.47 - samples/sec: 275.40 - lr: 0.000105 - momentum: 0.000000
2023-10-13 10:01:04,322 epoch 4 - iter 1232/1546 - loss 0.03330922 - time (sec): 361.16 - samples/sec: 274.56 - lr: 0.000103 - momentum: 0.000000
2023-10-13 10:01:50,860 epoch 4 - iter 1386/1546 - loss 0.03362633 - time (sec): 407.70 - samples/sec: 274.21 - lr: 0.000102 - momentum: 0.000000
2023-10-13 10:02:37,520 epoch 4 - iter 1540/1546 - loss 0.03347558 - time (sec): 454.36 - samples/sec: 272.59 - lr: 0.000100 - momentum: 0.000000
2023-10-13 10:02:39,239 ----------------------------------------------------------------------------------------------------
2023-10-13 10:02:39,239 EPOCH 4 done: loss 0.0334 - lr: 0.000100
2023-10-13 10:02:57,010 DEV : loss 0.07305894047021866 - f1-score (micro avg) 0.7929
2023-10-13 10:02:57,040 ----------------------------------------------------------------------------------------------------
2023-10-13 10:03:44,621 epoch 5 - iter 154/1546 - loss 0.01538529 - time (sec): 47.58 - samples/sec: 265.85 - lr: 0.000098 - momentum: 0.000000
2023-10-13 10:04:30,440 epoch 5 - iter 308/1546 - loss 0.02039276 - time (sec): 93.40 - samples/sec: 275.59 - lr: 0.000097 - momentum: 0.000000
2023-10-13 10:05:13,779 epoch 5 - iter 462/1546 - loss 0.02114600 - time (sec): 136.74 - samples/sec: 273.31 - lr: 0.000095 - momentum: 0.000000
2023-10-13 10:05:55,130 epoch 5 - iter 616/1546 - loss 0.02172052 - time (sec): 178.09 - samples/sec: 280.56 - lr: 0.000093 - momentum: 0.000000
2023-10-13 10:06:35,667 epoch 5 - iter 770/1546 - loss 0.02047965 - time (sec): 218.62 - samples/sec: 283.38 - lr: 0.000092 - momentum: 0.000000
2023-10-13 10:07:16,245 epoch 5 - iter 924/1546 - loss 0.02169642 - time (sec): 259.20 - samples/sec: 286.55 - lr: 0.000090 - momentum: 0.000000
2023-10-13 10:07:59,279 epoch 5 - iter 1078/1546 - loss 0.02168417 - time (sec): 302.24 - samples/sec: 285.58 - lr: 0.000088 - momentum: 0.000000
2023-10-13 10:08:42,699 epoch 5 - iter 1232/1546 - loss 0.02271991 - time (sec): 345.66 - samples/sec: 284.75 - lr: 0.000087 - momentum: 0.000000
2023-10-13 10:09:25,956 epoch 5 - iter 1386/1546 - loss 0.02322415 - time (sec): 388.91 - samples/sec: 285.85 - lr: 0.000085 - momentum: 0.000000
2023-10-13 10:10:09,101 epoch 5 - iter 1540/1546 - loss 0.02247912 - time (sec): 432.06 - samples/sec: 286.30 - lr: 0.000083 - momentum: 0.000000
2023-10-13 10:10:10,819 ----------------------------------------------------------------------------------------------------
2023-10-13 10:10:10,820 EPOCH 5 done: loss 0.0225 - lr: 0.000083
2023-10-13 10:10:27,483 DEV : loss 0.08502887934446335 - f1-score (micro avg) 0.7992
2023-10-13 10:10:27,511 saving best model
2023-10-13 10:10:30,134 ----------------------------------------------------------------------------------------------------
2023-10-13 10:11:15,341 epoch 6 - iter 154/1546 - loss 0.01540326 - time (sec): 45.20 - samples/sec: 291.20 - lr: 0.000082 - momentum: 0.000000
2023-10-13 10:11:59,889 epoch 6 - iter 308/1546 - loss 0.01726855 - time (sec): 89.75 - samples/sec: 287.52 - lr: 0.000080 - momentum: 0.000000
2023-10-13 10:12:45,995 epoch 6 - iter 462/1546 - loss 0.01478350 - time (sec): 135.86 - samples/sec: 285.80 - lr: 0.000078 - momentum: 0.000000
2023-10-13 10:13:28,634 epoch 6 - iter 616/1546 - loss 0.01486102 - time (sec): 178.50 - samples/sec: 284.86 - lr: 0.000077 - momentum: 0.000000
2023-10-13 10:14:13,366 epoch 6 - iter 770/1546 - loss 0.01271067 - time (sec): 223.23 - samples/sec: 283.34 - lr: 0.000075 - momentum: 0.000000
2023-10-13 10:14:56,982 epoch 6 - iter 924/1546 - loss 0.01374191 - time (sec): 266.84 - samples/sec: 277.93 - lr: 0.000073 - momentum: 0.000000
2023-10-13 10:15:40,790 epoch 6 - iter 1078/1546 - loss 0.01309584 - time (sec): 310.65 - samples/sec: 279.56 - lr: 0.000072 - momentum: 0.000000
2023-10-13 10:16:24,499 epoch 6 - iter 1232/1546 - loss 0.01254562 - time (sec): 354.36 - samples/sec: 278.96 - lr: 0.000070 - momentum: 0.000000
2023-10-13 10:17:08,743 epoch 6 - iter 1386/1546 - loss 0.01378174 - time (sec): 398.60 - samples/sec: 277.93 - lr: 0.000068 - momentum: 0.000000
2023-10-13 10:17:53,163 epoch 6 - iter 1540/1546 - loss 0.01418016 - time (sec): 443.02 - samples/sec: 279.19 - lr: 0.000067 - momentum: 0.000000
2023-10-13 10:17:54,953 ----------------------------------------------------------------------------------------------------
2023-10-13 10:17:54,953 EPOCH 6 done: loss 0.0141 - lr: 0.000067
2023-10-13 10:18:12,876 DEV : loss 0.09621559828519821 - f1-score (micro avg) 0.7873
2023-10-13 10:18:12,919 ----------------------------------------------------------------------------------------------------
2023-10-13 10:18:57,224 epoch 7 - iter 154/1546 - loss 0.00910352 - time (sec): 44.30 - samples/sec: 265.40 - lr: 0.000065 - momentum: 0.000000
2023-10-13 10:19:42,915 epoch 7 - iter 308/1546 - loss 0.01161357 - time (sec): 89.99 - samples/sec: 277.88 - lr: 0.000063 - momentum: 0.000000
2023-10-13 10:20:28,122 epoch 7 - iter 462/1546 - loss 0.01227135 - time (sec): 135.20 - samples/sec: 272.24 - lr: 0.000062 - momentum: 0.000000
2023-10-13 10:21:12,787 epoch 7 - iter 616/1546 - loss 0.01136743 - time (sec): 179.87 - samples/sec: 270.08 - lr: 0.000060 - momentum: 0.000000
2023-10-13 10:21:56,983 epoch 7 - iter 770/1546 - loss 0.01071208 - time (sec): 224.06 - samples/sec: 266.68 - lr: 0.000058 - momentum: 0.000000
2023-10-13 10:22:42,738 epoch 7 - iter 924/1546 - loss 0.01094035 - time (sec): 269.82 - samples/sec: 269.52 - lr: 0.000057 - momentum: 0.000000
2023-10-13 10:23:28,286 epoch 7 - iter 1078/1546 - loss 0.01073439 - time (sec): 315.36 - samples/sec: 273.01 - lr: 0.000055 - momentum: 0.000000
2023-10-13 10:24:12,816 epoch 7 - iter 1232/1546 - loss 0.01035952 - time (sec): 359.89 - samples/sec: 274.29 - lr: 0.000053 - momentum: 0.000000
2023-10-13 10:24:58,624 epoch 7 - iter 1386/1546 - loss 0.00984261 - time (sec): 405.70 - samples/sec: 277.29 - lr: 0.000052 - momentum: 0.000000
2023-10-13 10:25:42,975 epoch 7 - iter 1540/1546 - loss 0.00997254 - time (sec): 450.05 - samples/sec: 275.15 - lr: 0.000050 - momentum: 0.000000
2023-10-13 10:25:44,642 ----------------------------------------------------------------------------------------------------
2023-10-13 10:25:44,642 EPOCH 7 done: loss 0.0100 - lr: 0.000050
2023-10-13 10:26:02,375 DEV : loss 0.10020222514867783 - f1-score (micro avg) 0.7886
2023-10-13 10:26:02,405 ----------------------------------------------------------------------------------------------------
2023-10-13 10:26:47,796 epoch 8 - iter 154/1546 - loss 0.01244001 - time (sec): 45.39 - samples/sec: 258.30 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:27:31,481 epoch 8 - iter 308/1546 - loss 0.00842998 - time (sec): 89.07 - samples/sec: 271.57 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:28:17,362 epoch 8 - iter 462/1546 - loss 0.00627451 - time (sec): 134.95 - samples/sec: 270.72 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:29:01,594 epoch 8 - iter 616/1546 - loss 0.00697601 - time (sec): 179.19 - samples/sec: 271.98 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:29:46,681 epoch 8 - iter 770/1546 - loss 0.00681597 - time (sec): 224.27 - samples/sec: 274.03 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:30:31,390 epoch 8 - iter 924/1546 - loss 0.00609977 - time (sec): 268.98 - samples/sec: 273.00 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:31:17,039 epoch 8 - iter 1078/1546 - loss 0.00559074 - time (sec): 314.63 - samples/sec: 273.38 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:32:01,870 epoch 8 - iter 1232/1546 - loss 0.00526804 - time (sec): 359.46 - samples/sec: 274.25 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:32:47,288 epoch 8 - iter 1386/1546 - loss 0.00531994 - time (sec): 404.88 - samples/sec: 275.89 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:33:31,162 epoch 8 - iter 1540/1546 - loss 0.00552273 - time (sec): 448.75 - samples/sec: 275.71 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:33:32,970 ----------------------------------------------------------------------------------------------------
2023-10-13 10:33:32,970 EPOCH 8 done: loss 0.0055 - lr: 0.000033
2023-10-13 10:33:50,993 DEV : loss 0.11565513908863068 - f1-score (micro avg) 0.7714
2023-10-13 10:33:51,025 ----------------------------------------------------------------------------------------------------
2023-10-13 10:34:35,062 epoch 9 - iter 154/1546 - loss 0.00353921 - time (sec): 44.03 - samples/sec: 288.53 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:35:19,747 epoch 9 - iter 308/1546 - loss 0.00481950 - time (sec): 88.72 - samples/sec: 293.87 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:36:02,894 epoch 9 - iter 462/1546 - loss 0.00429851 - time (sec): 131.87 - samples/sec: 287.58 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:36:47,612 epoch 9 - iter 616/1546 - loss 0.00413043 - time (sec): 176.58 - samples/sec: 287.83 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:37:30,752 epoch 9 - iter 770/1546 - loss 0.00537386 - time (sec): 219.72 - samples/sec: 282.70 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:38:15,023 epoch 9 - iter 924/1546 - loss 0.00487878 - time (sec): 264.00 - samples/sec: 282.14 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:38:59,330 epoch 9 - iter 1078/1546 - loss 0.00467215 - time (sec): 308.30 - samples/sec: 281.03 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:39:43,130 epoch 9 - iter 1232/1546 - loss 0.00481155 - time (sec): 352.10 - samples/sec: 281.30 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:40:27,690 epoch 9 - iter 1386/1546 - loss 0.00475832 - time (sec): 396.66 - samples/sec: 280.66 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:41:12,637 epoch 9 - iter 1540/1546 - loss 0.00468018 - time (sec): 441.61 - samples/sec: 280.11 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:41:14,448 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:14,449 EPOCH 9 done: loss 0.0047 - lr: 0.000017
2023-10-13 10:41:31,579 DEV : loss 0.11798277497291565 - f1-score (micro avg) 0.7903
2023-10-13 10:41:31,609 ----------------------------------------------------------------------------------------------------
2023-10-13 10:42:16,010 epoch 10 - iter 154/1546 - loss 0.00481621 - time (sec): 44.40 - samples/sec: 269.92 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:43:02,187 epoch 10 - iter 308/1546 - loss 0.00357679 - time (sec): 90.58 - samples/sec: 265.04 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:43:51,217 epoch 10 - iter 462/1546 - loss 0.00374515 - time (sec): 139.61 - samples/sec: 269.97 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:44:37,368 epoch 10 - iter 616/1546 - loss 0.00379115 - time (sec): 185.76 - samples/sec: 267.93 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:45:22,324 epoch 10 - iter 770/1546 - loss 0.00346057 - time (sec): 230.71 - samples/sec: 274.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:46:06,661 epoch 10 - iter 924/1546 - loss 0.00318152 - time (sec): 275.05 - samples/sec: 272.36 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:46:51,187 epoch 10 - iter 1078/1546 - loss 0.00287153 - time (sec): 319.58 - samples/sec: 270.85 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:47:35,782 epoch 10 - iter 1232/1546 - loss 0.00283244 - time (sec): 364.17 - samples/sec: 271.84 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:48:20,263 epoch 10 - iter 1386/1546 - loss 0.00268580 - time (sec): 408.65 - samples/sec: 273.03 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:49:04,820 epoch 10 - iter 1540/1546 - loss 0.00279265 - time (sec): 453.21 - samples/sec: 273.24 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:49:06,424 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:06,424 EPOCH 10 done: loss 0.0028 - lr: 0.000000
2023-10-13 10:49:24,834 DEV : loss 0.11898898333311081 - f1-score (micro avg) 0.7871
2023-10-13 10:49:25,906 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:25,908 Loading model from best epoch ...
2023-10-13 10:49:30,245 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-13 10:50:25,764
Results:
- F-score (micro) 0.8119
- F-score (macro) 0.7384
- Accuracy 0.7023
By class:
              precision    recall  f1-score   support

         LOC     0.8253    0.8742    0.8491       946
    BUILDING     0.6571    0.6216    0.6389       185
      STREET     0.6769    0.7857    0.7273        56

   micro avg     0.7939    0.8307    0.8119      1187
   macro avg     0.7198    0.7605    0.7384      1187
weighted avg     0.7921    0.8307    0.8106      1187
2023-10-13 10:50:25,764 ----------------------------------------------------------------------------------------------------
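For completeness, the best checkpoint evaluated above (best-model.pt under the training base path) can be loaded for inference with Flair's standard API; a brief hedged sketch follows, and the example sentence in it is purely illustrative.

```python
# Hedged sketch: load the saved best-model.pt and tag a sentence with the
# LOC / BUILDING / STREET labels listed in the tag dictionary above.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1/"
    "best-model.pt"
)

sentence = Sentence("The new building on Oxford Street stands close to Hyde Park .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```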