2023-10-18 22:08:20,553 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,553 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 22:08:20,553 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Train: 5777 sentences
2023-10-18 22:08:20,554 (train_with_dev=False, train_with_test=False)
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
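[Editor's note] The sections above describe the full setup: a 2-layer, 128-dim historic multilingual BERT feeding a linear tag head, trained on the Dutch split of the ICDAR Europeana NER corpus. A minimal sketch of how this setup could be rebuilt in Flair, assuming the dbmdz/bert-tiny-historic-multilingual-cased checkpoint and the settings encoded in the base-path name further down (last layer only, first-subtoken pooling, no CRF); the exact loader arguments are assumptions, not taken from this log:

from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Dutch split of the ICDAR Europeana NER corpus (5777/722/723 sentences above);
# the language argument is an assumption in this sketch.
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Tiny historic multilingual BERT: 2 transformer layers, hidden size 128.
embeddings = TransformerWordEmbeddings(
    "dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",               # "layers-1" in the base path: last layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

# Linear(128 -> 13) head on top of the embeddings, no CRF and no RNN ("crfFalse").
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)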
2023-10-18 22:08:20,554 Training Params:
2023-10-18 22:08:20,554 - learning_rate: "3e-05"
2023-10-18 22:08:20,554 - mini_batch_size: "8"
2023-10-18 22:08:20,554 - max_epochs: "10"
2023-10-18 22:08:20,554 - shuffle: "True"
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Plugins:
2023-10-18 22:08:20,554 - TensorboardLogger
2023-10-18 22:08:20,554 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:08:20,554 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Computation:
2023-10-18 22:08:20,554 - compute on device: cuda:0
2023-10-18 22:08:20,554 - embedding storage: none
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
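[Editor's note] The training parameters and plugins above correspond to Flair's fine-tuning entry point. A minimal sketch, assuming ModelTrainer.fine_tune, which by default uses AdamW with a linear warmup/decay schedule (consistent with the LinearScheduler plugin and the momentum 0.0 reported in the per-iteration lines below):

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# Hyperparameters as logged above; warmup fraction 0.1 and the TensorBoard
# logger are left to fine_tune's defaults / plugins in this sketch.
trainer.fine_tune(
    "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)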
2023-10-18 22:08:20,554 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:20,554 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:08:22,339 epoch 1 - iter 72/723 - loss 3.19628147 - time (sec): 1.78 - samples/sec: 9675.62 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:08:24,206 epoch 1 - iter 144/723 - loss 2.98886129 - time (sec): 3.65 - samples/sec: 9785.27 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:08:26,021 epoch 1 - iter 216/723 - loss 2.66211908 - time (sec): 5.47 - samples/sec: 9773.41 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:08:27,869 epoch 1 - iter 288/723 - loss 2.30691520 - time (sec): 7.31 - samples/sec: 9694.54 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:08:29,724 epoch 1 - iter 360/723 - loss 1.94819460 - time (sec): 9.17 - samples/sec: 9712.82 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:08:31,493 epoch 1 - iter 432/723 - loss 1.67747636 - time (sec): 10.94 - samples/sec: 9797.48 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:08:33,327 epoch 1 - iter 504/723 - loss 1.48735944 - time (sec): 12.77 - samples/sec: 9773.89 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:08:35,091 epoch 1 - iter 576/723 - loss 1.34523933 - time (sec): 14.54 - samples/sec: 9791.10 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:08:36,824 epoch 1 - iter 648/723 - loss 1.23775011 - time (sec): 16.27 - samples/sec: 9748.59 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:08:38,548 epoch 1 - iter 720/723 - loss 1.14588291 - time (sec): 17.99 - samples/sec: 9754.09 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:08:38,637 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:38,637 EPOCH 1 done: loss 1.1433 - lr: 0.000030
2023-10-18 22:08:39,957 DEV : loss 0.36958980560302734 - f1-score (micro avg) 0.0
2023-10-18 22:08:39,971 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:41,819 epoch 2 - iter 72/723 - loss 0.27018856 - time (sec): 1.85 - samples/sec: 10070.33 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:08:43,579 epoch 2 - iter 144/723 - loss 0.28145989 - time (sec): 3.61 - samples/sec: 9918.45 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:08:45,396 epoch 2 - iter 216/723 - loss 0.28554002 - time (sec): 5.42 - samples/sec: 9927.00 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:08:47,278 epoch 2 - iter 288/723 - loss 0.27514712 - time (sec): 7.31 - samples/sec: 9805.31 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:08:49,104 epoch 2 - iter 360/723 - loss 0.26170040 - time (sec): 9.13 - samples/sec: 9725.21 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:08:50,949 epoch 2 - iter 432/723 - loss 0.25484094 - time (sec): 10.98 - samples/sec: 9792.59 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:08:52,671 epoch 2 - iter 504/723 - loss 0.25302933 - time (sec): 12.70 - samples/sec: 9736.45 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:08:54,386 epoch 2 - iter 576/723 - loss 0.24981194 - time (sec): 14.42 - samples/sec: 9726.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:08:56,178 epoch 2 - iter 648/723 - loss 0.24711782 - time (sec): 16.21 - samples/sec: 9733.59 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:08:57,963 epoch 2 - iter 720/723 - loss 0.24099947 - time (sec): 17.99 - samples/sec: 9770.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:08:58,019 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:58,020 EPOCH 2 done: loss 0.2412 - lr: 0.000027
2023-10-18 22:09:00,129 DEV : loss 0.24794618785381317 - f1-score (micro avg) 0.1606
2023-10-18 22:09:00,145 saving best model
2023-10-18 22:09:00,182 ----------------------------------------------------------------------------------------------------
2023-10-18 22:09:02,154 epoch 3 - iter 72/723 - loss 0.21367677 - time (sec): 1.97 - samples/sec: 9104.26 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:09:03,878 epoch 3 - iter 144/723 - loss 0.21394777 - time (sec): 3.70 - samples/sec: 9404.23 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:09:05,631 epoch 3 - iter 216/723 - loss 0.20709447 - time (sec): 5.45 - samples/sec: 9586.09 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:09:07,509 epoch 3 - iter 288/723 - loss 0.19573277 - time (sec): 7.33 - samples/sec: 9663.93 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:09:09,309 epoch 3 - iter 360/723 - loss 0.19576608 - time (sec): 9.13 - samples/sec: 9638.03 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:09:11,120 epoch 3 - iter 432/723 - loss 0.19557293 - time (sec): 10.94 - samples/sec: 9644.08 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:09:12,825 epoch 3 - iter 504/723 - loss 0.19629812 - time (sec): 12.64 - samples/sec: 9660.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:09:14,649 epoch 3 - iter 576/723 - loss 0.19836773 - time (sec): 14.47 - samples/sec: 9671.57 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:09:16,513 epoch 3 - iter 648/723 - loss 0.19587120 - time (sec): 16.33 - samples/sec: 9673.08 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:09:18,375 epoch 3 - iter 720/723 - loss 0.19640335 - time (sec): 18.19 - samples/sec: 9661.70 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:09:18,427 ----------------------------------------------------------------------------------------------------
2023-10-18 22:09:18,427 EPOCH 3 done: loss 0.1963 - lr: 0.000023
2023-10-18 22:09:20,180 DEV : loss 0.23398783802986145 - f1-score (micro avg) 0.2566
2023-10-18 22:09:20,195 saving best model
2023-10-18 22:09:20,231 ----------------------------------------------------------------------------------------------------
2023-10-18 22:09:22,057 epoch 4 - iter 72/723 - loss 0.17513429 - time (sec): 1.82 - samples/sec: 9662.51 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:09:23,842 epoch 4 - iter 144/723 - loss 0.17318933 - time (sec): 3.61 - samples/sec: 9551.33 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:09:25,544 epoch 4 - iter 216/723 - loss 0.18058374 - time (sec): 5.31 - samples/sec: 9862.22 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:09:27,220 epoch 4 - iter 288/723 - loss 0.17700677 - time (sec): 6.99 - samples/sec: 10223.72 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:09:28,987 epoch 4 - iter 360/723 - loss 0.17605842 - time (sec): 8.75 - samples/sec: 10119.42 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:09:30,862 epoch 4 - iter 432/723 - loss 0.18006832 - time (sec): 10.63 - samples/sec: 10053.19 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:09:32,700 epoch 4 - iter 504/723 - loss 0.17871246 - time (sec): 12.47 - samples/sec: 10026.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:09:34,452 epoch 4 - iter 576/723 - loss 0.17713673 - time (sec): 14.22 - samples/sec: 10011.82 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:09:36,217 epoch 4 - iter 648/723 - loss 0.17726623 - time (sec): 15.98 - samples/sec: 9931.35 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:09:37,995 epoch 4 - iter 720/723 - loss 0.18135238 - time (sec): 17.76 - samples/sec: 9890.23 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:09:38,053 ----------------------------------------------------------------------------------------------------
2023-10-18 22:09:38,053 EPOCH 4 done: loss 0.1813 - lr: 0.000020
2023-10-18 22:09:40,150 DEV : loss 0.21179497241973877 - f1-score (micro avg) 0.4558
2023-10-18 22:09:40,164 saving best model
2023-10-18 22:09:40,198 ----------------------------------------------------------------------------------------------------
2023-10-18 22:09:42,041 epoch 5 - iter 72/723 - loss 0.19369549 - time (sec): 1.84 - samples/sec: 9869.29 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:09:43,801 epoch 5 - iter 144/723 - loss 0.18409304 - time (sec): 3.60 - samples/sec: 9974.14 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:09:45,514 epoch 5 - iter 216/723 - loss 0.18004779 - time (sec): 5.32 - samples/sec: 9724.56 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:09:47,330 epoch 5 - iter 288/723 - loss 0.17894839 - time (sec): 7.13 - samples/sec: 9574.37 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:09:49,078 epoch 5 - iter 360/723 - loss 0.17522117 - time (sec): 8.88 - samples/sec: 9572.50 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:09:50,902 epoch 5 - iter 432/723 - loss 0.17234762 - time (sec): 10.70 - samples/sec: 9692.67 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:09:52,702 epoch 5 - iter 504/723 - loss 0.17119392 - time (sec): 12.50 - samples/sec: 9721.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:09:54,427 epoch 5 - iter 576/723 - loss 0.17011314 - time (sec): 14.23 - samples/sec: 9782.07 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:09:56,281 epoch 5 - iter 648/723 - loss 0.17258346 - time (sec): 16.08 - samples/sec: 9755.92 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:09:58,160 epoch 5 - iter 720/723 - loss 0.17140290 - time (sec): 17.96 - samples/sec: 9775.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:09:58,222 ----------------------------------------------------------------------------------------------------
2023-10-18 22:09:58,223 EPOCH 5 done: loss 0.1715 - lr: 0.000017
2023-10-18 22:09:59,986 DEV : loss 0.20946592092514038 - f1-score (micro avg) 0.4311
2023-10-18 22:10:00,001 ----------------------------------------------------------------------------------------------------
2023-10-18 22:10:01,737 epoch 6 - iter 72/723 - loss 0.15932427 - time (sec): 1.74 - samples/sec: 9819.08 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:10:03,525 epoch 6 - iter 144/723 - loss 0.16318147 - time (sec): 3.52 - samples/sec: 9748.74 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:10:05,353 epoch 6 - iter 216/723 - loss 0.17376298 - time (sec): 5.35 - samples/sec: 9741.66 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:10:07,055 epoch 6 - iter 288/723 - loss 0.17504093 - time (sec): 7.05 - samples/sec: 9699.27 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:10:08,797 epoch 6 - iter 360/723 - loss 0.16835316 - time (sec): 8.79 - samples/sec: 9839.54 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:10:10,614 epoch 6 - iter 432/723 - loss 0.16479689 - time (sec): 10.61 - samples/sec: 9759.78 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:10:12,459 epoch 6 - iter 504/723 - loss 0.16660146 - time (sec): 12.46 - samples/sec: 9848.56 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:10:14,270 epoch 6 - iter 576/723 - loss 0.16492361 - time (sec): 14.27 - samples/sec: 9864.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:10:16,409 epoch 6 - iter 648/723 - loss 0.16587764 - time (sec): 16.41 - samples/sec: 9696.86 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:10:18,149 epoch 6 - iter 720/723 - loss 0.16381985 - time (sec): 18.15 - samples/sec: 9670.91 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:10:18,222 ----------------------------------------------------------------------------------------------------
2023-10-18 22:10:18,222 EPOCH 6 done: loss 0.1633 - lr: 0.000013
2023-10-18 22:10:19,990 DEV : loss 0.20532798767089844 - f1-score (micro avg) 0.437
2023-10-18 22:10:20,004 ----------------------------------------------------------------------------------------------------
2023-10-18 22:10:21,747 epoch 7 - iter 72/723 - loss 0.15945249 - time (sec): 1.74 - samples/sec: 9697.69 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:10:23,545 epoch 7 - iter 144/723 - loss 0.15900563 - time (sec): 3.54 - samples/sec: 9956.07 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:10:25,277 epoch 7 - iter 216/723 - loss 0.15756915 - time (sec): 5.27 - samples/sec: 9983.42 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:10:27,035 epoch 7 - iter 288/723 - loss 0.16114798 - time (sec): 7.03 - samples/sec: 9914.44 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:10:28,803 epoch 7 - iter 360/723 - loss 0.15938604 - time (sec): 8.80 - samples/sec: 9841.78 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:10:30,596 epoch 7 - iter 432/723 - loss 0.15944278 - time (sec): 10.59 - samples/sec: 9940.46 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:10:32,312 epoch 7 - iter 504/723 - loss 0.15758147 - time (sec): 12.31 - samples/sec: 9961.91 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:10:34,029 epoch 7 - iter 576/723 - loss 0.15825257 - time (sec): 14.02 - samples/sec: 9904.92 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:10:35,849 epoch 7 - iter 648/723 - loss 0.15932809 - time (sec): 15.84 - samples/sec: 9917.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:10:37,706 epoch 7 - iter 720/723 - loss 0.15736768 - time (sec): 17.70 - samples/sec: 9919.43 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:10:37,771 ----------------------------------------------------------------------------------------------------
2023-10-18 22:10:37,772 EPOCH 7 done: loss 0.1571 - lr: 0.000010
2023-10-18 22:10:39,536 DEV : loss 0.20203644037246704 - f1-score (micro avg) 0.435
2023-10-18 22:10:39,551 ----------------------------------------------------------------------------------------------------
2023-10-18 22:10:41,250 epoch 8 - iter 72/723 - loss 0.14938503 - time (sec): 1.70 - samples/sec: 9489.78 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:10:43,030 epoch 8 - iter 144/723 - loss 0.17061855 - time (sec): 3.48 - samples/sec: 9818.98 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:10:44,814 epoch 8 - iter 216/723 - loss 0.15925070 - time (sec): 5.26 - samples/sec: 10070.94 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:10:46,559 epoch 8 - iter 288/723 - loss 0.15425474 - time (sec): 7.01 - samples/sec: 10052.28 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:10:48,290 epoch 8 - iter 360/723 - loss 0.15406306 - time (sec): 8.74 - samples/sec: 10093.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:10:50,448 epoch 8 - iter 432/723 - loss 0.14990555 - time (sec): 10.90 - samples/sec: 9787.11 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:10:52,136 epoch 8 - iter 504/723 - loss 0.14942113 - time (sec): 12.59 - samples/sec: 9789.64 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:10:53,938 epoch 8 - iter 576/723 - loss 0.14926547 - time (sec): 14.39 - samples/sec: 9816.21 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:10:55,695 epoch 8 - iter 648/723 - loss 0.15071738 - time (sec): 16.14 - samples/sec: 9794.35 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:10:57,488 epoch 8 - iter 720/723 - loss 0.15325254 - time (sec): 17.94 - samples/sec: 9801.83 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:10:57,544 ----------------------------------------------------------------------------------------------------
2023-10-18 22:10:57,544 EPOCH 8 done: loss 0.1530 - lr: 0.000007
2023-10-18 22:10:59,325 DEV : loss 0.1937786489725113 - f1-score (micro avg) 0.4817
2023-10-18 22:10:59,340 saving best model
2023-10-18 22:10:59,377 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:01,186 epoch 9 - iter 72/723 - loss 0.13945487 - time (sec): 1.81 - samples/sec: 10785.70 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:11:02,934 epoch 9 - iter 144/723 - loss 0.13316947 - time (sec): 3.56 - samples/sec: 10339.76 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:11:04,648 epoch 9 - iter 216/723 - loss 0.13722754 - time (sec): 5.27 - samples/sec: 10152.37 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:11:06,400 epoch 9 - iter 288/723 - loss 0.14328557 - time (sec): 7.02 - samples/sec: 10075.44 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:11:08,177 epoch 9 - iter 360/723 - loss 0.14615266 - time (sec): 8.80 - samples/sec: 10049.16 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:11:09,894 epoch 9 - iter 432/723 - loss 0.14984252 - time (sec): 10.52 - samples/sec: 9960.70 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:11:11,597 epoch 9 - iter 504/723 - loss 0.15143593 - time (sec): 12.22 - samples/sec: 9905.91 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:11:13,468 epoch 9 - iter 576/723 - loss 0.15073005 - time (sec): 14.09 - samples/sec: 9992.68 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:11:15,257 epoch 9 - iter 648/723 - loss 0.15148546 - time (sec): 15.88 - samples/sec: 9978.29 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:11:16,977 epoch 9 - iter 720/723 - loss 0.15040735 - time (sec): 17.60 - samples/sec: 9978.12 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:11:17,043 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:17,043 EPOCH 9 done: loss 0.1503 - lr: 0.000003
2023-10-18 22:11:18,810 DEV : loss 0.1968904435634613 - f1-score (micro avg) 0.4678
2023-10-18 22:11:18,825 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:20,567 epoch 10 - iter 72/723 - loss 0.13221108 - time (sec): 1.74 - samples/sec: 9813.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:11:22,334 epoch 10 - iter 144/723 - loss 0.15218809 - time (sec): 3.51 - samples/sec: 9666.30 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:11:24,466 epoch 10 - iter 216/723 - loss 0.14769919 - time (sec): 5.64 - samples/sec: 9253.33 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:11:26,295 epoch 10 - iter 288/723 - loss 0.15037349 - time (sec): 7.47 - samples/sec: 9296.52 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:11:28,116 epoch 10 - iter 360/723 - loss 0.15714138 - time (sec): 9.29 - samples/sec: 9508.97 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:11:29,886 epoch 10 - iter 432/723 - loss 0.15652764 - time (sec): 11.06 - samples/sec: 9531.39 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:11:31,644 epoch 10 - iter 504/723 - loss 0.15303564 - time (sec): 12.82 - samples/sec: 9644.01 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:11:33,414 epoch 10 - iter 576/723 - loss 0.15136324 - time (sec): 14.59 - samples/sec: 9663.06 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:11:35,141 epoch 10 - iter 648/723 - loss 0.14914005 - time (sec): 16.32 - samples/sec: 9668.13 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:11:36,887 epoch 10 - iter 720/723 - loss 0.15021989 - time (sec): 18.06 - samples/sec: 9722.40 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:11:36,951 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:36,951 EPOCH 10 done: loss 0.1504 - lr: 0.000000
2023-10-18 22:11:38,720 DEV : loss 0.19765476882457733 - f1-score (micro avg) 0.4656
2023-10-18 22:11:38,766 ----------------------------------------------------------------------------------------------------
2023-10-18 22:11:38,766 Loading model from best epoch ...
2023-10-18 22:11:38,851 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:11:40,203
Results:
- F-score (micro) 0.4758
- F-score (macro) 0.3261
- Accuracy 0.3258
By class:
              precision    recall  f1-score   support

         LOC     0.5020    0.5611    0.5299       458
         PER     0.6822    0.3340    0.4485       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.5588    0.4143    0.4758      1009
   macro avg     0.3947    0.2984    0.3261      1009
weighted avg     0.5537    0.4143    0.4548      1009
2023-10-18 22:11:40,203 ----------------------------------------------------------------------------------------------------
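[Editor's note] For reference, the best checkpoint (epoch 8 by dev micro-F1) saved as best-model.pt under the base path above can be loaded back for tagging. A minimal sketch with a made-up example sentence:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint selected on dev micro-F1 during the run above.
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Vincent van Gogh werd geboren in Zundert.")
tagger.predict(sentence)

# Entity spans are decoded from the 13 BIOES tags listed above
# (O plus S-/B-/E-/I- variants of LOC, PER and ORG).
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)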