2023-10-18 20:21:09,554 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Train: 7936 sentences
2023-10-18 20:21:09,555 (train_with_dev=False, train_with_test=False)
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Training Params:
2023-10-18 20:21:09,555 - learning_rate: "3e-05"
2023-10-18 20:21:09,555 - mini_batch_size: "4"
2023-10-18 20:21:09,555 - max_epochs: "10"
2023-10-18 20:21:09,555 - shuffle: "True"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Plugins:
2023-10-18 20:21:09,555 - TensorboardLogger
2023-10-18 20:21:09,555 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:21:09,555 - metric: "('micro avg', 'f1-score')"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Computation:
2023-10-18 20:21:09,555 - compute on device: cuda:0
2023-10-18 20:21:09,555 - embedding storage: none
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:21:13,160 epoch 1 - iter 198/1984 - loss 3.19239065 - time (sec): 3.60 - samples/sec: 4520.00 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:21:16,212 epoch 1 - iter 396/1984 - loss 2.81648826 - time (sec): 6.66 - samples/sec: 4879.36 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:21:19,275 epoch 1 - iter 594/1984 - loss 2.28737400 - time (sec): 9.72 - samples/sec: 5062.60 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:21:22,346 epoch 1 - iter 792/1984 - loss 1.86592392 - time (sec): 12.79 - samples/sec: 5140.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:21:25,305 epoch 1 - iter 990/1984 - loss 1.59866230 - time (sec): 15.75 - samples/sec: 5171.07 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:21:28,333 epoch 1 - iter 1188/1984 - loss 1.40641142 - time (sec): 18.78 - samples/sec: 5213.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:21:31,352 epoch 1 - iter 1386/1984 - loss 1.26714919 - time (sec): 21.80 - samples/sec: 5230.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:21:34,376 epoch 1 - iter 1584/1984 - loss 1.15707396 - time (sec): 24.82 - samples/sec: 5276.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:21:37,250 epoch 1 - iter 1782/1984 - loss 1.07044664 - time (sec): 27.69 - samples/sec: 5313.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:21:40,102 epoch 1 - iter 1980/1984 - loss 0.99619007 - time (sec): 30.55 - samples/sec: 5356.70 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:21:40,161 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:40,161 EPOCH 1 done: loss 0.9947 - lr: 0.000030
2023-10-18 20:21:41,606 DEV : loss 0.23625816404819489 - f1-score (micro avg) 0.2282
2023-10-18 20:21:41,623 saving best model
2023-10-18 20:21:41,659 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:44,713 epoch 2 - iter 198/1984 - loss 0.34878745 - time (sec): 3.05 - samples/sec: 5568.75 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:21:47,768 epoch 2 - iter 396/1984 - loss 0.32539556 - time (sec): 6.11 - samples/sec: 5539.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:51,158 epoch 2 - iter 594/1984 - loss 0.31675399 - time (sec): 9.50 - samples/sec: 5273.19 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:54,182 epoch 2 - iter 792/1984 - loss 0.30310641 - time (sec): 12.52 - samples/sec: 5269.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:57,250 epoch 2 - iter 990/1984 - loss 0.29671952 - time (sec): 15.59 - samples/sec: 5326.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:00,316 epoch 2 - iter 1188/1984 - loss 0.29371486 - time (sec): 18.66 - samples/sec: 5332.97 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:03,364 epoch 2 - iter 1386/1984 - loss 0.28559705 - time (sec): 21.70 - samples/sec: 5357.01 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:06,430 epoch 2 - iter 1584/1984 - loss 0.28450858 - time (sec): 24.77 - samples/sec: 5353.75 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:09,535 epoch 2 - iter 1782/1984 - loss 0.28203045 - time (sec): 27.87 - samples/sec: 5316.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:12,574 epoch 2 - iter 1980/1984 - loss 0.27933585 - time (sec): 30.91 - samples/sec: 5293.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:12,634 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:12,634 EPOCH 2 done: loss 0.2791 - lr: 0.000027
2023-10-18 20:22:14,454 DEV : loss 0.17796094715595245 - f1-score (micro avg) 0.3782
2023-10-18 20:22:14,472 saving best model
2023-10-18 20:22:14,505 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:17,635 epoch 3 - iter 198/1984 - loss 0.21732311 - time (sec): 3.13 - samples/sec: 5209.34 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:20,674 epoch 3 - iter 396/1984 - loss 0.21382431 - time (sec): 6.17 - samples/sec: 5318.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:23,712 epoch 3 - iter 594/1984 - loss 0.23635522 - time (sec): 9.21 - samples/sec: 5292.83 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:26,797 epoch 3 - iter 792/1984 - loss 0.23384899 - time (sec): 12.29 - samples/sec: 5342.49 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:29,630 epoch 3 - iter 990/1984 - loss 0.23319557 - time (sec): 15.12 - samples/sec: 5373.24 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:32,703 epoch 3 - iter 1188/1984 - loss 0.23261618 - time (sec): 18.20 - samples/sec: 5391.85 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:35,702 epoch 3 - iter 1386/1984 - loss 0.23393194 - time (sec): 21.20 - samples/sec: 5382.13 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:38,761 epoch 3 - iter 1584/1984 - loss 0.23335714 - time (sec): 24.26 - samples/sec: 5394.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:41,803 epoch 3 - iter 1782/1984 - loss 0.23015129 - time (sec): 27.30 - samples/sec: 5396.62 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:44,850 epoch 3 - iter 1980/1984 - loss 0.22849262 - time (sec): 30.34 - samples/sec: 5389.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:44,920 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:44,920 EPOCH 3 done: loss 0.2286 - lr: 0.000023
2023-10-18 20:22:46,719 DEV : loss 0.15816588699817657 - f1-score (micro avg) 0.446
2023-10-18 20:22:46,736 saving best model
2023-10-18 20:22:46,766 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:49,823 epoch 4 - iter 198/1984 - loss 0.21892155 - time (sec): 3.06 - samples/sec: 5353.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:52,844 epoch 4 - iter 396/1984 - loss 0.22023973 - time (sec): 6.08 - samples/sec: 5341.75 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:55,849 epoch 4 - iter 594/1984 - loss 0.21519154 - time (sec): 9.08 - samples/sec: 5280.36 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:22:58,885 epoch 4 - iter 792/1984 - loss 0.21938168 - time (sec): 12.12 - samples/sec: 5231.69 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:23:01,883 epoch 4 - iter 990/1984 - loss 0.21550684 - time (sec): 15.12 - samples/sec: 5252.24 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:23:04,896 epoch 4 - iter 1188/1984 - loss 0.20933788 - time (sec): 18.13 - samples/sec: 5284.31 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:07,904 epoch 4 - iter 1386/1984 - loss 0.20913435 - time (sec): 21.14 - samples/sec: 5366.26 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:10,953 epoch 4 - iter 1584/1984 - loss 0.20611639 - time (sec): 24.19 - samples/sec: 5381.99 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:13,935 epoch 4 - iter 1782/1984 - loss 0.20643644 - time (sec): 27.17 - samples/sec: 5373.80 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:16,968 epoch 4 - iter 1980/1984 - loss 0.20253333 - time (sec): 30.20 - samples/sec: 5418.56 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:17,030 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:17,030 EPOCH 4 done: loss 0.2025 - lr: 0.000020
2023-10-18 20:23:18,839 DEV : loss 0.15379515290260315 - f1-score (micro avg) 0.5326
2023-10-18 20:23:18,856 saving best model
2023-10-18 20:23:18,889 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:21,899 epoch 5 - iter 198/1984 - loss 0.23060341 - time (sec): 3.01 - samples/sec: 4955.51 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:25,011 epoch 5 - iter 396/1984 - loss 0.19610837 - time (sec): 6.12 - samples/sec: 5314.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:28,034 epoch 5 - iter 594/1984 - loss 0.19599719 - time (sec): 9.14 - samples/sec: 5362.28 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:31,070 epoch 5 - iter 792/1984 - loss 0.19167242 - time (sec): 12.18 - samples/sec: 5419.94 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:34,098 epoch 5 - iter 990/1984 - loss 0.18849708 - time (sec): 15.21 - samples/sec: 5363.90 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:37,129 epoch 5 - iter 1188/1984 - loss 0.18868575 - time (sec): 18.24 - samples/sec: 5392.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:40,183 epoch 5 - iter 1386/1984 - loss 0.18996251 - time (sec): 21.29 - samples/sec: 5399.95 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:43,230 epoch 5 - iter 1584/1984 - loss 0.18960056 - time (sec): 24.34 - samples/sec: 5407.03 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:46,294 epoch 5 - iter 1782/1984 - loss 0.18960480 - time (sec): 27.40 - samples/sec: 5392.96 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:49,245 epoch 5 - iter 1980/1984 - loss 0.18902609 - time (sec): 30.35 - samples/sec: 5390.97 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:49,308 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:49,308 EPOCH 5 done: loss 0.1890 - lr: 0.000017
2023-10-18 20:23:51,124 DEV : loss 0.14417240023612976 - f1-score (micro avg) 0.5654
2023-10-18 20:23:51,141 saving best model
2023-10-18 20:23:51,174 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:54,192 epoch 6 - iter 198/1984 - loss 0.20691944 - time (sec): 3.02 - samples/sec: 5108.94 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:23:57,214 epoch 6 - iter 396/1984 - loss 0.18917630 - time (sec): 6.04 - samples/sec: 5249.12 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:24:00,217 epoch 6 - iter 594/1984 - loss 0.18665542 - time (sec): 9.04 - samples/sec: 5247.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:24:03,320 epoch 6 - iter 792/1984 - loss 0.18917319 - time (sec): 12.15 - samples/sec: 5235.56 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:06,415 epoch 6 - iter 990/1984 - loss 0.18919754 - time (sec): 15.24 - samples/sec: 5275.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:09,474 epoch 6 - iter 1188/1984 - loss 0.18272315 - time (sec): 18.30 - samples/sec: 5320.83 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:12,446 epoch 6 - iter 1386/1984 - loss 0.17852437 - time (sec): 21.27 - samples/sec: 5378.84 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:15,144 epoch 6 - iter 1584/1984 - loss 0.17782946 - time (sec): 23.97 - samples/sec: 5458.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:18,117 epoch 6 - iter 1782/1984 - loss 0.17992881 - time (sec): 26.94 - samples/sec: 5465.89 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:21,064 epoch 6 - iter 1980/1984 - loss 0.17716357 - time (sec): 29.89 - samples/sec: 5475.25 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:21,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:21,123 EPOCH 6 done: loss 0.1771 - lr: 0.000013
2023-10-18 20:24:22,948 DEV : loss 0.14219804108142853 - f1-score (micro avg) 0.5755
2023-10-18 20:24:22,965 saving best model
2023-10-18 20:24:22,998 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:26,033 epoch 7 - iter 198/1984 - loss 0.21070221 - time (sec): 3.03 - samples/sec: 5279.32 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:28,996 epoch 7 - iter 396/1984 - loss 0.18898161 - time (sec): 6.00 - samples/sec: 5395.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:32,004 epoch 7 - iter 594/1984 - loss 0.18467547 - time (sec): 9.00 - samples/sec: 5381.83 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:35,059 epoch 7 - iter 792/1984 - loss 0.17499372 - time (sec): 12.06 - samples/sec: 5446.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:38,049 epoch 7 - iter 990/1984 - loss 0.17390168 - time (sec): 15.05 - samples/sec: 5481.36 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:41,030 epoch 7 - iter 1188/1984 - loss 0.17296841 - time (sec): 18.03 - samples/sec: 5453.01 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:44,215 epoch 7 - iter 1386/1984 - loss 0.16940126 - time (sec): 21.22 - samples/sec: 5422.00 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:47,216 epoch 7 - iter 1584/1984 - loss 0.16927956 - time (sec): 24.22 - samples/sec: 5404.10 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:50,282 epoch 7 - iter 1782/1984 - loss 0.16921141 - time (sec): 27.28 - samples/sec: 5385.26 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:24:53,363 epoch 7 - iter 1980/1984 - loss 0.16921911 - time (sec): 30.36 - samples/sec: 5393.77 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:24:53,427 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:53,427 EPOCH 7 done: loss 0.1692 - lr: 0.000010
2023-10-18 20:24:55,548 DEV : loss 0.1411110907793045 - f1-score (micro avg) 0.5873
2023-10-18 20:24:55,564 saving best model
2023-10-18 20:24:55,599 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:58,631 epoch 8 - iter 198/1984 - loss 0.16894551 - time (sec): 3.03 - samples/sec: 5308.14 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:25:01,643 epoch 8 - iter 396/1984 - loss 0.16387924 - time (sec): 6.04 - samples/sec: 5258.74 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:04,674 epoch 8 - iter 594/1984 - loss 0.16514882 - time (sec): 9.07 - samples/sec: 5202.23 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:07,774 epoch 8 - iter 792/1984 - loss 0.16780131 - time (sec): 12.18 - samples/sec: 5274.47 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:10,830 epoch 8 - iter 990/1984 - loss 0.16733526 - time (sec): 15.23 - samples/sec: 5256.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:13,904 epoch 8 - iter 1188/1984 - loss 0.16373840 - time (sec): 18.30 - samples/sec: 5340.86 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:16,887 epoch 8 - iter 1386/1984 - loss 0.16181785 - time (sec): 21.29 - samples/sec: 5329.93 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:19,959 epoch 8 - iter 1584/1984 - loss 0.16227133 - time (sec): 24.36 - samples/sec: 5355.25 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:23,008 epoch 8 - iter 1782/1984 - loss 0.16335469 - time (sec): 27.41 - samples/sec: 5361.81 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:26,049 epoch 8 - iter 1980/1984 - loss 0.16311903 - time (sec): 30.45 - samples/sec: 5376.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:26,105 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:26,105 EPOCH 8 done: loss 0.1630 - lr: 0.000007
2023-10-18 20:25:27,913 DEV : loss 0.14169389009475708 - f1-score (micro avg) 0.6026
2023-10-18 20:25:27,932 saving best model
2023-10-18 20:25:27,966 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:31,010 epoch 9 - iter 198/1984 - loss 0.14872512 - time (sec): 3.04 - samples/sec: 5336.13 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:34,053 epoch 9 - iter 396/1984 - loss 0.16471370 - time (sec): 6.09 - samples/sec: 5390.47 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:37,062 epoch 9 - iter 594/1984 - loss 0.16359773 - time (sec): 9.10 - samples/sec: 5307.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:40,148 epoch 9 - iter 792/1984 - loss 0.16213124 - time (sec): 12.18 - samples/sec: 5246.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:43,182 epoch 9 - iter 990/1984 - loss 0.16071329 - time (sec): 15.21 - samples/sec: 5310.46 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:46,211 epoch 9 - iter 1188/1984 - loss 0.16040075 - time (sec): 18.24 - samples/sec: 5307.42 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:49,259 epoch 9 - iter 1386/1984 - loss 0.15955613 - time (sec): 21.29 - samples/sec: 5367.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:52,294 epoch 9 - iter 1584/1984 - loss 0.16153404 - time (sec): 24.33 - samples/sec: 5350.28 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:55,337 epoch 9 - iter 1782/1984 - loss 0.16026350 - time (sec): 27.37 - samples/sec: 5356.94 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:58,420 epoch 9 - iter 1980/1984 - loss 0.16022030 - time (sec): 30.45 - samples/sec: 5375.18 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:25:58,483 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:58,483 EPOCH 9 done: loss 0.1602 - lr: 0.000003
2023-10-18 20:26:00,280 DEV : loss 0.14077164232730865 - f1-score (micro avg) 0.5999
2023-10-18 20:26:00,296 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:03,366 epoch 10 - iter 198/1984 - loss 0.15429908 - time (sec): 3.07 - samples/sec: 5117.20 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:26:06,590 epoch 10 - iter 396/1984 - loss 0.16153646 - time (sec): 6.29 - samples/sec: 5099.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:26:09,606 epoch 10 - iter 594/1984 - loss 0.16009665 - time (sec): 9.31 - samples/sec: 5150.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:12,646 epoch 10 - iter 792/1984 - loss 0.15349866 - time (sec): 12.35 - samples/sec: 5233.90 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:15,665 epoch 10 - iter 990/1984 - loss 0.15500561 - time (sec): 15.37 - samples/sec: 5272.80 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:18,672 epoch 10 - iter 1188/1984 - loss 0.15624539 - time (sec): 18.37 - samples/sec: 5268.47 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:21,752 epoch 10 - iter 1386/1984 - loss 0.15463573 - time (sec): 21.46 - samples/sec: 5328.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:24,848 epoch 10 - iter 1584/1984 - loss 0.15358641 - time (sec): 24.55 - samples/sec: 5300.75 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:27,969 epoch 10 - iter 1782/1984 - loss 0.15337495 - time (sec): 27.67 - samples/sec: 5306.87 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:26:30,832 epoch 10 - iter 1980/1984 - loss 0.15484121 - time (sec): 30.54 - samples/sec: 5357.87 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:26:30,891 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:30,891 EPOCH 10 done: loss 0.1550 - lr: 0.000000
2023-10-18 20:26:32,699 DEV : loss 0.13979580998420715 - f1-score (micro avg) 0.6034
2023-10-18 20:26:32,718 saving best model
2023-10-18 20:26:32,781 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:32,782 Loading model from best epoch ...
2023-10-18 20:26:32,869 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:26:34,366
Results:
- F-score (micro) 0.6068
- F-score (macro) 0.436
- Accuracy 0.4819
By class:
              precision    recall  f1-score   support
         LOC     0.7375    0.6992    0.7179       655
         PER     0.4011    0.6547    0.4974       223
         ORG     0.2917    0.0551    0.0927       127
   micro avg     0.6056    0.6080    0.6068      1005
   macro avg     0.4768    0.4697    0.4360      1005
weighted avg     0.6065    0.6080    0.5900      1005
2023-10-18 20:26:34,366 ----------------------------------------------------------------------------------------------------