2023-10-18 14:38:14,540 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,540 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:38:14,540 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,540 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:38:14,540 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,540 Train: 1100 sentences
2023-10-18 14:38:14,540 (train_with_dev=False, train_with_test=False)
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 Training Params:
2023-10-18 14:38:14,541 - learning_rate: "3e-05"
2023-10-18 14:38:14,541 - mini_batch_size: "8"
2023-10-18 14:38:14,541 - max_epochs: "10"
2023-10-18 14:38:14,541 - shuffle: "True"
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 Plugins:
2023-10-18 14:38:14,541 - TensorboardLogger
2023-10-18 14:38:14,541 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:38:14,541 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 Computation:
2023-10-18 14:38:14,541 - compute on device: cuda:0
2023-10-18 14:38:14,541 - embedding storage: none
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:14,541 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:38:14,848 epoch 1 - iter 13/138 - loss 3.42787215 - time (sec): 0.31 - samples/sec: 6987.95 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:38:15,162 epoch 1 - iter 26/138 - loss 3.48251826 - time (sec): 0.62 - samples/sec: 6967.48 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:15,452 epoch 1 - iter 39/138 - loss 3.45707520 - time (sec): 0.91 - samples/sec: 7358.72 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:38:15,773 epoch 1 - iter 52/138 - loss 3.40115221 - time (sec): 1.23 - samples/sec: 7204.58 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:38:16,061 epoch 1 - iter 65/138 - loss 3.33021941 - time (sec): 1.52 - samples/sec: 7112.47 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:38:16,357 epoch 1 - iter 78/138 - loss 3.22500797 - time (sec): 1.82 - samples/sec: 7222.11 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:38:16,650 epoch 1 - iter 91/138 - loss 3.12530247 - time (sec): 2.11 - samples/sec: 7265.17 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:38:16,930 epoch 1 - iter 104/138 - loss 2.99224979 - time (sec): 2.39 - samples/sec: 7299.44 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:38:17,214 epoch 1 - iter 117/138 - loss 2.86007327 - time (sec): 2.67 - samples/sec: 7325.59 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:38:17,497 epoch 1 - iter 130/138 - loss 2.73494472 - time (sec): 2.96 - samples/sec: 7314.81 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:38:17,666 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:17,666 EPOCH 1 done: loss 2.6554 - lr: 0.000028
2023-10-18 14:38:17,915 DEV : loss 0.9606797099113464 - f1-score (micro avg) 0.0
2023-10-18 14:38:17,919 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:18,204 epoch 2 - iter 13/138 - loss 1.28115198 - time (sec): 0.28 - samples/sec: 8112.17 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:38:18,481 epoch 2 - iter 26/138 - loss 1.22247167 - time (sec): 0.56 - samples/sec: 8217.08 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:38:18,751 epoch 2 - iter 39/138 - loss 1.17529456 - time (sec): 0.83 - samples/sec: 8114.55 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:38:19,030 epoch 2 - iter 52/138 - loss 1.19301558 - time (sec): 1.11 - samples/sec: 8079.87 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:38:19,311 epoch 2 - iter 65/138 - loss 1.15054120 - time (sec): 1.39 - samples/sec: 8019.54 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:38:19,583 epoch 2 - iter 78/138 - loss 1.08896784 - time (sec): 1.66 - samples/sec: 7931.71 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:38:19,853 epoch 2 - iter 91/138 - loss 1.06603856 - time (sec): 1.93 - samples/sec: 8000.84 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:38:20,127 epoch 2 - iter 104/138 - loss 1.04691388 - time (sec): 2.21 - samples/sec: 7997.04 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:38:20,399 epoch 2 - iter 117/138 - loss 1.02432136 - time (sec): 2.48 - samples/sec: 7882.77 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:38:20,673 epoch 2 - iter 130/138 - loss 1.00577462 - time (sec): 2.75 - samples/sec: 7873.38 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:38:20,848 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:20,848 EPOCH 2 done: loss 0.9953 - lr: 0.000027
2023-10-18 14:38:21,205 DEV : loss 0.7449945211410522 - f1-score (micro avg) 0.0
2023-10-18 14:38:21,209 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:21,477 epoch 3 - iter 13/138 - loss 0.76711907 - time (sec): 0.27 - samples/sec: 7577.44 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:38:21,747 epoch 3 - iter 26/138 - loss 0.79251460 - time (sec): 0.54 - samples/sec: 8039.92 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:38:22,025 epoch 3 - iter 39/138 - loss 0.80136781 - time (sec): 0.81 - samples/sec: 8007.48 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:38:22,298 epoch 3 - iter 52/138 - loss 0.81736818 - time (sec): 1.09 - samples/sec: 7967.11 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:38:22,575 epoch 3 - iter 65/138 - loss 0.82356278 - time (sec): 1.37 - samples/sec: 7890.39 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:38:22,849 epoch 3 - iter 78/138 - loss 0.79742494 - time (sec): 1.64 - samples/sec: 7934.41 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:38:23,126 epoch 3 - iter 91/138 - loss 0.78597508 - time (sec): 1.92 - samples/sec: 8009.42 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:38:23,388 epoch 3 - iter 104/138 - loss 0.78269442 - time (sec): 2.18 - samples/sec: 7921.74 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:38:23,673 epoch 3 - iter 117/138 - loss 0.78202263 - time (sec): 2.46 - samples/sec: 7945.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:38:23,954 epoch 3 - iter 130/138 - loss 0.77811493 - time (sec): 2.74 - samples/sec: 7871.90 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:38:24,118 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:24,119 EPOCH 3 done: loss 0.7888 - lr: 0.000024
2023-10-18 14:38:24,478 DEV : loss 0.6074901223182678 - f1-score (micro avg) 0.1916
2023-10-18 14:38:24,482 saving best model
2023-10-18 14:38:24,516 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:24,792 epoch 4 - iter 13/138 - loss 0.69111074 - time (sec): 0.28 - samples/sec: 7517.06 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:38:25,068 epoch 4 - iter 26/138 - loss 0.68967978 - time (sec): 0.55 - samples/sec: 7575.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:38:25,349 epoch 4 - iter 39/138 - loss 0.73796138 - time (sec): 0.83 - samples/sec: 7452.54 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:38:25,638 epoch 4 - iter 52/138 - loss 0.76281028 - time (sec): 1.12 - samples/sec: 7718.55 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:38:25,904 epoch 4 - iter 65/138 - loss 0.76142657 - time (sec): 1.39 - samples/sec: 7592.98 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:38:26,185 epoch 4 - iter 78/138 - loss 0.75061708 - time (sec): 1.67 - samples/sec: 7565.53 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:38:26,457 epoch 4 - iter 91/138 - loss 0.72383312 - time (sec): 1.94 - samples/sec: 7630.62 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:38:26,737 epoch 4 - iter 104/138 - loss 0.71866770 - time (sec): 2.22 - samples/sec: 7798.58 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:38:27,012 epoch 4 - iter 117/138 - loss 0.69964491 - time (sec): 2.50 - samples/sec: 7795.88 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:38:27,304 epoch 4 - iter 130/138 - loss 0.68823102 - time (sec): 2.79 - samples/sec: 7658.98 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:38:27,468 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:27,468 EPOCH 4 done: loss 0.6930 - lr: 0.000020
2023-10-18 14:38:27,946 DEV : loss 0.5647075772285461 - f1-score (micro avg) 0.2111
2023-10-18 14:38:27,950 saving best model
2023-10-18 14:38:27,993 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:28,281 epoch 5 - iter 13/138 - loss 0.64787288 - time (sec): 0.29 - samples/sec: 7293.55 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:38:28,558 epoch 5 - iter 26/138 - loss 0.63466903 - time (sec): 0.56 - samples/sec: 7449.34 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:38:28,849 epoch 5 - iter 39/138 - loss 0.64914643 - time (sec): 0.86 - samples/sec: 7544.42 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:38:29,135 epoch 5 - iter 52/138 - loss 0.64648724 - time (sec): 1.14 - samples/sec: 7655.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:38:29,409 epoch 5 - iter 65/138 - loss 0.63321568 - time (sec): 1.42 - samples/sec: 7525.80 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:38:29,683 epoch 5 - iter 78/138 - loss 0.63635691 - time (sec): 1.69 - samples/sec: 7620.68 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:38:29,961 epoch 5 - iter 91/138 - loss 0.63333452 - time (sec): 1.97 - samples/sec: 7583.05 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:38:30,250 epoch 5 - iter 104/138 - loss 0.63451704 - time (sec): 2.26 - samples/sec: 7616.08 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:38:30,523 epoch 5 - iter 117/138 - loss 0.64073674 - time (sec): 2.53 - samples/sec: 7716.98 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:38:30,804 epoch 5 - iter 130/138 - loss 0.63456250 - time (sec): 2.81 - samples/sec: 7713.56 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:38:30,961 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:30,961 EPOCH 5 done: loss 0.6263 - lr: 0.000017
2023-10-18 14:38:31,328 DEV : loss 0.48392239212989807 - f1-score (micro avg) 0.21
2023-10-18 14:38:31,332 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:31,617 epoch 6 - iter 13/138 - loss 0.48725556 - time (sec): 0.28 - samples/sec: 7742.71 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:38:31,890 epoch 6 - iter 26/138 - loss 0.56195688 - time (sec): 0.56 - samples/sec: 7511.27 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:38:32,160 epoch 6 - iter 39/138 - loss 0.53655390 - time (sec): 0.83 - samples/sec: 7723.74 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:38:32,442 epoch 6 - iter 52/138 - loss 0.56449056 - time (sec): 1.11 - samples/sec: 7644.72 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:38:32,731 epoch 6 - iter 65/138 - loss 0.56395550 - time (sec): 1.40 - samples/sec: 7771.83 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:38:33,027 epoch 6 - iter 78/138 - loss 0.55574844 - time (sec): 1.69 - samples/sec: 7734.52 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:38:33,308 epoch 6 - iter 91/138 - loss 0.55629215 - time (sec): 1.98 - samples/sec: 7726.92 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:38:33,594 epoch 6 - iter 104/138 - loss 0.56090428 - time (sec): 2.26 - samples/sec: 7659.56 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:38:33,870 epoch 6 - iter 117/138 - loss 0.57315331 - time (sec): 2.54 - samples/sec: 7722.16 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:38:34,148 epoch 6 - iter 130/138 - loss 0.57638650 - time (sec): 2.82 - samples/sec: 7720.57 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:38:34,309 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:34,309 EPOCH 6 done: loss 0.5776 - lr: 0.000014
2023-10-18 14:38:34,668 DEV : loss 0.4442043900489807 - f1-score (micro avg) 0.2294
2023-10-18 14:38:34,672 saving best model
2023-10-18 14:38:34,705 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:34,992 epoch 7 - iter 13/138 - loss 0.69764514 - time (sec): 0.29 - samples/sec: 8248.57 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:38:35,270 epoch 7 - iter 26/138 - loss 0.60658754 - time (sec): 0.56 - samples/sec: 7754.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:38:35,546 epoch 7 - iter 39/138 - loss 0.59882902 - time (sec): 0.84 - samples/sec: 7708.27 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:38:35,840 epoch 7 - iter 52/138 - loss 0.58612153 - time (sec): 1.13 - samples/sec: 7617.98 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:38:36,130 epoch 7 - iter 65/138 - loss 0.58244181 - time (sec): 1.42 - samples/sec: 7586.58 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:38:36,402 epoch 7 - iter 78/138 - loss 0.59223038 - time (sec): 1.70 - samples/sec: 7568.04 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:38:36,680 epoch 7 - iter 91/138 - loss 0.59171737 - time (sec): 1.97 - samples/sec: 7658.35 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:38:36,950 epoch 7 - iter 104/138 - loss 0.57494005 - time (sec): 2.24 - samples/sec: 7548.89 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:38:37,228 epoch 7 - iter 117/138 - loss 0.56569790 - time (sec): 2.52 - samples/sec: 7610.53 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:38:37,510 epoch 7 - iter 130/138 - loss 0.56322947 - time (sec): 2.80 - samples/sec: 7668.91 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:38:37,679 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:37,679 EPOCH 7 done: loss 0.5532 - lr: 0.000010
2023-10-18 14:38:38,043 DEV : loss 0.43072161078453064 - f1-score (micro avg) 0.2346
2023-10-18 14:38:38,047 saving best model
2023-10-18 14:38:38,079 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:38,348 epoch 8 - iter 13/138 - loss 0.57153390 - time (sec): 0.27 - samples/sec: 8201.63 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:38:38,630 epoch 8 - iter 26/138 - loss 0.53540769 - time (sec): 0.55 - samples/sec: 8186.02 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:38,894 epoch 8 - iter 39/138 - loss 0.53884416 - time (sec): 0.81 - samples/sec: 8111.99 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:39,178 epoch 8 - iter 52/138 - loss 0.53857478 - time (sec): 1.10 - samples/sec: 8224.36 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:39,461 epoch 8 - iter 65/138 - loss 0.52192734 - time (sec): 1.38 - samples/sec: 8146.33 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:38:39,754 epoch 8 - iter 78/138 - loss 0.51102128 - time (sec): 1.67 - samples/sec: 8028.46 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:38:40,024 epoch 8 - iter 91/138 - loss 0.52513268 - time (sec): 1.94 - samples/sec: 7901.44 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:38:40,298 epoch 8 - iter 104/138 - loss 0.52277626 - time (sec): 2.22 - samples/sec: 7781.01 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:38:40,591 epoch 8 - iter 117/138 - loss 0.53243215 - time (sec): 2.51 - samples/sec: 7743.90 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:38:40,865 epoch 8 - iter 130/138 - loss 0.52420580 - time (sec): 2.79 - samples/sec: 7764.33 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:38:41,047 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:41,048 EPOCH 8 done: loss 0.5247 - lr: 0.000007
2023-10-18 14:38:41,423 DEV : loss 0.41292256116867065 - f1-score (micro avg) 0.2802
2023-10-18 14:38:41,427 saving best model
2023-10-18 14:38:41,460 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:41,755 epoch 9 - iter 13/138 - loss 0.54045131 - time (sec): 0.29 - samples/sec: 6805.70 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:38:42,030 epoch 9 - iter 26/138 - loss 0.56127177 - time (sec): 0.57 - samples/sec: 7432.30 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:38:42,309 epoch 9 - iter 39/138 - loss 0.53449289 - time (sec): 0.85 - samples/sec: 7488.30 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:38:42,574 epoch 9 - iter 52/138 - loss 0.56653102 - time (sec): 1.11 - samples/sec: 7537.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:42,858 epoch 9 - iter 65/138 - loss 0.55371709 - time (sec): 1.40 - samples/sec: 7683.74 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:43,137 epoch 9 - iter 78/138 - loss 0.56519871 - time (sec): 1.68 - samples/sec: 7687.46 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:43,408 epoch 9 - iter 91/138 - loss 0.54495227 - time (sec): 1.95 - samples/sec: 7770.18 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:38:43,673 epoch 9 - iter 104/138 - loss 0.54330485 - time (sec): 2.21 - samples/sec: 7790.25 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:38:43,955 epoch 9 - iter 117/138 - loss 0.52681779 - time (sec): 2.49 - samples/sec: 7844.58 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:38:44,226 epoch 9 - iter 130/138 - loss 0.52333961 - time (sec): 2.77 - samples/sec: 7833.96 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:38:44,385 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:44,385 EPOCH 9 done: loss 0.5205 - lr: 0.000004
2023-10-18 14:38:44,768 DEV : loss 0.40466219186782837 - f1-score (micro avg) 0.3118
2023-10-18 14:38:44,774 saving best model
2023-10-18 14:38:44,814 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:45,136 epoch 10 - iter 13/138 - loss 0.45236464 - time (sec): 0.32 - samples/sec: 6698.15 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:38:45,429 epoch 10 - iter 26/138 - loss 0.47710264 - time (sec): 0.61 - samples/sec: 7512.50 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:38:45,695 epoch 10 - iter 39/138 - loss 0.49120506 - time (sec): 0.88 - samples/sec: 7463.97 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:38:45,978 epoch 10 - iter 52/138 - loss 0.49226869 - time (sec): 1.16 - samples/sec: 7361.35 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:38:46,258 epoch 10 - iter 65/138 - loss 0.49369741 - time (sec): 1.44 - samples/sec: 7360.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:38:46,545 epoch 10 - iter 78/138 - loss 0.49007079 - time (sec): 1.73 - samples/sec: 7356.63 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:38:46,839 epoch 10 - iter 91/138 - loss 0.48900916 - time (sec): 2.02 - samples/sec: 7475.96 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:38:47,132 epoch 10 - iter 104/138 - loss 0.50180782 - time (sec): 2.32 - samples/sec: 7478.18 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:38:47,401 epoch 10 - iter 117/138 - loss 0.50564597 - time (sec): 2.59 - samples/sec: 7498.13 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:38:47,685 epoch 10 - iter 130/138 - loss 0.51106164 - time (sec): 2.87 - samples/sec: 7487.58 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:38:47,850 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:47,850 EPOCH 10 done: loss 0.5089 - lr: 0.000000
2023-10-18 14:38:48,216 DEV : loss 0.4004524350166321 - f1-score (micro avg) 0.3173
2023-10-18 14:38:48,220 saving best model
2023-10-18 14:38:48,285 ----------------------------------------------------------------------------------------------------
2023-10-18 14:38:48,286 Loading model from best epoch ...
2023-10-18 14:38:48,365 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
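The 25-tag dictionary is the O tag plus S/B/E/I variants for each of the six entity types (scope, pers, work, loc, object, date), i.e. a BIOES tagging scheme. A minimal decoder sketch (a hypothetical helper, not Flair code) showing how such tag sequences map back to entity spans:

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) token spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = label = None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((lab, i, i))
            start = label = None
        elif prefix == "B":                     # entity begins
            start, label = i, lab
        elif prefix == "E" and label == lab:    # entity ends; emit span
            spans.append((lab, start, i))
            start = label = None
    return spans
```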
2023-10-18 14:38:48,653
Results:
- F-score (micro) 0.414
- F-score (macro) 0.2141
- Accuracy 0.262
By class:
                 precision    recall  f1-score   support

        scope      0.5795    0.5795    0.5795       176
         work      0.1531    0.2027    0.1744        74
         pers      0.8333    0.1953    0.3165       128
       object      0.0000    0.0000    0.0000         2
          loc      0.0000    0.0000    0.0000         2

    micro avg      0.4671    0.3717    0.4140       382
    macro avg      0.3132    0.1955    0.2141       382
 weighted avg      0.5759    0.3717    0.4068       382
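The aggregate scores follow from the per-class counts. Reconstructing true positives and predictions from the table (tp = recall x support, predictions = tp / precision) gives 142 correct spans out of 304 predicted against 382 gold spans, which reproduces the reported micro scores; macro F1 is the unweighted mean of the five per-class F1 values. A quick sanity check (counts below are reconstructed, not read from the log):

```python
# (true positives, predicted spans, gold support) per class, reconstructed
per_class = {
    "scope":  (102, 176, 176),
    "work":   (15,  98,  74),
    "pers":   (25,  30, 128),
    "object": (0,   0,   2),
    "loc":    (0,   0,   2),
}
tp = sum(c[0] for c in per_class.values())
pred = sum(c[1] for c in per_class.values())
gold = sum(c[2] for c in per_class.values())
precision, recall = tp / pred, tp / gold
micro_f1 = 2 * precision * recall / (precision + recall)
macro_f1 = sum([0.5795, 0.1744, 0.3165, 0.0, 0.0]) / 5
```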
2023-10-18 14:38:48,653 ----------------------------------------------------------------------------------------------------