2023-10-18 19:16:10,441 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,441 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 19:16:10,441 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 Train: 5901 sentences
2023-10-18 19:16:10,442 (train_with_dev=False, train_with_test=False)
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 Training Params:
2023-10-18 19:16:10,442 - learning_rate: "5e-05"
2023-10-18 19:16:10,442 - mini_batch_size: "8"
2023-10-18 19:16:10,442 - max_epochs: "10"
2023-10-18 19:16:10,442 - shuffle: "True"
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 Plugins:
2023-10-18 19:16:10,442 - TensorboardLogger
2023-10-18 19:16:10,442 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
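The `lr` column in the per-iteration lines below is consistent with a linear warmup followed by a linear decay to zero. The following is a minimal sketch of such a schedule, not Flair's actual `LinearScheduler` code. The function name `linear_warmup_decay_lr` and the assumption that the logged value corresponds to the step count at the time of logging are illustrative. The constants (5901 training sentences, batch size 8, 10 epochs, peak learning rate 5e-05, warmup fraction 0.1) are taken from this log.

```python
import math


def linear_warmup_decay_lr(step, peak_lr, total_steps, warmup_fraction):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values from the log header: 5901 sentences, mini_batch_size 8, max_epochs 10.
steps_per_epoch = math.ceil(5901 / 8)   # 738 iterations per epoch
total_steps = steps_per_epoch * 10      # 7380 optimizer steps in total

for epoch, iteration in [(1, 73), (1, 730), (2, 73), (10, 730)]:
    step = (epoch - 1) * steps_per_epoch + iteration
    lr = linear_warmup_decay_lr(step, 5e-5, total_steps, 0.1)
    print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {lr:.6f}")
```

Under these assumptions the warmup covers exactly the first epoch (738 of 7380 steps), which is why the logged learning rate peaks near 0.000049 at the end of epoch 1 and falls afterwards.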
2023-10-18 19:16:10,442 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 19:16:10,442 - metric: "('micro avg', 'f1-score')"
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 Computation:
2023-10-18 19:16:10,442 - compute on device: cuda:0
2023-10-18 19:16:10,442 - embedding storage: none
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:10,442 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 19:16:12,152 epoch 1 - iter 73/738 - loss 3.07586690 - time (sec): 1.71 - samples/sec: 8803.91 - lr: 0.000005 - momentum: 0.000000
2023-10-18 19:16:14,385 epoch 1 - iter 146/738 - loss 2.77274838 - time (sec): 3.94 - samples/sec: 8359.91 - lr: 0.000010 - momentum: 0.000000
2023-10-18 19:16:16,258 epoch 1 - iter 219/738 - loss 2.37894957 - time (sec): 5.82 - samples/sec: 8573.50 - lr: 0.000015 - momentum: 0.000000
2023-10-18 19:16:18,002 epoch 1 - iter 292/738 - loss 2.02832146 - time (sec): 7.56 - samples/sec: 8683.64 - lr: 0.000020 - momentum: 0.000000
2023-10-18 19:16:19,668 epoch 1 - iter 365/738 - loss 1.77711491 - time (sec): 9.23 - samples/sec: 8794.98 - lr: 0.000025 - momentum: 0.000000
2023-10-18 19:16:21,343 epoch 1 - iter 438/738 - loss 1.59379009 - time (sec): 10.90 - samples/sec: 8915.26 - lr: 0.000030 - momentum: 0.000000
2023-10-18 19:16:23,048 epoch 1 - iter 511/738 - loss 1.46113750 - time (sec): 12.60 - samples/sec: 8930.48 - lr: 0.000035 - momentum: 0.000000
2023-10-18 19:16:24,891 epoch 1 - iter 584/738 - loss 1.33512940 - time (sec): 14.45 - samples/sec: 9118.78 - lr: 0.000039 - momentum: 0.000000
2023-10-18 19:16:26,602 epoch 1 - iter 657/738 - loss 1.24694827 - time (sec): 16.16 - samples/sec: 9194.50 - lr: 0.000044 - momentum: 0.000000
2023-10-18 19:16:28,278 epoch 1 - iter 730/738 - loss 1.18415498 - time (sec): 17.84 - samples/sec: 9146.63 - lr: 0.000049 - momentum: 0.000000
2023-10-18 19:16:28,542 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:28,542 EPOCH 1 done: loss 1.1698 - lr: 0.000049
2023-10-18 19:16:31,032 DEV : loss 0.3898785710334778 - f1-score (micro avg) 0.151
2023-10-18 19:16:31,060 saving best model
2023-10-18 19:16:31,087 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:32,754 epoch 2 - iter 73/738 - loss 0.44379610 - time (sec): 1.67 - samples/sec: 8957.60 - lr: 0.000049 - momentum: 0.000000
2023-10-18 19:16:34,463 epoch 2 - iter 146/738 - loss 0.45149541 - time (sec): 3.38 - samples/sec: 9164.70 - lr: 0.000049 - momentum: 0.000000
2023-10-18 19:16:36,377 epoch 2 - iter 219/738 - loss 0.44816036 - time (sec): 5.29 - samples/sec: 9320.47 - lr: 0.000048 - momentum: 0.000000
2023-10-18 19:16:38,107 epoch 2 - iter 292/738 - loss 0.44332964 - time (sec): 7.02 - samples/sec: 9149.40 - lr: 0.000048 - momentum: 0.000000
2023-10-18 19:16:39,802 epoch 2 - iter 365/738 - loss 0.44861383 - time (sec): 8.71 - samples/sec: 9231.97 - lr: 0.000047 - momentum: 0.000000
2023-10-18 19:16:41,523 epoch 2 - iter 438/738 - loss 0.44417937 - time (sec): 10.44 - samples/sec: 9186.53 - lr: 0.000047 - momentum: 0.000000
2023-10-18 19:16:43,310 epoch 2 - iter 511/738 - loss 0.43943622 - time (sec): 12.22 - samples/sec: 9272.15 - lr: 0.000046 - momentum: 0.000000
2023-10-18 19:16:45,089 epoch 2 - iter 584/738 - loss 0.43496346 - time (sec): 14.00 - samples/sec: 9296.38 - lr: 0.000046 - momentum: 0.000000
2023-10-18 19:16:47,412 epoch 2 - iter 657/738 - loss 0.42916078 - time (sec): 16.32 - samples/sec: 9104.09 - lr: 0.000045 - momentum: 0.000000
2023-10-18 19:16:49,139 epoch 2 - iter 730/738 - loss 0.42612713 - time (sec): 18.05 - samples/sec: 9107.29 - lr: 0.000045 - momentum: 0.000000
2023-10-18 19:16:49,331 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:49,331 EPOCH 2 done: loss 0.4258 - lr: 0.000045
2023-10-18 19:16:56,554 DEV : loss 0.30439963936805725 - f1-score (micro avg) 0.3835
2023-10-18 19:16:56,582 saving best model
2023-10-18 19:16:56,615 ----------------------------------------------------------------------------------------------------
2023-10-18 19:16:58,341 epoch 3 - iter 73/738 - loss 0.38212163 - time (sec): 1.73 - samples/sec: 9041.67 - lr: 0.000044 - momentum: 0.000000
2023-10-18 19:17:00,082 epoch 3 - iter 146/738 - loss 0.37123255 - time (sec): 3.47 - samples/sec: 8893.99 - lr: 0.000043 - momentum: 0.000000
2023-10-18 19:17:01,813 epoch 3 - iter 219/738 - loss 0.38122082 - time (sec): 5.20 - samples/sec: 8881.29 - lr: 0.000043 - momentum: 0.000000
2023-10-18 19:17:03,613 epoch 3 - iter 292/738 - loss 0.37054219 - time (sec): 7.00 - samples/sec: 9093.43 - lr: 0.000042 - momentum: 0.000000
2023-10-18 19:17:05,407 epoch 3 - iter 365/738 - loss 0.36975075 - time (sec): 8.79 - samples/sec: 9310.43 - lr: 0.000042 - momentum: 0.000000
2023-10-18 19:17:07,251 epoch 3 - iter 438/738 - loss 0.36348837 - time (sec): 10.64 - samples/sec: 9318.23 - lr: 0.000041 - momentum: 0.000000
2023-10-18 19:17:09,105 epoch 3 - iter 511/738 - loss 0.35957823 - time (sec): 12.49 - samples/sec: 9327.22 - lr: 0.000041 - momentum: 0.000000
2023-10-18 19:17:10,798 epoch 3 - iter 584/738 - loss 0.35718051 - time (sec): 14.18 - samples/sec: 9308.65 - lr: 0.000040 - momentum: 0.000000
2023-10-18 19:17:12,488 epoch 3 - iter 657/738 - loss 0.35703417 - time (sec): 15.87 - samples/sec: 9334.93 - lr: 0.000040 - momentum: 0.000000
2023-10-18 19:17:14,156 epoch 3 - iter 730/738 - loss 0.35392752 - time (sec): 17.54 - samples/sec: 9316.96 - lr: 0.000039 - momentum: 0.000000
2023-10-18 19:17:14,395 ----------------------------------------------------------------------------------------------------
2023-10-18 19:17:14,395 EPOCH 3 done: loss 0.3546 - lr: 0.000039
2023-10-18 19:17:21,627 DEV : loss 0.2697567641735077 - f1-score (micro avg) 0.4385
2023-10-18 19:17:21,654 saving best model
2023-10-18 19:17:21,689 ----------------------------------------------------------------------------------------------------
2023-10-18 19:17:23,414 epoch 4 - iter 73/738 - loss 0.31534660 - time (sec): 1.72 - samples/sec: 9312.37 - lr: 0.000038 - momentum: 0.000000
2023-10-18 19:17:25,190 epoch 4 - iter 146/738 - loss 0.31583539 - time (sec): 3.50 - samples/sec: 9459.36 - lr: 0.000038 - momentum: 0.000000
2023-10-18 19:17:26,931 epoch 4 - iter 219/738 - loss 0.31563059 - time (sec): 5.24 - samples/sec: 9568.13 - lr: 0.000037 - momentum: 0.000000
2023-10-18 19:17:28,602 epoch 4 - iter 292/738 - loss 0.32238830 - time (sec): 6.91 - samples/sec: 9481.86 - lr: 0.000037 - momentum: 0.000000
2023-10-18 19:17:30,332 epoch 4 - iter 365/738 - loss 0.32299176 - time (sec): 8.64 - samples/sec: 9408.69 - lr: 0.000036 - momentum: 0.000000
2023-10-18 19:17:32,244 epoch 4 - iter 438/738 - loss 0.32181537 - time (sec): 10.55 - samples/sec: 9483.55 - lr: 0.000036 - momentum: 0.000000
2023-10-18 19:17:33,999 epoch 4 - iter 511/738 - loss 0.31852674 - time (sec): 12.31 - samples/sec: 9396.46 - lr: 0.000035 - momentum: 0.000000
2023-10-18 19:17:35,735 epoch 4 - iter 584/738 - loss 0.31761959 - time (sec): 14.05 - samples/sec: 9372.05 - lr: 0.000035 - momentum: 0.000000
2023-10-18 19:17:37,460 epoch 4 - iter 657/738 - loss 0.31666181 - time (sec): 15.77 - samples/sec: 9381.59 - lr: 0.000034 - momentum: 0.000000
2023-10-18 19:17:39,337 epoch 4 - iter 730/738 - loss 0.31100660 - time (sec): 17.65 - samples/sec: 9344.17 - lr: 0.000033 - momentum: 0.000000
2023-10-18 19:17:39,517 ----------------------------------------------------------------------------------------------------
2023-10-18 19:17:39,518 EPOCH 4 done: loss 0.3105 - lr: 0.000033
2023-10-18 19:17:46,781 DEV : loss 0.24989160895347595 - f1-score (micro avg) 0.4738
2023-10-18 19:17:46,809 saving best model
2023-10-18 19:17:46,842 ----------------------------------------------------------------------------------------------------
2023-10-18 19:17:48,549 epoch 5 - iter 73/738 - loss 0.33889059 - time (sec): 1.71 - samples/sec: 9758.31 - lr: 0.000033 - momentum: 0.000000
2023-10-18 19:17:50,329 epoch 5 - iter 146/738 - loss 0.30240545 - time (sec): 3.49 - samples/sec: 9865.12 - lr: 0.000032 - momentum: 0.000000
2023-10-18 19:17:52,054 epoch 5 - iter 219/738 - loss 0.29939293 - time (sec): 5.21 - samples/sec: 9604.33 - lr: 0.000032 - momentum: 0.000000
2023-10-18 19:17:53,799 epoch 5 - iter 292/738 - loss 0.29440439 - time (sec): 6.96 - samples/sec: 9486.80 - lr: 0.000031 - momentum: 0.000000
2023-10-18 19:17:55,494 epoch 5 - iter 365/738 - loss 0.29326400 - time (sec): 8.65 - samples/sec: 9425.14 - lr: 0.000031 - momentum: 0.000000
2023-10-18 19:17:57,245 epoch 5 - iter 438/738 - loss 0.29384822 - time (sec): 10.40 - samples/sec: 9429.18 - lr: 0.000030 - momentum: 0.000000
2023-10-18 19:17:58,964 epoch 5 - iter 511/738 - loss 0.29217563 - time (sec): 12.12 - samples/sec: 9447.95 - lr: 0.000030 - momentum: 0.000000
2023-10-18 19:18:00,728 epoch 5 - iter 584/738 - loss 0.28778672 - time (sec): 13.89 - samples/sec: 9410.97 - lr: 0.000029 - momentum: 0.000000
2023-10-18 19:18:02,548 epoch 5 - iter 657/738 - loss 0.28493520 - time (sec): 15.71 - samples/sec: 9352.08 - lr: 0.000028 - momentum: 0.000000
2023-10-18 19:18:04,328 epoch 5 - iter 730/738 - loss 0.28263463 - time (sec): 17.49 - samples/sec: 9403.28 - lr: 0.000028 - momentum: 0.000000
2023-10-18 19:18:04,519 ----------------------------------------------------------------------------------------------------
2023-10-18 19:18:04,519 EPOCH 5 done: loss 0.2819 - lr: 0.000028
2023-10-18 19:18:11,788 DEV : loss 0.24010828137397766 - f1-score (micro avg) 0.5029
2023-10-18 19:18:11,815 saving best model
2023-10-18 19:18:11,849 ----------------------------------------------------------------------------------------------------
2023-10-18 19:18:13,695 epoch 6 - iter 73/738 - loss 0.29763220 - time (sec): 1.85 - samples/sec: 9990.87 - lr: 0.000027 - momentum: 0.000000
2023-10-18 19:18:15,418 epoch 6 - iter 146/738 - loss 0.28553332 - time (sec): 3.57 - samples/sec: 9318.60 - lr: 0.000027 - momentum: 0.000000
2023-10-18 19:18:17,163 epoch 6 - iter 219/738 - loss 0.27904513 - time (sec): 5.31 - samples/sec: 9345.50 - lr: 0.000026 - momentum: 0.000000
2023-10-18 19:18:18,941 epoch 6 - iter 292/738 - loss 0.26679853 - time (sec): 7.09 - samples/sec: 9280.39 - lr: 0.000026 - momentum: 0.000000
2023-10-18 19:18:21,255 epoch 6 - iter 365/738 - loss 0.26616402 - time (sec): 9.41 - samples/sec: 8894.81 - lr: 0.000025 - momentum: 0.000000
2023-10-18 19:18:23,013 epoch 6 - iter 438/738 - loss 0.26698244 - time (sec): 11.16 - samples/sec: 8904.33 - lr: 0.000025 - momentum: 0.000000
2023-10-18 19:18:24,785 epoch 6 - iter 511/738 - loss 0.26719563 - time (sec): 12.94 - samples/sec: 8828.95 - lr: 0.000024 - momentum: 0.000000
2023-10-18 19:18:26,509 epoch 6 - iter 584/738 - loss 0.26332397 - time (sec): 14.66 - samples/sec: 8892.33 - lr: 0.000023 - momentum: 0.000000
2023-10-18 19:18:28,257 epoch 6 - iter 657/738 - loss 0.26303038 - time (sec): 16.41 - samples/sec: 8972.92 - lr: 0.000023 - momentum: 0.000000
2023-10-18 19:18:29,999 epoch 6 - iter 730/738 - loss 0.26036240 - time (sec): 18.15 - samples/sec: 9064.22 - lr: 0.000022 - momentum: 0.000000
2023-10-18 19:18:30,189 ----------------------------------------------------------------------------------------------------
2023-10-18 19:18:30,189 EPOCH 6 done: loss 0.2592 - lr: 0.000022
2023-10-18 19:18:37,482 DEV : loss 0.23162847757339478 - f1-score (micro avg) 0.5326
2023-10-18 19:18:37,509 saving best model
2023-10-18 19:18:37,542 ----------------------------------------------------------------------------------------------------
2023-10-18 19:18:39,340 epoch 7 - iter 73/738 - loss 0.26477648 - time (sec): 1.80 - samples/sec: 8805.94 - lr: 0.000022 - momentum: 0.000000
2023-10-18 19:18:41,055 epoch 7 - iter 146/738 - loss 0.25666080 - time (sec): 3.51 - samples/sec: 9218.25 - lr: 0.000021 - momentum: 0.000000
2023-10-18 19:18:42,803 epoch 7 - iter 219/738 - loss 0.25357439 - time (sec): 5.26 - samples/sec: 9349.12 - lr: 0.000021 - momentum: 0.000000
2023-10-18 19:18:44,536 epoch 7 - iter 292/738 - loss 0.25298532 - time (sec): 6.99 - samples/sec: 9290.53 - lr: 0.000020 - momentum: 0.000000
2023-10-18 19:18:46,336 epoch 7 - iter 365/738 - loss 0.24890974 - time (sec): 8.79 - samples/sec: 9295.19 - lr: 0.000020 - momentum: 0.000000
2023-10-18 19:18:48,116 epoch 7 - iter 438/738 - loss 0.24820722 - time (sec): 10.57 - samples/sec: 9226.06 - lr: 0.000019 - momentum: 0.000000
2023-10-18 19:18:49,919 epoch 7 - iter 511/738 - loss 0.25007629 - time (sec): 12.38 - samples/sec: 9260.81 - lr: 0.000018 - momentum: 0.000000
2023-10-18 19:18:51,614 epoch 7 - iter 584/738 - loss 0.25047113 - time (sec): 14.07 - samples/sec: 9270.02 - lr: 0.000018 - momentum: 0.000000
2023-10-18 19:18:53,422 epoch 7 - iter 657/738 - loss 0.24687633 - time (sec): 15.88 - samples/sec: 9360.24 - lr: 0.000017 - momentum: 0.000000
2023-10-18 19:18:55,163 epoch 7 - iter 730/738 - loss 0.24542372 - time (sec): 17.62 - samples/sec: 9350.51 - lr: 0.000017 - momentum: 0.000000
2023-10-18 19:18:55,353 ----------------------------------------------------------------------------------------------------
2023-10-18 19:18:55,353 EPOCH 7 done: loss 0.2448 - lr: 0.000017
2023-10-18 19:19:02,608 DEV : loss 0.22602201998233795 - f1-score (micro avg) 0.5256
2023-10-18 19:19:02,637 ----------------------------------------------------------------------------------------------------
2023-10-18 19:19:04,311 epoch 8 - iter 73/738 - loss 0.24423595 - time (sec): 1.67 - samples/sec: 9745.79 - lr: 0.000016 - momentum: 0.000000
2023-10-18 19:19:06,038 epoch 8 - iter 146/738 - loss 0.23121395 - time (sec): 3.40 - samples/sec: 9164.06 - lr: 0.000016 - momentum: 0.000000
2023-10-18 19:19:07,781 epoch 8 - iter 219/738 - loss 0.22578664 - time (sec): 5.14 - samples/sec: 9300.66 - lr: 0.000015 - momentum: 0.000000
2023-10-18 19:19:09,520 epoch 8 - iter 292/738 - loss 0.22931143 - time (sec): 6.88 - samples/sec: 9270.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 19:19:11,191 epoch 8 - iter 365/738 - loss 0.23131383 - time (sec): 8.55 - samples/sec: 9206.24 - lr: 0.000014 - momentum: 0.000000
2023-10-18 19:19:13,033 epoch 8 - iter 438/738 - loss 0.23177533 - time (sec): 10.39 - samples/sec: 9213.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 19:19:14,746 epoch 8 - iter 511/738 - loss 0.23071403 - time (sec): 12.11 - samples/sec: 9212.74 - lr: 0.000013 - momentum: 0.000000
2023-10-18 19:19:16,470 epoch 8 - iter 584/738 - loss 0.22836249 - time (sec): 13.83 - samples/sec: 9299.07 - lr: 0.000012 - momentum: 0.000000
2023-10-18 19:19:18,250 epoch 8 - iter 657/738 - loss 0.22818409 - time (sec): 15.61 - samples/sec: 9358.07 - lr: 0.000012 - momentum: 0.000000
2023-10-18 19:19:20,147 epoch 8 - iter 730/738 - loss 0.23096993 - time (sec): 17.51 - samples/sec: 9427.60 - lr: 0.000011 - momentum: 0.000000
2023-10-18 19:19:20,335 ----------------------------------------------------------------------------------------------------
2023-10-18 19:19:20,335 EPOCH 8 done: loss 0.2311 - lr: 0.000011
2023-10-18 19:19:27,642 DEV : loss 0.22594498097896576 - f1-score (micro avg) 0.5422
2023-10-18 19:19:27,671 saving best model
2023-10-18 19:19:27,703 ----------------------------------------------------------------------------------------------------
2023-10-18 19:19:29,484 epoch 9 - iter 73/738 - loss 0.18455478 - time (sec): 1.78 - samples/sec: 8727.95 - lr: 0.000011 - momentum: 0.000000
2023-10-18 19:19:31,203 epoch 9 - iter 146/738 - loss 0.20931099 - time (sec): 3.50 - samples/sec: 9088.75 - lr: 0.000010 - momentum: 0.000000
2023-10-18 19:19:32,887 epoch 9 - iter 219/738 - loss 0.20975404 - time (sec): 5.18 - samples/sec: 9297.14 - lr: 0.000010 - momentum: 0.000000
2023-10-18 19:19:34,653 epoch 9 - iter 292/738 - loss 0.22411280 - time (sec): 6.95 - samples/sec: 9389.15 - lr: 0.000009 - momentum: 0.000000
2023-10-18 19:19:36,405 epoch 9 - iter 365/738 - loss 0.22562147 - time (sec): 8.70 - samples/sec: 9441.59 - lr: 0.000008 - momentum: 0.000000
2023-10-18 19:19:38,076 epoch 9 - iter 438/738 - loss 0.22642686 - time (sec): 10.37 - samples/sec: 9345.82 - lr: 0.000008 - momentum: 0.000000
2023-10-18 19:19:39,738 epoch 9 - iter 511/738 - loss 0.22750691 - time (sec): 12.03 - samples/sec: 9401.08 - lr: 0.000007 - momentum: 0.000000
2023-10-18 19:19:41,543 epoch 9 - iter 584/738 - loss 0.22803447 - time (sec): 13.84 - samples/sec: 9450.52 - lr: 0.000007 - momentum: 0.000000
2023-10-18 19:19:43,400 epoch 9 - iter 657/738 - loss 0.22389397 - time (sec): 15.70 - samples/sec: 9480.63 - lr: 0.000006 - momentum: 0.000000
2023-10-18 19:19:45,138 epoch 9 - iter 730/738 - loss 0.22337068 - time (sec): 17.43 - samples/sec: 9453.91 - lr: 0.000006 - momentum: 0.000000
2023-10-18 19:19:45,332 ----------------------------------------------------------------------------------------------------
2023-10-18 19:19:45,332 EPOCH 9 done: loss 0.2240 - lr: 0.000006
2023-10-18 19:19:52,605 DEV : loss 0.22624309360980988 - f1-score (micro avg) 0.5362
2023-10-18 19:19:52,632 ----------------------------------------------------------------------------------------------------
2023-10-18 19:19:54,345 epoch 10 - iter 73/738 - loss 0.24146970 - time (sec): 1.71 - samples/sec: 9516.11 - lr: 0.000005 - momentum: 0.000000
2023-10-18 19:19:56,147 epoch 10 - iter 146/738 - loss 0.23503637 - time (sec): 3.51 - samples/sec: 9734.76 - lr: 0.000004 - momentum: 0.000000
2023-10-18 19:19:57,897 epoch 10 - iter 219/738 - loss 0.23583201 - time (sec): 5.26 - samples/sec: 9531.90 - lr: 0.000004 - momentum: 0.000000
2023-10-18 19:20:00,098 epoch 10 - iter 292/738 - loss 0.22826275 - time (sec): 7.47 - samples/sec: 9052.37 - lr: 0.000003 - momentum: 0.000000
2023-10-18 19:20:01,827 epoch 10 - iter 365/738 - loss 0.22769440 - time (sec): 9.19 - samples/sec: 8993.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 19:20:03,517 epoch 10 - iter 438/738 - loss 0.22412354 - time (sec): 10.88 - samples/sec: 8956.87 - lr: 0.000002 - momentum: 0.000000
2023-10-18 19:20:05,191 epoch 10 - iter 511/738 - loss 0.22446585 - time (sec): 12.56 - samples/sec: 9001.05 - lr: 0.000002 - momentum: 0.000000
2023-10-18 19:20:06,948 epoch 10 - iter 584/738 - loss 0.22796532 - time (sec): 14.31 - samples/sec: 9050.23 - lr: 0.000001 - momentum: 0.000000
2023-10-18 19:20:08,702 epoch 10 - iter 657/738 - loss 0.22207363 - time (sec): 16.07 - samples/sec: 9167.06 - lr: 0.000001 - momentum: 0.000000
2023-10-18 19:20:10,373 epoch 10 - iter 730/738 - loss 0.22025921 - time (sec): 17.74 - samples/sec: 9298.67 - lr: 0.000000 - momentum: 0.000000
2023-10-18 19:20:10,543 ----------------------------------------------------------------------------------------------------
2023-10-18 19:20:10,543 EPOCH 10 done: loss 0.2207 - lr: 0.000000
2023-10-18 19:20:17,848 DEV : loss 0.2269122153520584 - f1-score (micro avg) 0.5457
2023-10-18 19:20:17,877 saving best model
2023-10-18 19:20:17,943 ----------------------------------------------------------------------------------------------------
2023-10-18 19:20:17,943 Loading model from best epoch ...
2023-10-18 19:20:18,024 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-18 19:20:20,721
Results:
- F-score (micro) 0.5456
- F-score (macro) 0.3348
- Accuracy 0.3993
By class:
              precision    recall  f1-score   support
         loc     0.5688    0.7657    0.6528       858
        pers     0.4237    0.5121    0.4637       537
         org     0.2000    0.0530    0.0838       132
        time     0.4500    0.5000    0.4737        54
        prod     0.0000    0.0000    0.0000        61
   micro avg     0.5087    0.5883    0.5456      1642
   macro avg     0.3285    0.3662    0.3348      1642
weighted avg     0.4667    0.5883    0.5151      1642
2023-10-18 19:20:20,721 ----------------------------------------------------------------------------------------------------
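The averaged rows of the final report can be cross-checked from the per-class rows. The sketch below is a sanity check only, not part of the Flair evaluation code. It assumes the usual definitions: macro F1 is the unweighted mean of the class F1 scores, weighted F1 is their support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. The names `per_class` and `harmonic_f1` are illustrative, and all numbers are copied from the report above.

```python
# Per-class (precision, recall, f1, support), copied from the report above.
per_class = {
    "loc": (0.5688, 0.7657, 0.6528, 858),
    "pers": (0.4237, 0.5121, 0.4637, 537),
    "org": (0.2000, 0.0530, 0.0838, 132),
    "time": (0.4500, 0.5000, 0.4737, 54),
    "prod": (0.0000, 0.0000, 0.0000, 61),
}


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro average: computed from the reported micro precision and recall.
micro_f1 = harmonic_f1(0.5087, 0.5883)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The gap between micro F1 (0.5456) and macro F1 (0.3348) comes from the `org` and `prod` classes: they contribute little to the micro average because of their small support, but each counts as one fifth of the macro average.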