2023-10-13 13:52:18,187 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,188 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-13 13:52:18,188 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,188 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:52:18,188 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,188 Train: 3575 sentences
2023-10-13 13:52:18,188 (train_with_dev=False, train_with_test=False)
2023-10-13 13:52:18,188 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,188 Training Params:
2023-10-13 13:52:18,188  - learning_rate: "5e-05"
2023-10-13 13:52:18,188  - mini_batch_size: "8"
2023-10-13 13:52:18,188  - max_epochs: "10"
2023-10-13 13:52:18,188  - shuffle: "True"
2023-10-13 13:52:18,188 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,188 Plugins:
2023-10-13 13:52:18,188  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:52:18,188 ----------------------------------------------------------------------------------------------------
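The per-iteration `lr` values in the epoch lines below are consistent with a linear warmup-then-decay schedule: with 447 batches per epoch over 10 epochs (4470 steps) and `warmup_fraction: 0.1`, the rate ramps to the 5e-05 peak during the first 447 steps and then decays linearly to zero. A minimal reconstruction of that schedule (my own sketch, not the plugin's actual code):

```python
def linear_lr(step, total_steps=4470, peak=5e-05, warmup_fraction=0.1):
    """Linear warmup to `peak`, then linear decay to zero (assumed schedule)."""
    warmup = int(total_steps * warmup_fraction)  # 447 steps here
    if step < warmup:
        return peak * step / warmup
    return peak * (total_steps - step) / (total_steps - warmup)

# Matches the logged values, e.g. epoch 1 at iter 44 and iter 440:
print(f"{linear_lr(44):.6f}")   # → 0.000005
print(f"{linear_lr(440):.6f}")  # → 0.000049
```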
|
2023-10-13 13:52:18,188 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:52:18,188  - metric: "('micro avg', 'f1-score')"
2023-10-13 13:52:18,188 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,188 Computation:
2023-10-13 13:52:18,189  - compute on device: cuda:0
2023-10-13 13:52:18,189  - embedding storage: none
2023-10-13 13:52:18,189 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,189 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 13:52:18,189 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:18,189 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:52:21,622 epoch 1 - iter 44/447 - loss 2.80304392 - time (sec): 3.43 - samples/sec: 2855.55 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:52:24,408 epoch 1 - iter 88/447 - loss 1.89690895 - time (sec): 6.22 - samples/sec: 3020.50 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:52:27,111 epoch 1 - iter 132/447 - loss 1.47044904 - time (sec): 8.92 - samples/sec: 3039.37 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:52:30,098 epoch 1 - iter 176/447 - loss 1.20785199 - time (sec): 11.91 - samples/sec: 3003.46 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:52:32,766 epoch 1 - iter 220/447 - loss 1.03327036 - time (sec): 14.58 - samples/sec: 3040.46 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:52:35,528 epoch 1 - iter 264/447 - loss 0.91266769 - time (sec): 17.34 - samples/sec: 3041.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:52:38,240 epoch 1 - iter 308/447 - loss 0.82567581 - time (sec): 20.05 - samples/sec: 3043.77 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:52:40,937 epoch 1 - iter 352/447 - loss 0.75655968 - time (sec): 22.75 - samples/sec: 3047.57 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:52:43,500 epoch 1 - iter 396/447 - loss 0.70566723 - time (sec): 25.31 - samples/sec: 3048.34 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:52:46,292 epoch 1 - iter 440/447 - loss 0.65893566 - time (sec): 28.10 - samples/sec: 3042.80 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:52:46,683 ----------------------------------------------------------------------------------------------------
2023-10-13 13:52:46,683 EPOCH 1 done: loss 0.6536 - lr: 0.000049
2023-10-13 13:52:51,632 DEV : loss 0.1805853396654129 - f1-score (micro avg) 0.5981
2023-10-13 13:52:51,663 saving best model
2023-10-13 13:52:51,967 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:52:54,970 epoch 2 - iter 44/447 - loss 0.21737154 - time (sec): 3.00 - samples/sec: 3049.14 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:52:57,708 epoch 2 - iter 88/447 - loss 0.19897717 - time (sec): 5.74 - samples/sec: 3056.43 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:53:00,436 epoch 2 - iter 132/447 - loss 0.18115887 - time (sec): 8.47 - samples/sec: 3036.94 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:53:03,011 epoch 2 - iter 176/447 - loss 0.17788084 - time (sec): 11.04 - samples/sec: 3009.71 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:53:05,790 epoch 2 - iter 220/447 - loss 0.17534041 - time (sec): 13.82 - samples/sec: 2995.56 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:53:08,645 epoch 2 - iter 264/447 - loss 0.17589414 - time (sec): 16.68 - samples/sec: 3002.72 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:53:11,430 epoch 2 - iter 308/447 - loss 0.17338298 - time (sec): 19.46 - samples/sec: 3022.60 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:53:14,245 epoch 2 - iter 352/447 - loss 0.16761658 - time (sec): 22.28 - samples/sec: 3022.69 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:53:17,241 epoch 2 - iter 396/447 - loss 0.16105209 - time (sec): 25.27 - samples/sec: 3022.47 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:53:20,143 epoch 2 - iter 440/447 - loss 0.15642468 - time (sec): 28.17 - samples/sec: 3017.36 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:53:20,611 ----------------------------------------------------------------------------------------------------
2023-10-13 13:53:20,611 EPOCH 2 done: loss 0.1550 - lr: 0.000045
2023-10-13 13:53:29,109 DEV : loss 0.15057964622974396 - f1-score (micro avg) 0.7063
2023-10-13 13:53:29,142 saving best model
2023-10-13 13:53:29,537 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:53:32,262 epoch 3 - iter 44/447 - loss 0.07579220 - time (sec): 2.72 - samples/sec: 3332.73 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:53:34,966 epoch 3 - iter 88/447 - loss 0.07941142 - time (sec): 5.43 - samples/sec: 3273.81 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:53:38,158 epoch 3 - iter 132/447 - loss 0.08409869 - time (sec): 8.62 - samples/sec: 3137.62 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:53:41,107 epoch 3 - iter 176/447 - loss 0.08454449 - time (sec): 11.57 - samples/sec: 3110.22 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:53:43,996 epoch 3 - iter 220/447 - loss 0.08496677 - time (sec): 14.46 - samples/sec: 3039.05 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:53:46,740 epoch 3 - iter 264/447 - loss 0.08464658 - time (sec): 17.20 - samples/sec: 3043.20 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:53:49,549 epoch 3 - iter 308/447 - loss 0.08359817 - time (sec): 20.01 - samples/sec: 3002.88 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:53:52,567 epoch 3 - iter 352/447 - loss 0.08430508 - time (sec): 23.03 - samples/sec: 2967.62 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:53:55,292 epoch 3 - iter 396/447 - loss 0.08382783 - time (sec): 25.75 - samples/sec: 2995.30 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:53:57,974 epoch 3 - iter 440/447 - loss 0.08420380 - time (sec): 28.43 - samples/sec: 2998.20 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:53:58,449 ----------------------------------------------------------------------------------------------------
2023-10-13 13:53:58,450 EPOCH 3 done: loss 0.0847 - lr: 0.000039
2023-10-13 13:54:07,082 DEV : loss 0.1508309543132782 - f1-score (micro avg) 0.7244
2023-10-13 13:54:07,114 saving best model
2023-10-13 13:54:07,523 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:54:10,479 epoch 4 - iter 44/447 - loss 0.05304508 - time (sec): 2.95 - samples/sec: 3103.87 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:54:13,204 epoch 4 - iter 88/447 - loss 0.05490691 - time (sec): 5.68 - samples/sec: 3049.47 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:54:16,460 epoch 4 - iter 132/447 - loss 0.05603952 - time (sec): 8.93 - samples/sec: 3062.78 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:54:19,244 epoch 4 - iter 176/447 - loss 0.05430011 - time (sec): 11.72 - samples/sec: 3015.42 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:54:22,000 epoch 4 - iter 220/447 - loss 0.05473667 - time (sec): 14.47 - samples/sec: 3016.35 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:54:25,103 epoch 4 - iter 264/447 - loss 0.05343876 - time (sec): 17.58 - samples/sec: 3017.62 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:54:27,815 epoch 4 - iter 308/447 - loss 0.05373041 - time (sec): 20.29 - samples/sec: 3016.27 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:54:30,485 epoch 4 - iter 352/447 - loss 0.05298263 - time (sec): 22.96 - samples/sec: 3009.46 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:54:33,206 epoch 4 - iter 396/447 - loss 0.05489649 - time (sec): 25.68 - samples/sec: 2998.60 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:54:36,028 epoch 4 - iter 440/447 - loss 0.05438705 - time (sec): 28.50 - samples/sec: 2992.42 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:54:36,485 ----------------------------------------------------------------------------------------------------
2023-10-13 13:54:36,485 EPOCH 4 done: loss 0.0544 - lr: 0.000033
2023-10-13 13:54:45,181 DEV : loss 0.1490735560655594 - f1-score (micro avg) 0.7535
2023-10-13 13:54:45,214 saving best model
2023-10-13 13:54:45,613 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:54:48,582 epoch 5 - iter 44/447 - loss 0.03436720 - time (sec): 2.97 - samples/sec: 2904.86 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:54:51,323 epoch 5 - iter 88/447 - loss 0.03358143 - time (sec): 5.71 - samples/sec: 2951.13 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:54:53,986 epoch 5 - iter 132/447 - loss 0.03188978 - time (sec): 8.37 - samples/sec: 2969.83 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:54:56,951 epoch 5 - iter 176/447 - loss 0.03397266 - time (sec): 11.34 - samples/sec: 3032.31 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:54:59,853 epoch 5 - iter 220/447 - loss 0.03755977 - time (sec): 14.24 - samples/sec: 3047.91 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:55:02,845 epoch 5 - iter 264/447 - loss 0.03734981 - time (sec): 17.23 - samples/sec: 3036.76 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:55:05,483 epoch 5 - iter 308/447 - loss 0.03737218 - time (sec): 19.87 - samples/sec: 3041.74 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:55:08,295 epoch 5 - iter 352/447 - loss 0.03758080 - time (sec): 22.68 - samples/sec: 3046.49 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:55:10,968 epoch 5 - iter 396/447 - loss 0.03764803 - time (sec): 25.35 - samples/sec: 3037.80 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:55:13,794 epoch 5 - iter 440/447 - loss 0.03690605 - time (sec): 28.18 - samples/sec: 3024.48 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:55:14,221 ----------------------------------------------------------------------------------------------------
2023-10-13 13:55:14,221 EPOCH 5 done: loss 0.0367 - lr: 0.000028
2023-10-13 13:55:22,677 DEV : loss 0.1890304833650589 - f1-score (micro avg) 0.7566
2023-10-13 13:55:22,709 saving best model
2023-10-13 13:55:23,122 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:55:25,741 epoch 6 - iter 44/447 - loss 0.01994334 - time (sec): 2.62 - samples/sec: 3158.60 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:55:28,554 epoch 6 - iter 88/447 - loss 0.02422846 - time (sec): 5.43 - samples/sec: 3070.83 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:55:31,909 epoch 6 - iter 132/447 - loss 0.02376942 - time (sec): 8.79 - samples/sec: 3058.05 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:55:34,772 epoch 6 - iter 176/447 - loss 0.02490342 - time (sec): 11.65 - samples/sec: 3048.64 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:55:37,643 epoch 6 - iter 220/447 - loss 0.02396279 - time (sec): 14.52 - samples/sec: 3083.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:55:40,417 epoch 6 - iter 264/447 - loss 0.02424836 - time (sec): 17.29 - samples/sec: 3065.45 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:55:43,124 epoch 6 - iter 308/447 - loss 0.02495930 - time (sec): 20.00 - samples/sec: 3046.22 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:55:45,698 epoch 6 - iter 352/447 - loss 0.02502682 - time (sec): 22.57 - samples/sec: 3065.48 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:55:48,469 epoch 6 - iter 396/447 - loss 0.02486617 - time (sec): 25.35 - samples/sec: 3063.90 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:55:51,039 epoch 6 - iter 440/447 - loss 0.02471361 - time (sec): 27.92 - samples/sec: 3051.50 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:55:51,467 ----------------------------------------------------------------------------------------------------
2023-10-13 13:55:51,467 EPOCH 6 done: loss 0.0245 - lr: 0.000022
2023-10-13 13:56:00,211 DEV : loss 0.20025420188903809 - f1-score (micro avg) 0.7608
2023-10-13 13:56:00,243 saving best model
2023-10-13 13:56:00,662 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:56:03,244 epoch 7 - iter 44/447 - loss 0.03159210 - time (sec): 2.58 - samples/sec: 2982.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:56:05,871 epoch 7 - iter 88/447 - loss 0.02117470 - time (sec): 5.21 - samples/sec: 2970.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:56:09,050 epoch 7 - iter 132/447 - loss 0.01682505 - time (sec): 8.39 - samples/sec: 2985.75 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:56:11,821 epoch 7 - iter 176/447 - loss 0.01583735 - time (sec): 11.16 - samples/sec: 3021.13 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:56:14,739 epoch 7 - iter 220/447 - loss 0.01568590 - time (sec): 14.07 - samples/sec: 3019.51 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:56:17,684 epoch 7 - iter 264/447 - loss 0.01455990 - time (sec): 17.02 - samples/sec: 2983.40 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:56:20,385 epoch 7 - iter 308/447 - loss 0.01564589 - time (sec): 19.72 - samples/sec: 2994.82 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:56:23,491 epoch 7 - iter 352/447 - loss 0.01560294 - time (sec): 22.83 - samples/sec: 2982.24 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:56:26,186 epoch 7 - iter 396/447 - loss 0.01510732 - time (sec): 25.52 - samples/sec: 2999.36 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:56:28,972 epoch 7 - iter 440/447 - loss 0.01441825 - time (sec): 28.31 - samples/sec: 3009.18 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:56:29,419 ----------------------------------------------------------------------------------------------------
2023-10-13 13:56:29,419 EPOCH 7 done: loss 0.0143 - lr: 0.000017
2023-10-13 13:56:38,590 DEV : loss 0.22453096508979797 - f1-score (micro avg) 0.7614
2023-10-13 13:56:38,637 saving best model
2023-10-13 13:56:39,102 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:56:42,027 epoch 8 - iter 44/447 - loss 0.02136768 - time (sec): 2.92 - samples/sec: 2850.76 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:56:44,869 epoch 8 - iter 88/447 - loss 0.01238681 - time (sec): 5.76 - samples/sec: 2874.37 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:56:47,878 epoch 8 - iter 132/447 - loss 0.01270474 - time (sec): 8.77 - samples/sec: 2944.00 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:56:50,949 epoch 8 - iter 176/447 - loss 0.01180657 - time (sec): 11.84 - samples/sec: 2964.03 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:56:53,931 epoch 8 - iter 220/447 - loss 0.01182640 - time (sec): 14.83 - samples/sec: 2944.03 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:56:56,513 epoch 8 - iter 264/447 - loss 0.01218369 - time (sec): 17.41 - samples/sec: 2972.81 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:56:59,336 epoch 8 - iter 308/447 - loss 0.01137181 - time (sec): 20.23 - samples/sec: 2969.57 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:57:02,018 epoch 8 - iter 352/447 - loss 0.01080205 - time (sec): 22.91 - samples/sec: 2986.53 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:57:04,779 epoch 8 - iter 396/447 - loss 0.00997545 - time (sec): 25.68 - samples/sec: 2988.77 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:57:07,732 epoch 8 - iter 440/447 - loss 0.00947122 - time (sec): 28.63 - samples/sec: 2981.97 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:57:08,116 ----------------------------------------------------------------------------------------------------
2023-10-13 13:57:08,116 EPOCH 8 done: loss 0.0094 - lr: 0.000011
2023-10-13 13:57:16,674 DEV : loss 0.23534463346004486 - f1-score (micro avg) 0.783
2023-10-13 13:57:16,705 saving best model
2023-10-13 13:57:17,051 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:57:19,655 epoch 9 - iter 44/447 - loss 0.00423546 - time (sec): 2.60 - samples/sec: 3143.55 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:57:22,730 epoch 9 - iter 88/447 - loss 0.00445111 - time (sec): 5.68 - samples/sec: 3081.95 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:57:25,724 epoch 9 - iter 132/447 - loss 0.00390275 - time (sec): 8.67 - samples/sec: 3066.27 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:57:28,490 epoch 9 - iter 176/447 - loss 0.00514473 - time (sec): 11.44 - samples/sec: 3064.22 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:57:31,331 epoch 9 - iter 220/447 - loss 0.00557898 - time (sec): 14.28 - samples/sec: 3024.88 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:57:34,094 epoch 9 - iter 264/447 - loss 0.00642635 - time (sec): 17.04 - samples/sec: 3018.71 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:57:36,796 epoch 9 - iter 308/447 - loss 0.00721538 - time (sec): 19.74 - samples/sec: 3036.37 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:57:39,544 epoch 9 - iter 352/447 - loss 0.00668215 - time (sec): 22.49 - samples/sec: 3049.79 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:57:42,501 epoch 9 - iter 396/447 - loss 0.00679635 - time (sec): 25.45 - samples/sec: 3024.59 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:57:45,295 epoch 9 - iter 440/447 - loss 0.00653626 - time (sec): 28.24 - samples/sec: 3020.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:57:45,715 ----------------------------------------------------------------------------------------------------
2023-10-13 13:57:45,715 EPOCH 9 done: loss 0.0064 - lr: 0.000006
2023-10-13 13:57:54,102 DEV : loss 0.23335954546928406 - f1-score (micro avg) 0.7839
2023-10-13 13:57:54,149 saving best model
2023-10-13 13:57:54,575 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:57:57,353 epoch 10 - iter 44/447 - loss 0.00288759 - time (sec): 2.78 - samples/sec: 3073.92 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:58:00,065 epoch 10 - iter 88/447 - loss 0.00330904 - time (sec): 5.49 - samples/sec: 2960.13 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:58:03,373 epoch 10 - iter 132/447 - loss 0.00277460 - time (sec): 8.80 - samples/sec: 2822.37 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:58:05,952 epoch 10 - iter 176/447 - loss 0.00487730 - time (sec): 11.38 - samples/sec: 2875.90 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:58:09,197 epoch 10 - iter 220/447 - loss 0.00445612 - time (sec): 14.62 - samples/sec: 2900.22 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:58:12,356 epoch 10 - iter 264/447 - loss 0.00440312 - time (sec): 17.78 - samples/sec: 2903.11 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:58:14,993 epoch 10 - iter 308/447 - loss 0.00421309 - time (sec): 20.42 - samples/sec: 2931.56 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:58:17,590 epoch 10 - iter 352/447 - loss 0.00405346 - time (sec): 23.01 - samples/sec: 2934.16 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:58:20,514 epoch 10 - iter 396/447 - loss 0.00477773 - time (sec): 25.94 - samples/sec: 2967.92 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:58:23,187 epoch 10 - iter 440/447 - loss 0.00456760 - time (sec): 28.61 - samples/sec: 2983.33 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:58:23,592 ----------------------------------------------------------------------------------------------------
2023-10-13 13:58:23,592 EPOCH 10 done: loss 0.0045 - lr: 0.000000
2023-10-13 13:58:31,942 DEV : loss 0.23924623429775238 - f1-score (micro avg) 0.7853
2023-10-13 13:58:31,976 saving best model
2023-10-13 13:58:32,683 ----------------------------------------------------------------------------------------------------
|
2023-10-13 13:58:32,684 Loading model from best epoch ...
2023-10-13 13:58:34,091 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
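The 21-tag dictionary above follows the BIOES scheme: `S-` marks a single-token entity, while `B-`/`I-`/`E-` mark the beginning, inside, and end of a multi-token span. A minimal, illustrative decoder for such tag sequences (my own helper, not part of Flair):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                     # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":                   # multi-token entity begins
            start, label = i, lab
        elif prefix == "E" and label == lab:  # entity ends; emit the span
            spans.append((lab, start, i + 1))
            start, label = None, None
        # "I" continues an open span; nothing to emit yet
    return spans

print(bioes_to_spans(["S-loc", "O", "B-pers", "I-pers", "E-pers"]))
# → [('loc', 0, 1), ('pers', 2, 5)]
```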
|
2023-10-13 13:58:39,814 
Results:
- F-score (micro) 0.7463
- F-score (macro) 0.6655
- Accuracy 0.6118

By class:
              precision    recall  f1-score   support

         loc     0.8367    0.8423    0.8395       596
        pers     0.6567    0.7928    0.7184       333
         org     0.5039    0.4848    0.4942       132
        prod     0.6038    0.4848    0.5378        66
        time     0.7037    0.7755    0.7379        49

   micro avg     0.7282    0.7653    0.7463      1176
   macro avg     0.6610    0.6761    0.6655      1176
weighted avg     0.7298    0.7653    0.7453      1176
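The summary rows can be cross-checked from the per-class rows: true positives per class follow from recall × support, predicted counts from TP ÷ precision, and the micro scores from the pooled counts, while macro F1 is the unweighted mean of the per-class F1 scores. A quick verification sketch (values transcribed from the table above):

```python
# (precision, recall, support) per class, from the table above
per_class = {
    "loc":  (0.8367, 0.8423, 596),
    "pers": (0.6567, 0.7928, 333),
    "org":  (0.5039, 0.4848, 132),
    "prod": (0.6038, 0.4848, 66),
    "time": (0.7037, 0.7755, 49),
}

tp = sum(r * s for _, r, s in per_class.values())        # ~900 true positives
pred = sum(r * s / p for p, r, s in per_class.values())  # ~1236 predicted spans
gold = sum(s for _, _, s in per_class.values())          # 1176 gold spans

micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(2 * p * r / (p + r) for p, r, _ in per_class.values()) / len(per_class)

print(f"micro f1 = {micro_f1:.4f}, macro f1 = {macro_f1:.4f}")
# → micro f1 = 0.7463, macro f1 = 0.6655
```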
|
2023-10-13 13:58:39,814 ----------------------------------------------------------------------------------------------------