2023-10-13 16:07:31,904 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,905 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 16:07:31,905 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,905 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 16:07:31,905 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,905 Train: 5901 sentences
2023-10-13 16:07:31,905 (train_with_dev=False, train_with_test=False)
2023-10-13 16:07:31,905 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,905 Training Params:
2023-10-13 16:07:31,905 - learning_rate: "5e-05"
2023-10-13 16:07:31,905 - mini_batch_size: "8"
2023-10-13 16:07:31,905 - max_epochs: "10"
2023-10-13 16:07:31,905 - shuffle: "True"
2023-10-13 16:07:31,906 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,906 Plugins:
2023-10-13 16:07:31,906 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 16:07:31,906 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,906 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 16:07:31,906 - metric: "('micro avg', 'f1-score')"
2023-10-13 16:07:31,906 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,906 Computation:
2023-10-13 16:07:31,906 - compute on device: cuda:0
2023-10-13 16:07:31,906 - embedding storage: none
2023-10-13 16:07:31,906 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,906 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 16:07:31,906 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:31,906 ----------------------------------------------------------------------------------------------------
2023-10-13 16:07:37,143 epoch 1 - iter 73/738 - loss 2.76598445 - time (sec): 5.24 - samples/sec: 3360.36 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:07:42,169 epoch 1 - iter 146/738 - loss 1.66719703 - time (sec): 10.26 - samples/sec: 3487.80 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:07:47,178 epoch 1 - iter 219/738 - loss 1.27255728 - time (sec): 15.27 - samples/sec: 3386.07 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:07:52,113 epoch 1 - iter 292/738 - loss 1.03524511 - time (sec): 20.21 - samples/sec: 3381.44 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:07:57,063 epoch 1 - iter 365/738 - loss 0.89357834 - time (sec): 25.16 - samples/sec: 3379.11 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:08:01,878 epoch 1 - iter 438/738 - loss 0.78652245 - time (sec): 29.97 - samples/sec: 3393.22 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:08:06,712 epoch 1 - iter 511/738 - loss 0.71094911 - time (sec): 34.80 - samples/sec: 3385.50 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:08:11,369 epoch 1 - iter 584/738 - loss 0.65439322 - time (sec): 39.46 - samples/sec: 3354.63 - lr: 0.000039 - momentum: 0.000000
2023-10-13 16:08:16,348 epoch 1 - iter 657/738 - loss 0.60420483 - time (sec): 44.44 - samples/sec: 3343.22 - lr: 0.000044 - momentum: 0.000000
2023-10-13 16:08:21,114 epoch 1 - iter 730/738 - loss 0.56171741 - time (sec): 49.21 - samples/sec: 3351.84 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:08:21,588 ----------------------------------------------------------------------------------------------------
2023-10-13 16:08:21,588 EPOCH 1 done: loss 0.5581 - lr: 0.000049
2023-10-13 16:08:27,352 DEV : loss 0.14290109276771545 - f1-score (micro avg) 0.7015
2023-10-13 16:08:27,384 saving best model
2023-10-13 16:08:27,872 ----------------------------------------------------------------------------------------------------
2023-10-13 16:08:32,167 epoch 2 - iter 73/738 - loss 0.12716274 - time (sec): 4.29 - samples/sec: 3521.57 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:08:36,924 epoch 2 - iter 146/738 - loss 0.13405283 - time (sec): 9.05 - samples/sec: 3459.97 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:08:41,641 epoch 2 - iter 219/738 - loss 0.13759252 - time (sec): 13.77 - samples/sec: 3447.39 - lr: 0.000048 - momentum: 0.000000
2023-10-13 16:08:47,028 epoch 2 - iter 292/738 - loss 0.13374106 - time (sec): 19.15 - samples/sec: 3295.50 - lr: 0.000048 - momentum: 0.000000
2023-10-13 16:08:51,691 epoch 2 - iter 365/738 - loss 0.13472976 - time (sec): 23.82 - samples/sec: 3300.76 - lr: 0.000047 - momentum: 0.000000
2023-10-13 16:08:56,701 epoch 2 - iter 438/738 - loss 0.13143665 - time (sec): 28.83 - samples/sec: 3311.21 - lr: 0.000047 - momentum: 0.000000
2023-10-13 16:09:02,238 epoch 2 - iter 511/738 - loss 0.12819648 - time (sec): 34.37 - samples/sec: 3312.44 - lr: 0.000046 - momentum: 0.000000
2023-10-13 16:09:07,051 epoch 2 - iter 584/738 - loss 0.12305117 - time (sec): 39.18 - samples/sec: 3321.56 - lr: 0.000046 - momentum: 0.000000
2023-10-13 16:09:12,066 epoch 2 - iter 657/738 - loss 0.12315101 - time (sec): 44.19 - samples/sec: 3333.00 - lr: 0.000045 - momentum: 0.000000
2023-10-13 16:09:17,367 epoch 2 - iter 730/738 - loss 0.12343880 - time (sec): 49.49 - samples/sec: 3328.11 - lr: 0.000045 - momentum: 0.000000
2023-10-13 16:09:17,866 ----------------------------------------------------------------------------------------------------
2023-10-13 16:09:17,866 EPOCH 2 done: loss 0.1233 - lr: 0.000045
2023-10-13 16:09:28,983 DEV : loss 0.10492947697639465 - f1-score (micro avg) 0.7561
2023-10-13 16:09:29,013 saving best model
2023-10-13 16:09:29,530 ----------------------------------------------------------------------------------------------------
2023-10-13 16:09:34,249 epoch 3 - iter 73/738 - loss 0.05692245 - time (sec): 4.72 - samples/sec: 3271.85 - lr: 0.000044 - momentum: 0.000000
2023-10-13 16:09:39,076 epoch 3 - iter 146/738 - loss 0.06379593 - time (sec): 9.54 - samples/sec: 3360.48 - lr: 0.000043 - momentum: 0.000000
2023-10-13 16:09:43,953 epoch 3 - iter 219/738 - loss 0.07184299 - time (sec): 14.42 - samples/sec: 3352.39 - lr: 0.000043 - momentum: 0.000000
2023-10-13 16:09:48,396 epoch 3 - iter 292/738 - loss 0.07056935 - time (sec): 18.86 - samples/sec: 3355.15 - lr: 0.000042 - momentum: 0.000000
2023-10-13 16:09:53,905 epoch 3 - iter 365/738 - loss 0.06945011 - time (sec): 24.37 - samples/sec: 3322.09 - lr: 0.000042 - momentum: 0.000000
2023-10-13 16:09:59,137 epoch 3 - iter 438/738 - loss 0.06940866 - time (sec): 29.60 - samples/sec: 3357.23 - lr: 0.000041 - momentum: 0.000000
2023-10-13 16:10:04,013 epoch 3 - iter 511/738 - loss 0.06696551 - time (sec): 34.48 - samples/sec: 3346.45 - lr: 0.000041 - momentum: 0.000000
2023-10-13 16:10:08,915 epoch 3 - iter 584/738 - loss 0.06853341 - time (sec): 39.38 - samples/sec: 3359.17 - lr: 0.000040 - momentum: 0.000000
2023-10-13 16:10:14,120 epoch 3 - iter 657/738 - loss 0.06811819 - time (sec): 44.59 - samples/sec: 3350.17 - lr: 0.000040 - momentum: 0.000000
2023-10-13 16:10:18,824 epoch 3 - iter 730/738 - loss 0.06952438 - time (sec): 49.29 - samples/sec: 3343.34 - lr: 0.000039 - momentum: 0.000000
2023-10-13 16:10:19,287 ----------------------------------------------------------------------------------------------------
2023-10-13 16:10:19,287 EPOCH 3 done: loss 0.0696 - lr: 0.000039
2023-10-13 16:10:30,446 DEV : loss 0.12435939162969589 - f1-score (micro avg) 0.7908
2023-10-13 16:10:30,475 saving best model
2023-10-13 16:10:31,010 ----------------------------------------------------------------------------------------------------
2023-10-13 16:10:35,777 epoch 4 - iter 73/738 - loss 0.04217679 - time (sec): 4.76 - samples/sec: 3176.63 - lr: 0.000038 - momentum: 0.000000
2023-10-13 16:10:40,427 epoch 4 - iter 146/738 - loss 0.04322531 - time (sec): 9.41 - samples/sec: 3274.46 - lr: 0.000038 - momentum: 0.000000
2023-10-13 16:10:45,415 epoch 4 - iter 219/738 - loss 0.04486914 - time (sec): 14.40 - samples/sec: 3287.51 - lr: 0.000037 - momentum: 0.000000
2023-10-13 16:10:50,035 epoch 4 - iter 292/738 - loss 0.04509778 - time (sec): 19.02 - samples/sec: 3300.10 - lr: 0.000037 - momentum: 0.000000
2023-10-13 16:10:55,060 epoch 4 - iter 365/738 - loss 0.04617630 - time (sec): 24.05 - samples/sec: 3306.06 - lr: 0.000036 - momentum: 0.000000
2023-10-13 16:11:00,379 epoch 4 - iter 438/738 - loss 0.04564721 - time (sec): 29.37 - samples/sec: 3297.33 - lr: 0.000036 - momentum: 0.000000
2023-10-13 16:11:06,000 epoch 4 - iter 511/738 - loss 0.04566072 - time (sec): 34.99 - samples/sec: 3301.63 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:11:10,755 epoch 4 - iter 584/738 - loss 0.04718094 - time (sec): 39.74 - samples/sec: 3320.12 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:11:15,857 epoch 4 - iter 657/738 - loss 0.04928165 - time (sec): 44.84 - samples/sec: 3317.33 - lr: 0.000034 - momentum: 0.000000
2023-10-13 16:11:20,524 epoch 4 - iter 730/738 - loss 0.05030015 - time (sec): 49.51 - samples/sec: 3328.40 - lr: 0.000033 - momentum: 0.000000
2023-10-13 16:11:21,012 ----------------------------------------------------------------------------------------------------
2023-10-13 16:11:21,012 EPOCH 4 done: loss 0.0504 - lr: 0.000033
2023-10-13 16:11:32,198 DEV : loss 0.16232198476791382 - f1-score (micro avg) 0.791
2023-10-13 16:11:32,230 saving best model
2023-10-13 16:11:32,738 ----------------------------------------------------------------------------------------------------
2023-10-13 16:11:37,904 epoch 5 - iter 73/738 - loss 0.04547528 - time (sec): 5.16 - samples/sec: 3187.03 - lr: 0.000033 - momentum: 0.000000
2023-10-13 16:11:42,931 epoch 5 - iter 146/738 - loss 0.03745766 - time (sec): 10.19 - samples/sec: 3241.04 - lr: 0.000032 - momentum: 0.000000
2023-10-13 16:11:47,719 epoch 5 - iter 219/738 - loss 0.04018018 - time (sec): 14.98 - samples/sec: 3334.49 - lr: 0.000032 - momentum: 0.000000
2023-10-13 16:11:52,545 epoch 5 - iter 292/738 - loss 0.03642262 - time (sec): 19.80 - samples/sec: 3334.28 - lr: 0.000031 - momentum: 0.000000
2023-10-13 16:11:57,388 epoch 5 - iter 365/738 - loss 0.03525492 - time (sec): 24.65 - samples/sec: 3330.64 - lr: 0.000031 - momentum: 0.000000
2023-10-13 16:12:02,106 epoch 5 - iter 438/738 - loss 0.03546021 - time (sec): 29.36 - samples/sec: 3321.27 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:12:07,137 epoch 5 - iter 511/738 - loss 0.03493695 - time (sec): 34.39 - samples/sec: 3306.29 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:12:12,773 epoch 5 - iter 584/738 - loss 0.03563392 - time (sec): 40.03 - samples/sec: 3286.64 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:12:18,509 epoch 5 - iter 657/738 - loss 0.03533050 - time (sec): 45.77 - samples/sec: 3273.26 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:12:23,151 epoch 5 - iter 730/738 - loss 0.03550563 - time (sec): 50.41 - samples/sec: 3266.32 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:12:23,713 ----------------------------------------------------------------------------------------------------
2023-10-13 16:12:23,713 EPOCH 5 done: loss 0.0356 - lr: 0.000028
2023-10-13 16:12:34,872 DEV : loss 0.16207054257392883 - f1-score (micro avg) 0.8011
2023-10-13 16:12:34,902 saving best model
2023-10-13 16:12:35,395 ----------------------------------------------------------------------------------------------------
2023-10-13 16:12:39,937 epoch 6 - iter 73/738 - loss 0.03186713 - time (sec): 4.54 - samples/sec: 3270.44 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:12:45,514 epoch 6 - iter 146/738 - loss 0.02345574 - time (sec): 10.11 - samples/sec: 3355.66 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:12:50,577 epoch 6 - iter 219/738 - loss 0.02552735 - time (sec): 15.18 - samples/sec: 3347.83 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:12:55,260 epoch 6 - iter 292/738 - loss 0.02690248 - time (sec): 19.86 - samples/sec: 3336.26 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:13:01,206 epoch 6 - iter 365/738 - loss 0.02701658 - time (sec): 25.81 - samples/sec: 3272.23 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:13:06,100 epoch 6 - iter 438/738 - loss 0.02755494 - time (sec): 30.70 - samples/sec: 3300.44 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:13:10,632 epoch 6 - iter 511/738 - loss 0.02727897 - time (sec): 35.23 - samples/sec: 3311.23 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:13:15,741 epoch 6 - iter 584/738 - loss 0.02651565 - time (sec): 40.34 - samples/sec: 3301.92 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:13:20,645 epoch 6 - iter 657/738 - loss 0.02669398 - time (sec): 45.25 - samples/sec: 3290.08 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:13:25,644 epoch 6 - iter 730/738 - loss 0.02762465 - time (sec): 50.24 - samples/sec: 3280.08 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:13:26,124 ----------------------------------------------------------------------------------------------------
2023-10-13 16:13:26,124 EPOCH 6 done: loss 0.0276 - lr: 0.000022
2023-10-13 16:13:37,376 DEV : loss 0.17623859643936157 - f1-score (micro avg) 0.813
2023-10-13 16:13:37,409 saving best model
2023-10-13 16:13:37,915 ----------------------------------------------------------------------------------------------------
2023-10-13 16:13:43,425 epoch 7 - iter 73/738 - loss 0.02042456 - time (sec): 5.51 - samples/sec: 3090.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:13:47,723 epoch 7 - iter 146/738 - loss 0.01755039 - time (sec): 9.81 - samples/sec: 3246.41 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:13:53,127 epoch 7 - iter 219/738 - loss 0.01705598 - time (sec): 15.21 - samples/sec: 3310.24 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:13:58,575 epoch 7 - iter 292/738 - loss 0.01623354 - time (sec): 20.66 - samples/sec: 3333.59 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:14:03,036 epoch 7 - iter 365/738 - loss 0.01564171 - time (sec): 25.12 - samples/sec: 3331.51 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:14:07,628 epoch 7 - iter 438/738 - loss 0.01656402 - time (sec): 29.71 - samples/sec: 3334.05 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:14:12,209 epoch 7 - iter 511/738 - loss 0.01804093 - time (sec): 34.29 - samples/sec: 3354.90 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:14:16,920 epoch 7 - iter 584/738 - loss 0.01843500 - time (sec): 39.00 - samples/sec: 3345.53 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:14:21,848 epoch 7 - iter 657/738 - loss 0.01777761 - time (sec): 43.93 - samples/sec: 3333.64 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:14:27,303 epoch 7 - iter 730/738 - loss 0.01822353 - time (sec): 49.39 - samples/sec: 3337.85 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:14:27,768 ----------------------------------------------------------------------------------------------------
2023-10-13 16:14:27,768 EPOCH 7 done: loss 0.0182 - lr: 0.000017
2023-10-13 16:14:38,887 DEV : loss 0.18489772081375122 - f1-score (micro avg) 0.8296
2023-10-13 16:14:38,917 saving best model
2023-10-13 16:14:39,522 ----------------------------------------------------------------------------------------------------
2023-10-13 16:14:44,322 epoch 8 - iter 73/738 - loss 0.02158221 - time (sec): 4.80 - samples/sec: 3479.64 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:14:49,508 epoch 8 - iter 146/738 - loss 0.01696772 - time (sec): 9.98 - samples/sec: 3370.64 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:14:55,059 epoch 8 - iter 219/738 - loss 0.01617594 - time (sec): 15.53 - samples/sec: 3401.07 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:14:59,771 epoch 8 - iter 292/738 - loss 0.01777930 - time (sec): 20.24 - samples/sec: 3343.21 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:15:04,316 epoch 8 - iter 365/738 - loss 0.01630697 - time (sec): 24.79 - samples/sec: 3323.60 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:15:09,176 epoch 8 - iter 438/738 - loss 0.01597156 - time (sec): 29.65 - samples/sec: 3324.09 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:15:14,012 epoch 8 - iter 511/738 - loss 0.01492589 - time (sec): 34.49 - samples/sec: 3327.04 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:15:18,408 epoch 8 - iter 584/738 - loss 0.01525583 - time (sec): 38.88 - samples/sec: 3326.93 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:15:23,227 epoch 8 - iter 657/738 - loss 0.01427343 - time (sec): 43.70 - samples/sec: 3323.76 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:15:28,712 epoch 8 - iter 730/738 - loss 0.01360292 - time (sec): 49.18 - samples/sec: 3348.93 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:15:29,203 ----------------------------------------------------------------------------------------------------
2023-10-13 16:15:29,203 EPOCH 8 done: loss 0.0135 - lr: 0.000011
2023-10-13 16:15:40,324 DEV : loss 0.20899920165538788 - f1-score (micro avg) 0.825
2023-10-13 16:15:40,355 ----------------------------------------------------------------------------------------------------
2023-10-13 16:15:45,276 epoch 9 - iter 73/738 - loss 0.00737419 - time (sec): 4.92 - samples/sec: 3151.24 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:15:50,322 epoch 9 - iter 146/738 - loss 0.00639227 - time (sec): 9.97 - samples/sec: 3242.90 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:15:54,891 epoch 9 - iter 219/738 - loss 0.00609163 - time (sec): 14.54 - samples/sec: 3323.28 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:15:59,713 epoch 9 - iter 292/738 - loss 0.00710336 - time (sec): 19.36 - samples/sec: 3348.37 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:16:05,038 epoch 9 - iter 365/738 - loss 0.00870007 - time (sec): 24.68 - samples/sec: 3353.92 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:16:09,730 epoch 9 - iter 438/738 - loss 0.00815323 - time (sec): 29.37 - samples/sec: 3345.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:16:14,960 epoch 9 - iter 511/738 - loss 0.00770538 - time (sec): 34.60 - samples/sec: 3328.70 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:16:19,571 epoch 9 - iter 584/738 - loss 0.00789209 - time (sec): 39.21 - samples/sec: 3320.50 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:16:24,295 epoch 9 - iter 657/738 - loss 0.00748613 - time (sec): 43.94 - samples/sec: 3337.76 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:16:29,662 epoch 9 - iter 730/738 - loss 0.00819185 - time (sec): 49.31 - samples/sec: 3338.75 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:16:30,264 ----------------------------------------------------------------------------------------------------
2023-10-13 16:16:30,264 EPOCH 9 done: loss 0.0083 - lr: 0.000006
2023-10-13 16:16:41,419 DEV : loss 0.20789538323879242 - f1-score (micro avg) 0.8301
2023-10-13 16:16:41,457 saving best model
2023-10-13 16:16:41,968 ----------------------------------------------------------------------------------------------------
2023-10-13 16:16:47,776 epoch 10 - iter 73/738 - loss 0.00953757 - time (sec): 5.80 - samples/sec: 3035.36 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:16:52,458 epoch 10 - iter 146/738 - loss 0.00607430 - time (sec): 10.48 - samples/sec: 3196.87 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:16:56,727 epoch 10 - iter 219/738 - loss 0.00711998 - time (sec): 14.75 - samples/sec: 3318.90 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:17:01,617 epoch 10 - iter 292/738 - loss 0.00638173 - time (sec): 19.64 - samples/sec: 3324.20 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:17:06,558 epoch 10 - iter 365/738 - loss 0.00593203 - time (sec): 24.59 - samples/sec: 3308.48 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:17:12,087 epoch 10 - iter 438/738 - loss 0.00538934 - time (sec): 30.11 - samples/sec: 3322.99 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:17:16,645 epoch 10 - iter 511/738 - loss 0.00527968 - time (sec): 34.67 - samples/sec: 3309.72 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:17:22,018 epoch 10 - iter 584/738 - loss 0.00510774 - time (sec): 40.04 - samples/sec: 3290.48 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:17:27,000 epoch 10 - iter 657/738 - loss 0.00530120 - time (sec): 45.03 - samples/sec: 3292.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:17:32,035 epoch 10 - iter 730/738 - loss 0.00491710 - time (sec): 50.06 - samples/sec: 3294.31 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:17:32,457 ----------------------------------------------------------------------------------------------------
2023-10-13 16:17:32,458 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-13 16:17:43,618 DEV : loss 0.21186788380146027 - f1-score (micro avg) 0.8321
2023-10-13 16:17:43,651 saving best model
2023-10-13 16:17:44,637 ----------------------------------------------------------------------------------------------------
2023-10-13 16:17:44,638 Loading model from best epoch ...
2023-10-13 16:17:46,067 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 16:17:51,903
Results:
- F-score (micro) 0.7943
- F-score (macro) 0.6995
- Accuracy 0.6824
By class:
              precision    recall  f1-score   support
         loc     0.8442    0.8776    0.8606       858
        pers     0.7558    0.7952    0.7750       537
         org     0.5603    0.5985    0.5788       132
        time     0.5538    0.6667    0.6050        54
        prod     0.7222    0.6393    0.6783        61
   micro avg     0.7769    0.8124    0.7943      1642
   macro avg     0.6873    0.7155    0.6995      1642
weighted avg     0.7784    0.8124    0.7947      1642
2023-10-13 16:17:51,904 ----------------------------------------------------------------------------------------------------