|
2023-10-13 11:37:23,100 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 11:37:23,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-13 11:37:23,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 Train: 3575 sentences |
|
2023-10-13 11:37:23,101 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 11:37:23,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 Training Params: |
|
2023-10-13 11:37:23,101 - learning_rate: "3e-05" |
|
2023-10-13 11:37:23,101 - mini_batch_size: "8" |
|
2023-10-13 11:37:23,101 - max_epochs: "10" |
|
2023-10-13 11:37:23,101 - shuffle: "True" |
|
2023-10-13 11:37:23,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 Plugins: |
|
2023-10-13 11:37:23,101 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 11:37:23,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 11:37:23,101 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 11:37:23,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,101 Computation: |
|
2023-10-13 11:37:23,101 - compute on device: cuda:0 |
|
2023-10-13 11:37:23,102 - embedding storage: none |
|
2023-10-13 11:37:23,102 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,102 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-13 11:37:23,102 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:23,102 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:27,202 epoch 1 - iter 44/447 - loss 3.14196282 - time (sec): 4.10 - samples/sec: 2319.10 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 11:37:29,895 epoch 1 - iter 88/447 - loss 2.47803589 - time (sec): 6.79 - samples/sec: 2559.07 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 11:37:32,432 epoch 1 - iter 132/447 - loss 1.88263805 - time (sec): 9.33 - samples/sec: 2683.04 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 11:37:35,166 epoch 1 - iter 176/447 - loss 1.52213191 - time (sec): 12.06 - samples/sec: 2772.84 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 11:37:37,858 epoch 1 - iter 220/447 - loss 1.30514887 - time (sec): 14.75 - samples/sec: 2816.89 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 11:37:40,687 epoch 1 - iter 264/447 - loss 1.13508816 - time (sec): 17.58 - samples/sec: 2865.93 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 11:37:43,389 epoch 1 - iter 308/447 - loss 1.01636293 - time (sec): 20.29 - samples/sec: 2907.63 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 11:37:46,313 epoch 1 - iter 352/447 - loss 0.91617799 - time (sec): 23.21 - samples/sec: 2931.05 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 11:37:49,109 epoch 1 - iter 396/447 - loss 0.84517290 - time (sec): 26.01 - samples/sec: 2934.25 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 11:37:52,076 epoch 1 - iter 440/447 - loss 0.78671462 - time (sec): 28.97 - samples/sec: 2948.06 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 11:37:52,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:37:52,495 EPOCH 1 done: loss 0.7797 - lr: 0.000029 |
|
2023-10-13 11:37:57,330 DEV : loss 0.18337313830852509 - f1-score (micro avg) 0.6258 |
|
2023-10-13 11:37:57,355 saving best model |
|
2023-10-13 11:37:57,656 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:38:00,311 epoch 2 - iter 44/447 - loss 0.18448565 - time (sec): 2.65 - samples/sec: 3222.29 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 11:38:03,060 epoch 2 - iter 88/447 - loss 0.20795048 - time (sec): 5.40 - samples/sec: 3154.56 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 11:38:05,773 epoch 2 - iter 132/447 - loss 0.19839698 - time (sec): 8.12 - samples/sec: 3166.50 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 11:38:08,634 epoch 2 - iter 176/447 - loss 0.19080864 - time (sec): 10.98 - samples/sec: 3105.64 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 11:38:11,210 epoch 2 - iter 220/447 - loss 0.18472852 - time (sec): 13.55 - samples/sec: 3096.46 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 11:38:14,043 epoch 2 - iter 264/447 - loss 0.17431429 - time (sec): 16.39 - samples/sec: 3085.73 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 11:38:16,896 epoch 2 - iter 308/447 - loss 0.17213378 - time (sec): 19.24 - samples/sec: 3110.09 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 11:38:19,523 epoch 2 - iter 352/447 - loss 0.17060590 - time (sec): 21.86 - samples/sec: 3102.96 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 11:38:22,075 epoch 2 - iter 396/447 - loss 0.16910987 - time (sec): 24.42 - samples/sec: 3108.60 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 11:38:25,003 epoch 2 - iter 440/447 - loss 0.16480794 - time (sec): 27.34 - samples/sec: 3120.88 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 11:38:25,414 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:38:25,415 EPOCH 2 done: loss 0.1637 - lr: 0.000027 |
|
2023-10-13 11:38:33,914 DEV : loss 0.1275636851787567 - f1-score (micro avg) 0.6914 |
|
2023-10-13 11:38:33,954 saving best model |
|
2023-10-13 11:38:34,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:38:37,334 epoch 3 - iter 44/447 - loss 0.09103184 - time (sec): 2.86 - samples/sec: 2697.32 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 11:38:40,169 epoch 3 - iter 88/447 - loss 0.08197685 - time (sec): 5.69 - samples/sec: 2802.74 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 11:38:43,026 epoch 3 - iter 132/447 - loss 0.08754244 - time (sec): 8.55 - samples/sec: 2808.32 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 11:38:46,171 epoch 3 - iter 176/447 - loss 0.08248814 - time (sec): 11.70 - samples/sec: 2810.03 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 11:38:49,406 epoch 3 - iter 220/447 - loss 0.08252264 - time (sec): 14.93 - samples/sec: 2808.58 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 11:38:52,266 epoch 3 - iter 264/447 - loss 0.07964287 - time (sec): 17.79 - samples/sec: 2840.47 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 11:38:55,261 epoch 3 - iter 308/447 - loss 0.08228040 - time (sec): 20.79 - samples/sec: 2845.47 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 11:38:58,300 epoch 3 - iter 352/447 - loss 0.08248586 - time (sec): 23.83 - samples/sec: 2845.71 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 11:39:01,073 epoch 3 - iter 396/447 - loss 0.08367732 - time (sec): 26.60 - samples/sec: 2862.02 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 11:39:04,286 epoch 3 - iter 440/447 - loss 0.08345599 - time (sec): 29.81 - samples/sec: 2866.59 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 11:39:04,678 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:39:04,679 EPOCH 3 done: loss 0.0835 - lr: 0.000023 |
|
2023-10-13 11:39:13,300 DEV : loss 0.12531331181526184 - f1-score (micro avg) 0.736 |
|
2023-10-13 11:39:13,334 saving best model |
|
2023-10-13 11:39:13,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:39:16,553 epoch 4 - iter 44/447 - loss 0.06081512 - time (sec): 2.73 - samples/sec: 3280.70 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 11:39:19,128 epoch 4 - iter 88/447 - loss 0.06130093 - time (sec): 5.31 - samples/sec: 3203.74 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 11:39:22,055 epoch 4 - iter 132/447 - loss 0.05727770 - time (sec): 8.23 - samples/sec: 3163.36 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 11:39:25,097 epoch 4 - iter 176/447 - loss 0.05437708 - time (sec): 11.28 - samples/sec: 3164.29 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 11:39:27,953 epoch 4 - iter 220/447 - loss 0.05094653 - time (sec): 14.13 - samples/sec: 3138.28 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 11:39:30,798 epoch 4 - iter 264/447 - loss 0.05163494 - time (sec): 16.98 - samples/sec: 3120.94 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 11:39:33,386 epoch 4 - iter 308/447 - loss 0.05153137 - time (sec): 19.57 - samples/sec: 3133.65 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 11:39:36,130 epoch 4 - iter 352/447 - loss 0.05076236 - time (sec): 22.31 - samples/sec: 3123.62 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 11:39:38,564 epoch 4 - iter 396/447 - loss 0.04840169 - time (sec): 24.74 - samples/sec: 3106.57 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 11:39:41,393 epoch 4 - iter 440/447 - loss 0.04853069 - time (sec): 27.57 - samples/sec: 3096.09 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 11:39:41,794 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:39:41,794 EPOCH 4 done: loss 0.0484 - lr: 0.000020 |
|
2023-10-13 11:39:50,889 DEV : loss 0.1411086916923523 - f1-score (micro avg) 0.7523 |
|
2023-10-13 11:39:50,925 saving best model |
|
2023-10-13 11:39:51,411 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:39:54,906 epoch 5 - iter 44/447 - loss 0.03860730 - time (sec): 3.49 - samples/sec: 2760.43 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 11:39:57,725 epoch 5 - iter 88/447 - loss 0.03556237 - time (sec): 6.31 - samples/sec: 2793.51 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 11:40:00,832 epoch 5 - iter 132/447 - loss 0.03253877 - time (sec): 9.42 - samples/sec: 2787.23 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 11:40:03,721 epoch 5 - iter 176/447 - loss 0.03527827 - time (sec): 12.31 - samples/sec: 2792.55 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 11:40:06,971 epoch 5 - iter 220/447 - loss 0.03375370 - time (sec): 15.56 - samples/sec: 2788.49 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 11:40:09,908 epoch 5 - iter 264/447 - loss 0.03389079 - time (sec): 18.50 - samples/sec: 2812.29 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 11:40:12,805 epoch 5 - iter 308/447 - loss 0.03159410 - time (sec): 21.39 - samples/sec: 2808.51 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 11:40:15,843 epoch 5 - iter 352/447 - loss 0.03133658 - time (sec): 24.43 - samples/sec: 2814.90 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 11:40:18,798 epoch 5 - iter 396/447 - loss 0.03185176 - time (sec): 27.38 - samples/sec: 2799.64 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 11:40:21,613 epoch 5 - iter 440/447 - loss 0.03293362 - time (sec): 30.20 - samples/sec: 2823.75 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 11:40:22,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:40:22,058 EPOCH 5 done: loss 0.0331 - lr: 0.000017 |
|
2023-10-13 11:40:30,534 DEV : loss 0.17277590930461884 - f1-score (micro avg) 0.775 |
|
2023-10-13 11:40:30,562 saving best model |
|
2023-10-13 11:40:31,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:40:34,245 epoch 6 - iter 44/447 - loss 0.01641274 - time (sec): 3.09 - samples/sec: 2782.37 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 11:40:36,970 epoch 6 - iter 88/447 - loss 0.02088310 - time (sec): 5.81 - samples/sec: 2784.85 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 11:40:40,026 epoch 6 - iter 132/447 - loss 0.02059010 - time (sec): 8.87 - samples/sec: 2823.72 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 11:40:43,022 epoch 6 - iter 176/447 - loss 0.02145822 - time (sec): 11.87 - samples/sec: 2862.37 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 11:40:45,676 epoch 6 - iter 220/447 - loss 0.02167903 - time (sec): 14.52 - samples/sec: 2861.66 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 11:40:48,507 epoch 6 - iter 264/447 - loss 0.02226561 - time (sec): 17.35 - samples/sec: 2858.77 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 11:40:51,255 epoch 6 - iter 308/447 - loss 0.02320899 - time (sec): 20.10 - samples/sec: 2855.54 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 11:40:54,001 epoch 6 - iter 352/447 - loss 0.02387174 - time (sec): 22.84 - samples/sec: 2888.48 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 11:40:57,475 epoch 6 - iter 396/447 - loss 0.02377590 - time (sec): 26.32 - samples/sec: 2894.39 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 11:41:00,559 epoch 6 - iter 440/447 - loss 0.02324946 - time (sec): 29.40 - samples/sec: 2898.65 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 11:41:00,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:41:00,995 EPOCH 6 done: loss 0.0233 - lr: 0.000013 |
|
2023-10-13 11:41:09,443 DEV : loss 0.18894143402576447 - f1-score (micro avg) 0.7728 |
|
2023-10-13 11:41:09,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:41:12,375 epoch 7 - iter 44/447 - loss 0.02022617 - time (sec): 2.90 - samples/sec: 3011.72 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 11:41:15,181 epoch 7 - iter 88/447 - loss 0.01788001 - time (sec): 5.71 - samples/sec: 2957.18 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 11:41:18,744 epoch 7 - iter 132/447 - loss 0.01554472 - time (sec): 9.27 - samples/sec: 2910.05 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 11:41:21,757 epoch 7 - iter 176/447 - loss 0.01521381 - time (sec): 12.28 - samples/sec: 2875.62 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 11:41:24,670 epoch 7 - iter 220/447 - loss 0.01618997 - time (sec): 15.20 - samples/sec: 2899.60 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 11:41:27,422 epoch 7 - iter 264/447 - loss 0.01631437 - time (sec): 17.95 - samples/sec: 2899.23 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 11:41:30,376 epoch 7 - iter 308/447 - loss 0.01443128 - time (sec): 20.90 - samples/sec: 2878.30 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 11:41:33,306 epoch 7 - iter 352/447 - loss 0.01478556 - time (sec): 23.83 - samples/sec: 2873.89 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 11:41:36,079 epoch 7 - iter 396/447 - loss 0.01588076 - time (sec): 26.61 - samples/sec: 2862.24 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 11:41:38,818 epoch 7 - iter 440/447 - loss 0.01554793 - time (sec): 29.35 - samples/sec: 2872.11 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 11:41:39,547 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:41:39,548 EPOCH 7 done: loss 0.0151 - lr: 0.000010 |
|
2023-10-13 11:41:48,012 DEV : loss 0.20179197192192078 - f1-score (micro avg) 0.7745 |
|
2023-10-13 11:41:48,038 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:41:51,122 epoch 8 - iter 44/447 - loss 0.01174812 - time (sec): 3.08 - samples/sec: 2779.76 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 11:41:54,327 epoch 8 - iter 88/447 - loss 0.01125552 - time (sec): 6.29 - samples/sec: 2797.47 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 11:41:57,342 epoch 8 - iter 132/447 - loss 0.01066657 - time (sec): 9.30 - samples/sec: 2866.41 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 11:42:00,586 epoch 8 - iter 176/447 - loss 0.00942978 - time (sec): 12.55 - samples/sec: 2878.82 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 11:42:03,430 epoch 8 - iter 220/447 - loss 0.01169514 - time (sec): 15.39 - samples/sec: 2852.35 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 11:42:06,472 epoch 8 - iter 264/447 - loss 0.01206510 - time (sec): 18.43 - samples/sec: 2820.51 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 11:42:09,416 epoch 8 - iter 308/447 - loss 0.01170503 - time (sec): 21.38 - samples/sec: 2850.99 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 11:42:12,188 epoch 8 - iter 352/447 - loss 0.01173590 - time (sec): 24.15 - samples/sec: 2867.64 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 11:42:15,076 epoch 8 - iter 396/447 - loss 0.01118873 - time (sec): 27.04 - samples/sec: 2862.49 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 11:42:17,941 epoch 8 - iter 440/447 - loss 0.01135881 - time (sec): 29.90 - samples/sec: 2852.70 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 11:42:18,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:42:18,361 EPOCH 8 done: loss 0.0112 - lr: 0.000007 |
|
2023-10-13 11:42:26,959 DEV : loss 0.20578144490718842 - f1-score (micro avg) 0.7819 |
|
2023-10-13 11:42:26,984 saving best model |
|
2023-10-13 11:42:27,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:42:30,327 epoch 9 - iter 44/447 - loss 0.01161954 - time (sec): 2.87 - samples/sec: 2837.49 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 11:42:33,371 epoch 9 - iter 88/447 - loss 0.00638656 - time (sec): 5.92 - samples/sec: 2937.51 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 11:42:36,454 epoch 9 - iter 132/447 - loss 0.00666210 - time (sec): 9.00 - samples/sec: 2858.86 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 11:42:39,560 epoch 9 - iter 176/447 - loss 0.00568668 - time (sec): 12.10 - samples/sec: 2879.22 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 11:42:42,889 epoch 9 - iter 220/447 - loss 0.00591880 - time (sec): 15.43 - samples/sec: 2829.31 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 11:42:45,648 epoch 9 - iter 264/447 - loss 0.00762558 - time (sec): 18.19 - samples/sec: 2851.85 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 11:42:48,749 epoch 9 - iter 308/447 - loss 0.00714813 - time (sec): 21.29 - samples/sec: 2886.12 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 11:42:51,467 epoch 9 - iter 352/447 - loss 0.00676372 - time (sec): 24.01 - samples/sec: 2893.55 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 11:42:54,092 epoch 9 - iter 396/447 - loss 0.00645925 - time (sec): 26.64 - samples/sec: 2905.73 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 11:42:56,930 epoch 9 - iter 440/447 - loss 0.00653268 - time (sec): 29.47 - samples/sec: 2894.96 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 11:42:57,338 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:42:57,338 EPOCH 9 done: loss 0.0066 - lr: 0.000003 |
|
2023-10-13 11:43:05,394 DEV : loss 0.22298528254032135 - f1-score (micro avg) 0.7826 |
|
2023-10-13 11:43:05,420 saving best model |
|
2023-10-13 11:43:05,878 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:43:08,761 epoch 10 - iter 44/447 - loss 0.00685104 - time (sec): 2.88 - samples/sec: 3017.48 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 11:43:11,462 epoch 10 - iter 88/447 - loss 0.00441715 - time (sec): 5.58 - samples/sec: 2963.15 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 11:43:14,101 epoch 10 - iter 132/447 - loss 0.00480244 - time (sec): 8.22 - samples/sec: 3045.37 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 11:43:16,895 epoch 10 - iter 176/447 - loss 0.00463435 - time (sec): 11.01 - samples/sec: 3056.83 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 11:43:19,970 epoch 10 - iter 220/447 - loss 0.00571713 - time (sec): 14.09 - samples/sec: 3038.35 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 11:43:23,223 epoch 10 - iter 264/447 - loss 0.00547250 - time (sec): 17.34 - samples/sec: 2971.42 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 11:43:26,251 epoch 10 - iter 308/447 - loss 0.00523613 - time (sec): 20.37 - samples/sec: 2963.30 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 11:43:28,858 epoch 10 - iter 352/447 - loss 0.00531243 - time (sec): 22.98 - samples/sec: 2974.47 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 11:43:31,533 epoch 10 - iter 396/447 - loss 0.00545590 - time (sec): 25.65 - samples/sec: 2979.26 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 11:43:34,543 epoch 10 - iter 440/447 - loss 0.00532315 - time (sec): 28.66 - samples/sec: 2963.33 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 11:43:35,033 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:43:35,034 EPOCH 10 done: loss 0.0052 - lr: 0.000000 |
|
2023-10-13 11:43:43,137 DEV : loss 0.2202497124671936 - f1-score (micro avg) 0.7904 |
|
2023-10-13 11:43:43,163 saving best model |
|
2023-10-13 11:43:44,012 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:43:44,013 Loading model from best epoch ... |
|
2023-10-13 11:43:45,811 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-13 11:43:50,442 |
|
Results: |
|
- F-score (micro) 0.7564 |
|
- F-score (macro) 0.6816 |
|
- Accuracy 0.6279 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         loc     0.8413    0.8540    0.8476       596 |
|
        pers     0.6805    0.7868    0.7298       333 |
|
         org     0.4885    0.4848    0.4867       132 |
|
        prod     0.6852    0.5606    0.6167        66 |
|
        time     0.7200    0.7347    0.7273        49 |
|

|
   micro avg     0.7412    0.7721    0.7564      1176 |
|
   macro avg     0.6831    0.6842    0.6816      1176 |
|
weighted avg     0.7424    0.7721    0.7558      1176 |
|
|
|
2023-10-13 11:43:50,442 ---------------------------------------------------------------------------------------------------- |
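The micro and macro averages in the "By class" table can be cross-checked from the per-class rows. The sketch below is not Flair's evaluation code. It assumes the four-decimal scores are precise enough to recover integer span counts by rounding, and the names `by_class` and `reconstruct_counts` are made up for illustration.

```python
# Per-class rows copied from the "By class" table:
# (precision, recall, f1-score, support)
by_class = {
    "loc":  (0.8413, 0.8540, 0.8476, 596),
    "pers": (0.6805, 0.7868, 0.7298, 333),
    "org":  (0.4885, 0.4848, 0.4867, 132),
    "prod": (0.6852, 0.5606, 0.6167, 66),
    "time": (0.7200, 0.7347, 0.7273, 49),
}


def reconstruct_counts(precision, recall, support):
    """Recover integer counts from rounded scores (an assumption, see lead-in).

    recall    = true_pos / support   -> true_pos  = recall * support
    precision = true_pos / predicted -> predicted = true_pos / precision
    """
    true_pos = round(recall * support)
    predicted = round(true_pos / precision)
    return true_pos, predicted


total_tp = total_pred = total_gold = 0
for precision, recall, _f1, support in by_class.values():
    tp, pred = reconstruct_counts(precision, recall, support)
    total_tp += tp
    total_pred += pred
    total_gold += support

# Micro average: pool the counts over all classes, then compute the scores.
micro_precision = total_tp / total_pred
micro_recall = total_tp / total_gold
micro_f1 = 2 * total_tp / (total_pred + total_gold)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

print(f"micro P={micro_precision:.4f} R={micro_recall:.4f} F1={micro_f1:.4f}")
print(f"macro F1={macro_f1:.4f}")
```

Under these assumptions the recomputed values agree with the logged micro avg row (0.7412, 0.7721, 0.7564) and the logged macro F-score (0.6816). Plugging the rounded micro precision and recall straight into the F1 formula gives 0.7563 instead, which is why the sketch pools counts first.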
|
|