2023-10-13 12:16:20,073 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Train: 3575 sentences
2023-10-13 12:16:20,074 (train_with_dev=False, train_with_test=False)
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Training Params:
2023-10-13 12:16:20,074 - learning_rate: "5e-05"
2023-10-13 12:16:20,074 - mini_batch_size: "8"
2023-10-13 12:16:20,074 - max_epochs: "10"
2023-10-13 12:16:20,074 - shuffle: "True"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Plugins:
2023-10-13 12:16:20,074 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
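The LinearScheduler plugin explains the lr column in the iteration lines below: the learning rate ramps up linearly over the first 10% of all optimizer steps (here one full epoch, 447 of 4470 steps), then decays linearly to zero. A minimal sketch of that schedule, assuming this warmup-then-decay behavior; `linear_schedule_lr` is a hypothetical helper for illustration, not a Flair API:

```python
def linear_schedule_lr(step, base_lr=5e-5, total_steps=4470, warmup_fraction=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0.

    With max_epochs=10 and 447 mini-batches per epoch, total_steps = 4470 and
    warmup covers the first 447 steps (warmup_fraction = 0.1), as in this run.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: lr grows from 0 toward base_lr
        return base_lr * step / warmup_steps
    # Decay: lr shrinks linearly from base_lr to 0 at the final step
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This reproduces the logged values: step 44 gives about 0.000005 (epoch 1, iter 44), the end of epoch 1 reaches the peak of 5e-05, and the final step of epoch 10 lands at 0.000000.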
2023-10-13 12:16:20,074 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 12:16:20,074 - metric: "('micro avg', 'f1-score')"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Computation:
2023-10-13 12:16:20,075 - compute on device: cuda:0
2023-10-13 12:16:20,075 - embedding storage: none
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,075 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:22,769 epoch 1 - iter 44/447 - loss 3.14261033 - time (sec): 2.69 - samples/sec: 2960.33 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:16:25,692 epoch 1 - iter 88/447 - loss 2.14441076 - time (sec): 5.62 - samples/sec: 2837.51 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:16:28,648 epoch 1 - iter 132/447 - loss 1.53821589 - time (sec): 8.57 - samples/sec: 2881.13 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:16:31,310 epoch 1 - iter 176/447 - loss 1.26325007 - time (sec): 11.23 - samples/sec: 2904.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:16:34,078 epoch 1 - iter 220/447 - loss 1.06860500 - time (sec): 14.00 - samples/sec: 2946.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:16:37,553 epoch 1 - iter 264/447 - loss 0.91852698 - time (sec): 17.48 - samples/sec: 2953.39 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:16:40,411 epoch 1 - iter 308/447 - loss 0.83070003 - time (sec): 20.34 - samples/sec: 2931.86 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:16:43,058 epoch 1 - iter 352/447 - loss 0.75729611 - time (sec): 22.98 - samples/sec: 2960.94 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:16:46,142 epoch 1 - iter 396/447 - loss 0.70030509 - time (sec): 26.07 - samples/sec: 2932.61 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:16:49,210 epoch 1 - iter 440/447 - loss 0.65259845 - time (sec): 29.13 - samples/sec: 2905.36 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:16:49,782 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:49,783 EPOCH 1 done: loss 0.6424 - lr: 0.000049
2023-10-13 12:16:54,809 DEV : loss 0.17999280989170074 - f1-score (micro avg)  0.6119
2023-10-13 12:16:54,839 saving best model
2023-10-13 12:16:55,180 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:58,029 epoch 2 - iter 44/447 - loss 0.21885310 - time (sec): 2.85 - samples/sec: 2992.93 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:17:00,968 epoch 2 - iter 88/447 - loss 0.20348487 - time (sec): 5.79 - samples/sec: 2915.48 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:17:03,645 epoch 2 - iter 132/447 - loss 0.18779304 - time (sec): 8.46 - samples/sec: 2955.05 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:17:06,522 epoch 2 - iter 176/447 - loss 0.18362775 - time (sec): 11.34 - samples/sec: 2978.59 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:17:09,210 epoch 2 - iter 220/447 - loss 0.17259586 - time (sec): 14.03 - samples/sec: 2963.51 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:17:12,381 epoch 2 - iter 264/447 - loss 0.16827387 - time (sec): 17.20 - samples/sec: 2960.19 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:17:15,141 epoch 2 - iter 308/447 - loss 0.16382626 - time (sec): 19.96 - samples/sec: 2969.07 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:17:18,142 epoch 2 - iter 352/447 - loss 0.15749203 - time (sec): 22.96 - samples/sec: 2987.22 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:17:21,109 epoch 2 - iter 396/447 - loss 0.15568860 - time (sec): 25.93 - samples/sec: 2968.50 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:17:23,854 epoch 2 - iter 440/447 - loss 0.15491984 - time (sec): 28.67 - samples/sec: 2972.67 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:17:24,241 ----------------------------------------------------------------------------------------------------
2023-10-13 12:17:24,241 EPOCH 2 done: loss 0.1546 - lr: 0.000045
2023-10-13 12:17:33,041 DEV : loss 0.11931649595499039 - f1-score (micro avg)  0.7133
2023-10-13 12:17:33,072 saving best model
2023-10-13 12:17:33,543 ----------------------------------------------------------------------------------------------------
2023-10-13 12:17:36,515 epoch 3 - iter 44/447 - loss 0.09401755 - time (sec): 2.97 - samples/sec: 2867.56 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:17:39,572 epoch 3 - iter 88/447 - loss 0.08053615 - time (sec): 6.03 - samples/sec: 3004.31 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:17:42,521 epoch 3 - iter 132/447 - loss 0.08329820 - time (sec): 8.98 - samples/sec: 3034.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:17:45,376 epoch 3 - iter 176/447 - loss 0.07875894 - time (sec): 11.83 - samples/sec: 3054.35 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:17:48,289 epoch 3 - iter 220/447 - loss 0.08444205 - time (sec): 14.74 - samples/sec: 3052.92 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:17:50,914 epoch 3 - iter 264/447 - loss 0.08644774 - time (sec): 17.37 - samples/sec: 3028.81 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:17:53,528 epoch 3 - iter 308/447 - loss 0.08382388 - time (sec): 19.98 - samples/sec: 3043.28 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:17:56,182 epoch 3 - iter 352/447 - loss 0.08593653 - time (sec): 22.64 - samples/sec: 3040.32 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:17:59,167 epoch 3 - iter 396/447 - loss 0.08618714 - time (sec): 25.62 - samples/sec: 3009.13 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:18:02,080 epoch 3 - iter 440/447 - loss 0.08664836 - time (sec): 28.54 - samples/sec: 2988.13 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:18:02,548 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:02,548 EPOCH 3 done: loss 0.0863 - lr: 0.000039
2023-10-13 12:18:11,118 DEV : loss 0.1396493762731552 - f1-score (micro avg)  0.7502
2023-10-13 12:18:11,149 saving best model
2023-10-13 12:18:11,612 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:14,188 epoch 4 - iter 44/447 - loss 0.05525879 - time (sec): 2.57 - samples/sec: 2939.76 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:18:17,182 epoch 4 - iter 88/447 - loss 0.04633208 - time (sec): 5.56 - samples/sec: 3023.46 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:18:19,914 epoch 4 - iter 132/447 - loss 0.05499492 - time (sec): 8.30 - samples/sec: 3023.56 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:18:22,639 epoch 4 - iter 176/447 - loss 0.05387788 - time (sec): 11.02 - samples/sec: 3051.89 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:18:25,154 epoch 4 - iter 220/447 - loss 0.05237504 - time (sec): 13.53 - samples/sec: 3033.31 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:18:28,279 epoch 4 - iter 264/447 - loss 0.04945493 - time (sec): 16.66 - samples/sec: 3062.16 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:18:31,380 epoch 4 - iter 308/447 - loss 0.04857546 - time (sec): 19.76 - samples/sec: 3025.12 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:18:34,060 epoch 4 - iter 352/447 - loss 0.04897486 - time (sec): 22.44 - samples/sec: 3022.95 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:18:37,004 epoch 4 - iter 396/447 - loss 0.04934248 - time (sec): 25.39 - samples/sec: 3037.22 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:18:39,766 epoch 4 - iter 440/447 - loss 0.04973534 - time (sec): 28.15 - samples/sec: 3032.90 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:18:40,175 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:40,175 EPOCH 4 done: loss 0.0495 - lr: 0.000033
2023-10-13 12:18:48,839 DEV : loss 0.1706864982843399 - f1-score (micro avg)  0.7558
2023-10-13 12:18:48,869 saving best model
2023-10-13 12:18:49,312 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:52,083 epoch 5 - iter 44/447 - loss 0.05200091 - time (sec): 2.76 - samples/sec: 2928.37 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:18:54,786 epoch 5 - iter 88/447 - loss 0.03619275 - time (sec): 5.46 - samples/sec: 2908.21 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:18:57,731 epoch 5 - iter 132/447 - loss 0.03375555 - time (sec): 8.41 - samples/sec: 2915.48 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:19:00,469 epoch 5 - iter 176/447 - loss 0.03452040 - time (sec): 11.15 - samples/sec: 2929.20 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:19:03,583 epoch 5 - iter 220/447 - loss 0.03448902 - time (sec): 14.26 - samples/sec: 2980.50 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:19:06,193 epoch 5 - iter 264/447 - loss 0.03387829 - time (sec): 16.87 - samples/sec: 3013.19 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:19:09,170 epoch 5 - iter 308/447 - loss 0.03356111 - time (sec): 19.85 - samples/sec: 3011.24 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:19:12,309 epoch 5 - iter 352/447 - loss 0.03271895 - time (sec): 22.99 - samples/sec: 3003.67 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:19:15,047 epoch 5 - iter 396/447 - loss 0.03239190 - time (sec): 25.72 - samples/sec: 3021.95 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:19:17,519 epoch 5 - iter 440/447 - loss 0.03136978 - time (sec): 28.20 - samples/sec: 3022.80 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:19:17,931 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:17,931 EPOCH 5 done: loss 0.0310 - lr: 0.000028
2023-10-13 12:19:26,603 DEV : loss 0.20082274079322815 - f1-score (micro avg)  0.7696
2023-10-13 12:19:26,634 saving best model
2023-10-13 12:19:27,018 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:29,851 epoch 6 - iter 44/447 - loss 0.01622643 - time (sec): 2.83 - samples/sec: 3032.18 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:19:32,640 epoch 6 - iter 88/447 - loss 0.01695102 - time (sec): 5.62 - samples/sec: 2985.73 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:19:35,449 epoch 6 - iter 132/447 - loss 0.01667450 - time (sec): 8.43 - samples/sec: 2944.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:19:38,260 epoch 6 - iter 176/447 - loss 0.01827741 - time (sec): 11.24 - samples/sec: 2951.34 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:19:41,153 epoch 6 - iter 220/447 - loss 0.01959407 - time (sec): 14.13 - samples/sec: 2944.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:19:43,808 epoch 6 - iter 264/447 - loss 0.02009856 - time (sec): 16.79 - samples/sec: 2975.89 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:19:46,363 epoch 6 - iter 308/447 - loss 0.02044350 - time (sec): 19.34 - samples/sec: 2989.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:19:49,232 epoch 6 - iter 352/447 - loss 0.02069867 - time (sec): 22.21 - samples/sec: 2991.02 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:19:52,193 epoch 6 - iter 396/447 - loss 0.02098711 - time (sec): 25.17 - samples/sec: 2980.62 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:19:55,409 epoch 6 - iter 440/447 - loss 0.02136923 - time (sec): 28.39 - samples/sec: 2994.56 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:19:55,894 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:55,894 EPOCH 6 done: loss 0.0211 - lr: 0.000022
2023-10-13 12:20:04,495 DEV : loss 0.23715780675411224 - f1-score (micro avg)  0.7733
2023-10-13 12:20:04,525 saving best model
2023-10-13 12:20:04,985 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:07,786 epoch 7 - iter 44/447 - loss 0.01199565 - time (sec): 2.79 - samples/sec: 3082.52 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:20:10,593 epoch 7 - iter 88/447 - loss 0.01126610 - time (sec): 5.60 - samples/sec: 3054.32 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:20:13,242 epoch 7 - iter 132/447 - loss 0.01064061 - time (sec): 8.25 - samples/sec: 3167.04 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:20:16,107 epoch 7 - iter 176/447 - loss 0.01633639 - time (sec): 11.11 - samples/sec: 3129.97 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:20:18,845 epoch 7 - iter 220/447 - loss 0.01466051 - time (sec): 13.85 - samples/sec: 3084.13 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:20:21,664 epoch 7 - iter 264/447 - loss 0.01635494 - time (sec): 16.67 - samples/sec: 3083.72 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:20:24,361 epoch 7 - iter 308/447 - loss 0.01815937 - time (sec): 19.37 - samples/sec: 3069.17 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:20:27,142 epoch 7 - iter 352/447 - loss 0.01751451 - time (sec): 22.15 - samples/sec: 3064.68 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:20:29,760 epoch 7 - iter 396/447 - loss 0.01791652 - time (sec): 24.77 - samples/sec: 3047.44 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:20:32,796 epoch 7 - iter 440/447 - loss 0.01696968 - time (sec): 27.80 - samples/sec: 3043.76 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:20:33,478 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:33,479 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-13 12:20:42,328 DEV : loss 0.24267171323299408 - f1-score (micro avg)  0.7719
2023-10-13 12:20:42,360 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:45,065 epoch 8 - iter 44/447 - loss 0.00574822 - time (sec): 2.70 - samples/sec: 3087.09 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:20:48,466 epoch 8 - iter 88/447 - loss 0.00743845 - time (sec): 6.11 - samples/sec: 2924.33 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:20:51,212 epoch 8 - iter 132/447 - loss 0.00792900 - time (sec): 8.85 - samples/sec: 2948.74 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:20:53,937 epoch 8 - iter 176/447 - loss 0.00785956 - time (sec): 11.58 - samples/sec: 2982.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:20:56,601 epoch 8 - iter 220/447 - loss 0.00788583 - time (sec): 14.24 - samples/sec: 2978.18 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:20:59,624 epoch 8 - iter 264/447 - loss 0.00853818 - time (sec): 17.26 - samples/sec: 2966.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:21:02,453 epoch 8 - iter 308/447 - loss 0.00916299 - time (sec): 20.09 - samples/sec: 2990.31 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:21:05,235 epoch 8 - iter 352/447 - loss 0.00932899 - time (sec): 22.87 - samples/sec: 2981.93 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:21:08,099 epoch 8 - iter 396/447 - loss 0.01034094 - time (sec): 25.74 - samples/sec: 2982.67 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:21:11,010 epoch 8 - iter 440/447 - loss 0.01011961 - time (sec): 28.65 - samples/sec: 2975.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:21:11,441 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:11,441 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-13 12:21:19,717 DEV : loss 0.2543591260910034 - f1-score (micro avg)  0.7859
2023-10-13 12:21:19,749 saving best model
2023-10-13 12:21:20,261 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:23,549 epoch 9 - iter 44/447 - loss 0.00528114 - time (sec): 3.28 - samples/sec: 2607.80 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:21:26,230 epoch 9 - iter 88/447 - loss 0.00993798 - time (sec): 5.96 - samples/sec: 2822.40 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:21:29,082 epoch 9 - iter 132/447 - loss 0.00867724 - time (sec): 8.82 - samples/sec: 2840.89 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:21:31,841 epoch 9 - iter 176/447 - loss 0.00730150 - time (sec): 11.57 - samples/sec: 2909.04 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:21:35,160 epoch 9 - iter 220/447 - loss 0.00633256 - time (sec): 14.89 - samples/sec: 2909.00 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:21:37,857 epoch 9 - iter 264/447 - loss 0.00584102 - time (sec): 17.59 - samples/sec: 2941.43 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:21:40,619 epoch 9 - iter 308/447 - loss 0.00698111 - time (sec): 20.35 - samples/sec: 2937.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:21:43,603 epoch 9 - iter 352/447 - loss 0.00775451 - time (sec): 23.34 - samples/sec: 2946.59 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:21:46,303 epoch 9 - iter 396/447 - loss 0.00750628 - time (sec): 26.04 - samples/sec: 2943.23 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:21:49,116 epoch 9 - iter 440/447 - loss 0.00746570 - time (sec): 28.85 - samples/sec: 2948.98 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:21:49,770 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:49,771 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 12:21:58,239 DEV : loss 0.2509065568447113 - f1-score (micro avg)  0.7812
2023-10-13 12:21:58,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:01,228 epoch 10 - iter 44/447 - loss 0.00265283 - time (sec): 2.96 - samples/sec: 3095.76 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:22:03,969 epoch 10 - iter 88/447 - loss 0.00408723 - time (sec): 5.70 - samples/sec: 3012.81 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:22:06,819 epoch 10 - iter 132/447 - loss 0.00409162 - time (sec): 8.55 - samples/sec: 2966.94 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:22:10,474 epoch 10 - iter 176/447 - loss 0.00341264 - time (sec): 12.20 - samples/sec: 2890.46 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:22:13,149 epoch 10 - iter 220/447 - loss 0.00461848 - time (sec): 14.88 - samples/sec: 2925.23 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:22:15,741 epoch 10 - iter 264/447 - loss 0.00418989 - time (sec): 17.47 - samples/sec: 2963.91 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:22:18,378 epoch 10 - iter 308/447 - loss 0.00461877 - time (sec): 20.11 - samples/sec: 2961.29 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:22:21,374 epoch 10 - iter 352/447 - loss 0.00511943 - time (sec): 23.10 - samples/sec: 2952.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:22:24,099 epoch 10 - iter 396/447 - loss 0.00522285 - time (sec): 25.83 - samples/sec: 2955.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:22:27,111 epoch 10 - iter 440/447 - loss 0.00506098 - time (sec): 28.84 - samples/sec: 2961.27 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:22:27,543 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:27,543 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-13 12:22:35,641 DEV : loss 0.25304141640663147 - f1-score (micro avg)  0.7829
2023-10-13 12:22:36,008 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:36,010 Loading model from best epoch ...
2023-10-13 12:22:37,653 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
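The 21-tag dictionary above is the BIOES span-encoding scheme over the corpus's five entity types plus the outside tag, which also explains the tagger's final `Linear(in_features=768, out_features=21)` layer. A minimal sketch reconstructing that tag set (the `entity_types` list is read off the log; the comprehension itself is illustrative, not Flair code):

```python
# BIOES scheme: S-ingle, B-egin, E-nd, I-nside for each entity type, plus O.
entity_types = ["loc", "pers", "org", "prod", "time"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
# 1 + 5 * 4 = 21 tags, in the same order the log prints them
```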
2023-10-13 12:22:42,814
Results:
- F-score (micro) 0.7517
- F-score (macro) 0.6707
- Accuracy 0.6231

By class:
              precision    recall  f1-score   support

         loc     0.8654    0.8305    0.8476       596
        pers     0.6525    0.7838    0.7121       333
         org     0.5686    0.4394    0.4957       132
        prod     0.6667    0.4848    0.5614        66
        time     0.7609    0.7143    0.7368        49

   micro avg     0.7543    0.7491    0.7517      1176
   macro avg     0.7028    0.6506    0.6707      1176
weighted avg     0.7563    0.7491    0.7491      1176
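The three F-score averages in the table follow directly from the per-class rows: micro-F1 is the harmonic mean of the micro-averaged precision and recall, macro-F1 is the unweighted mean of the per-class F1 scores, and weighted-F1 weights each class's F1 by its support. A sketch recomputing them from the values above (the `rows` dict is transcribed from the table; this is a sanity check, not part of the evaluation code):

```python
# class: (precision, recall, f1, support), copied from the "By class" table
rows = {
    "loc":  (0.8654, 0.8305, 0.8476, 596),
    "pers": (0.6525, 0.7838, 0.7121, 333),
    "org":  (0.5686, 0.4394, 0.4957, 132),
    "prod": (0.6667, 0.4848, 0.5614, 66),
    "time": (0.7609, 0.7143, 0.7368, 49),
}

p, r = 0.7543, 0.7491  # micro avg precision and recall from the table
micro_f1 = 2 * p * r / (p + r)                                   # harmonic mean
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)  # unweighted mean
total = sum(s for *_, s in rows.values())                        # 1176 test entities
weighted_f1 = sum(f1 * s for _, _, f1, s in rows.values()) / total
```

Rounded to four decimals these reproduce the reported 0.7517, 0.6707, and 0.7491.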

2023-10-13 12:22:42,814 ----------------------------------------------------------------------------------------------------