2023-10-13 11:03:31,358 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,359 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 11:03:31,359 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,359 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 11:03:31,359 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 Train: 966 sentences
2023-10-13 11:03:31,360 (train_with_dev=False, train_with_test=False)
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 Training Params:
2023-10-13 11:03:31,360 - learning_rate: "5e-05"
2023-10-13 11:03:31,360 - mini_batch_size: "8"
2023-10-13 11:03:31,360 - max_epochs: "10"
2023-10-13 11:03:31,360 - shuffle: "True"
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 Plugins:
2023-10-13 11:03:31,360 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 11:03:31,360 - metric: "('micro avg', 'f1-score')"
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 Computation:
2023-10-13 11:03:31,360 - compute on device: cuda:0
2023-10-13 11:03:31,360 - embedding storage: none
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:31,360 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:32,056 epoch 1 - iter 12/121 - loss 3.42734933 - time (sec): 0.69 - samples/sec: 3587.31 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:03:32,812 epoch 1 - iter 24/121 - loss 3.07041464 - time (sec): 1.45 - samples/sec: 3431.22 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:03:33,518 epoch 1 - iter 36/121 - loss 2.47011538 - time (sec): 2.16 - samples/sec: 3455.08 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:03:34,217 epoch 1 - iter 48/121 - loss 2.06319267 - time (sec): 2.86 - samples/sec: 3418.37 - lr: 0.000019 - momentum: 0.000000
2023-10-13 11:03:34,952 epoch 1 - iter 60/121 - loss 1.76547093 - time (sec): 3.59 - samples/sec: 3362.80 - lr: 0.000024 - momentum: 0.000000
2023-10-13 11:03:35,737 epoch 1 - iter 72/121 - loss 1.54750921 - time (sec): 4.38 - samples/sec: 3326.86 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:03:36,500 epoch 1 - iter 84/121 - loss 1.37165921 - time (sec): 5.14 - samples/sec: 3357.87 - lr: 0.000034 - momentum: 0.000000
2023-10-13 11:03:37,245 epoch 1 - iter 96/121 - loss 1.24135257 - time (sec): 5.88 - samples/sec: 3362.73 - lr: 0.000039 - momentum: 0.000000
2023-10-13 11:03:37,918 epoch 1 - iter 108/121 - loss 1.15176729 - time (sec): 6.56 - samples/sec: 3371.83 - lr: 0.000044 - momentum: 0.000000
2023-10-13 11:03:38,597 epoch 1 - iter 120/121 - loss 1.05978174 - time (sec): 7.24 - samples/sec: 3400.61 - lr: 0.000049 - momentum: 0.000000
2023-10-13 11:03:38,647 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:38,647 EPOCH 1 done: loss 1.0548 - lr: 0.000049
2023-10-13 11:03:39,219 DEV : loss 0.22252507507801056 - f1-score (micro avg) 0.5569
2023-10-13 11:03:39,224 saving best model
2023-10-13 11:03:39,585 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:40,333 epoch 2 - iter 12/121 - loss 0.24218422 - time (sec): 0.75 - samples/sec: 3422.32 - lr: 0.000049 - momentum: 0.000000
2023-10-13 11:03:41,087 epoch 2 - iter 24/121 - loss 0.22819878 - time (sec): 1.50 - samples/sec: 3366.98 - lr: 0.000049 - momentum: 0.000000
2023-10-13 11:03:41,824 epoch 2 - iter 36/121 - loss 0.20546163 - time (sec): 2.24 - samples/sec: 3317.40 - lr: 0.000048 - momentum: 0.000000
2023-10-13 11:03:42,526 epoch 2 - iter 48/121 - loss 0.18959377 - time (sec): 2.94 - samples/sec: 3349.10 - lr: 0.000048 - momentum: 0.000000
2023-10-13 11:03:43,386 epoch 2 - iter 60/121 - loss 0.19054698 - time (sec): 3.80 - samples/sec: 3304.32 - lr: 0.000047 - momentum: 0.000000
2023-10-13 11:03:44,173 epoch 2 - iter 72/121 - loss 0.18267372 - time (sec): 4.59 - samples/sec: 3339.27 - lr: 0.000047 - momentum: 0.000000
2023-10-13 11:03:44,919 epoch 2 - iter 84/121 - loss 0.18202844 - time (sec): 5.33 - samples/sec: 3309.04 - lr: 0.000046 - momentum: 0.000000
2023-10-13 11:03:45,598 epoch 2 - iter 96/121 - loss 0.18474986 - time (sec): 6.01 - samples/sec: 3297.64 - lr: 0.000046 - momentum: 0.000000
2023-10-13 11:03:46,331 epoch 2 - iter 108/121 - loss 0.18317962 - time (sec): 6.74 - samples/sec: 3270.55 - lr: 0.000045 - momentum: 0.000000
2023-10-13 11:03:47,275 epoch 2 - iter 120/121 - loss 0.17672600 - time (sec): 7.69 - samples/sec: 3197.20 - lr: 0.000045 - momentum: 0.000000
2023-10-13 11:03:47,329 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:47,329 EPOCH 2 done: loss 0.1769 - lr: 0.000045
2023-10-13 11:03:48,210 DEV : loss 0.13349609076976776 - f1-score (micro avg) 0.7858
2023-10-13 11:03:48,216 saving best model
2023-10-13 11:03:48,685 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:49,396 epoch 3 - iter 12/121 - loss 0.11849465 - time (sec): 0.71 - samples/sec: 3337.12 - lr: 0.000044 - momentum: 0.000000
2023-10-13 11:03:50,215 epoch 3 - iter 24/121 - loss 0.11252162 - time (sec): 1.53 - samples/sec: 3257.51 - lr: 0.000043 - momentum: 0.000000
2023-10-13 11:03:51,009 epoch 3 - iter 36/121 - loss 0.10390098 - time (sec): 2.32 - samples/sec: 3189.77 - lr: 0.000043 - momentum: 0.000000
2023-10-13 11:03:51,765 epoch 3 - iter 48/121 - loss 0.09990655 - time (sec): 3.07 - samples/sec: 3189.93 - lr: 0.000042 - momentum: 0.000000
2023-10-13 11:03:52,475 epoch 3 - iter 60/121 - loss 0.10109223 - time (sec): 3.78 - samples/sec: 3215.85 - lr: 0.000042 - momentum: 0.000000
2023-10-13 11:03:53,299 epoch 3 - iter 72/121 - loss 0.10524557 - time (sec): 4.61 - samples/sec: 3243.86 - lr: 0.000041 - momentum: 0.000000
2023-10-13 11:03:54,003 epoch 3 - iter 84/121 - loss 0.10389228 - time (sec): 5.31 - samples/sec: 3239.92 - lr: 0.000041 - momentum: 0.000000
2023-10-13 11:03:54,717 epoch 3 - iter 96/121 - loss 0.10255653 - time (sec): 6.03 - samples/sec: 3262.97 - lr: 0.000040 - momentum: 0.000000
2023-10-13 11:03:55,523 epoch 3 - iter 108/121 - loss 0.10015951 - time (sec): 6.83 - samples/sec: 3227.37 - lr: 0.000040 - momentum: 0.000000
2023-10-13 11:03:56,288 epoch 3 - iter 120/121 - loss 0.10092736 - time (sec): 7.60 - samples/sec: 3245.01 - lr: 0.000039 - momentum: 0.000000
2023-10-13 11:03:56,339 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:56,339 EPOCH 3 done: loss 0.1006 - lr: 0.000039
2023-10-13 11:03:57,148 DEV : loss 0.12226267904043198 - f1-score (micro avg) 0.821
2023-10-13 11:03:57,153 saving best model
2023-10-13 11:03:57,631 ----------------------------------------------------------------------------------------------------
2023-10-13 11:03:58,399 epoch 4 - iter 12/121 - loss 0.05281124 - time (sec): 0.76 - samples/sec: 3272.47 - lr: 0.000038 - momentum: 0.000000
2023-10-13 11:03:59,157 epoch 4 - iter 24/121 - loss 0.06333545 - time (sec): 1.52 - samples/sec: 3320.70 - lr: 0.000038 - momentum: 0.000000
2023-10-13 11:03:59,964 epoch 4 - iter 36/121 - loss 0.06737445 - time (sec): 2.33 - samples/sec: 3370.91 - lr: 0.000037 - momentum: 0.000000
2023-10-13 11:04:00,667 epoch 4 - iter 48/121 - loss 0.07021390 - time (sec): 3.03 - samples/sec: 3365.12 - lr: 0.000037 - momentum: 0.000000
2023-10-13 11:04:01,481 epoch 4 - iter 60/121 - loss 0.06505651 - time (sec): 3.85 - samples/sec: 3327.62 - lr: 0.000036 - momentum: 0.000000
2023-10-13 11:04:02,250 epoch 4 - iter 72/121 - loss 0.06687673 - time (sec): 4.62 - samples/sec: 3320.07 - lr: 0.000036 - momentum: 0.000000
2023-10-13 11:04:02,979 epoch 4 - iter 84/121 - loss 0.06729667 - time (sec): 5.34 - samples/sec: 3364.01 - lr: 0.000035 - momentum: 0.000000
2023-10-13 11:04:03,666 epoch 4 - iter 96/121 - loss 0.06834140 - time (sec): 6.03 - samples/sec: 3304.29 - lr: 0.000035 - momentum: 0.000000
2023-10-13 11:04:04,372 epoch 4 - iter 108/121 - loss 0.06626904 - time (sec): 6.74 - samples/sec: 3315.98 - lr: 0.000034 - momentum: 0.000000
2023-10-13 11:04:05,067 epoch 4 - iter 120/121 - loss 0.06533164 - time (sec): 7.43 - samples/sec: 3317.46 - lr: 0.000034 - momentum: 0.000000
2023-10-13 11:04:05,111 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:05,112 EPOCH 4 done: loss 0.0651 - lr: 0.000034
2023-10-13 11:04:05,914 DEV : loss 0.13821756839752197 - f1-score (micro avg) 0.8397
2023-10-13 11:04:05,920 saving best model
2023-10-13 11:04:06,366 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:07,102 epoch 5 - iter 12/121 - loss 0.03214331 - time (sec): 0.73 - samples/sec: 2890.54 - lr: 0.000033 - momentum: 0.000000
2023-10-13 11:04:07,872 epoch 5 - iter 24/121 - loss 0.03784570 - time (sec): 1.50 - samples/sec: 3193.38 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:04:08,554 epoch 5 - iter 36/121 - loss 0.04161904 - time (sec): 2.18 - samples/sec: 3284.77 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:04:09,285 epoch 5 - iter 48/121 - loss 0.04556921 - time (sec): 2.91 - samples/sec: 3249.10 - lr: 0.000031 - momentum: 0.000000
2023-10-13 11:04:10,132 epoch 5 - iter 60/121 - loss 0.04473601 - time (sec): 3.76 - samples/sec: 3206.49 - lr: 0.000031 - momentum: 0.000000
2023-10-13 11:04:10,896 epoch 5 - iter 72/121 - loss 0.04751925 - time (sec): 4.52 - samples/sec: 3219.13 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:04:11,645 epoch 5 - iter 84/121 - loss 0.04798357 - time (sec): 5.27 - samples/sec: 3275.80 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:04:12,365 epoch 5 - iter 96/121 - loss 0.04750000 - time (sec): 5.99 - samples/sec: 3271.67 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:04:13,182 epoch 5 - iter 108/121 - loss 0.04692119 - time (sec): 6.81 - samples/sec: 3253.72 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:04:13,902 epoch 5 - iter 120/121 - loss 0.04549183 - time (sec): 7.53 - samples/sec: 3273.16 - lr: 0.000028 - momentum: 0.000000
2023-10-13 11:04:13,954 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:13,954 EPOCH 5 done: loss 0.0453 - lr: 0.000028
2023-10-13 11:04:14,789 DEV : loss 0.13376127183437347 - f1-score (micro avg) 0.8265
2023-10-13 11:04:14,794 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:15,555 epoch 6 - iter 12/121 - loss 0.03050622 - time (sec): 0.76 - samples/sec: 3354.21 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:04:16,235 epoch 6 - iter 24/121 - loss 0.02727867 - time (sec): 1.44 - samples/sec: 3350.75 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:04:16,961 epoch 6 - iter 36/121 - loss 0.02935009 - time (sec): 2.17 - samples/sec: 3393.19 - lr: 0.000026 - momentum: 0.000000
2023-10-13 11:04:17,698 epoch 6 - iter 48/121 - loss 0.03003866 - time (sec): 2.90 - samples/sec: 3388.30 - lr: 0.000026 - momentum: 0.000000
2023-10-13 11:04:18,462 epoch 6 - iter 60/121 - loss 0.02965052 - time (sec): 3.67 - samples/sec: 3421.81 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:04:19,143 epoch 6 - iter 72/121 - loss 0.03278386 - time (sec): 4.35 - samples/sec: 3415.00 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:04:19,873 epoch 6 - iter 84/121 - loss 0.03141095 - time (sec): 5.08 - samples/sec: 3383.38 - lr: 0.000024 - momentum: 0.000000
2023-10-13 11:04:20,618 epoch 6 - iter 96/121 - loss 0.03118215 - time (sec): 5.82 - samples/sec: 3393.93 - lr: 0.000024 - momentum: 0.000000
2023-10-13 11:04:21,351 epoch 6 - iter 108/121 - loss 0.03201618 - time (sec): 6.56 - samples/sec: 3376.97 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:04:22,088 epoch 6 - iter 120/121 - loss 0.03041047 - time (sec): 7.29 - samples/sec: 3353.16 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:04:22,151 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:22,151 EPOCH 6 done: loss 0.0301 - lr: 0.000022
2023-10-13 11:04:23,033 DEV : loss 0.1689298152923584 - f1-score (micro avg) 0.8315
2023-10-13 11:04:23,040 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:23,832 epoch 7 - iter 12/121 - loss 0.01723266 - time (sec): 0.79 - samples/sec: 3140.62 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:04:24,609 epoch 7 - iter 24/121 - loss 0.01518007 - time (sec): 1.57 - samples/sec: 2934.49 - lr: 0.000021 - momentum: 0.000000
2023-10-13 11:04:25,388 epoch 7 - iter 36/121 - loss 0.01632780 - time (sec): 2.35 - samples/sec: 3031.81 - lr: 0.000021 - momentum: 0.000000
2023-10-13 11:04:26,225 epoch 7 - iter 48/121 - loss 0.02216783 - time (sec): 3.18 - samples/sec: 3085.73 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:04:26,933 epoch 7 - iter 60/121 - loss 0.02158577 - time (sec): 3.89 - samples/sec: 3084.16 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:04:27,664 epoch 7 - iter 72/121 - loss 0.02379460 - time (sec): 4.62 - samples/sec: 3128.22 - lr: 0.000019 - momentum: 0.000000
2023-10-13 11:04:28,432 epoch 7 - iter 84/121 - loss 0.02361634 - time (sec): 5.39 - samples/sec: 3144.64 - lr: 0.000019 - momentum: 0.000000
2023-10-13 11:04:29,229 epoch 7 - iter 96/121 - loss 0.02119077 - time (sec): 6.19 - samples/sec: 3169.30 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:04:29,932 epoch 7 - iter 108/121 - loss 0.02010884 - time (sec): 6.89 - samples/sec: 3213.63 - lr: 0.000017 - momentum: 0.000000
2023-10-13 11:04:30,693 epoch 7 - iter 120/121 - loss 0.02014450 - time (sec): 7.65 - samples/sec: 3213.92 - lr: 0.000017 - momentum: 0.000000
2023-10-13 11:04:30,742 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:30,742 EPOCH 7 done: loss 0.0200 - lr: 0.000017
2023-10-13 11:04:31,540 DEV : loss 0.1896224468946457 - f1-score (micro avg) 0.8409
2023-10-13 11:04:31,546 saving best model
2023-10-13 11:04:31,985 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:32,769 epoch 8 - iter 12/121 - loss 0.01467235 - time (sec): 0.78 - samples/sec: 3333.60 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:04:33,458 epoch 8 - iter 24/121 - loss 0.01799628 - time (sec): 1.47 - samples/sec: 3276.21 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:04:34,166 epoch 8 - iter 36/121 - loss 0.01708944 - time (sec): 2.18 - samples/sec: 3421.10 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:04:34,882 epoch 8 - iter 48/121 - loss 0.01659600 - time (sec): 2.90 - samples/sec: 3480.48 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:04:35,570 epoch 8 - iter 60/121 - loss 0.01611784 - time (sec): 3.58 - samples/sec: 3447.96 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:04:36,376 epoch 8 - iter 72/121 - loss 0.01528811 - time (sec): 4.39 - samples/sec: 3436.72 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:04:37,128 epoch 8 - iter 84/121 - loss 0.01525807 - time (sec): 5.14 - samples/sec: 3373.19 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:04:37,894 epoch 8 - iter 96/121 - loss 0.01583355 - time (sec): 5.91 - samples/sec: 3354.98 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:04:38,658 epoch 8 - iter 108/121 - loss 0.01671156 - time (sec): 6.67 - samples/sec: 3337.72 - lr: 0.000012 - momentum: 0.000000
2023-10-13 11:04:39,411 epoch 8 - iter 120/121 - loss 0.01671643 - time (sec): 7.42 - samples/sec: 3318.13 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:04:39,464 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:39,464 EPOCH 8 done: loss 0.0166 - lr: 0.000011
2023-10-13 11:04:40,332 DEV : loss 0.18696588277816772 - f1-score (micro avg) 0.8319
2023-10-13 11:04:40,337 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:41,044 epoch 9 - iter 12/121 - loss 0.02103634 - time (sec): 0.71 - samples/sec: 3188.87 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:04:41,780 epoch 9 - iter 24/121 - loss 0.01613718 - time (sec): 1.44 - samples/sec: 3308.20 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:04:42,538 epoch 9 - iter 36/121 - loss 0.01795665 - time (sec): 2.20 - samples/sec: 3299.83 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:04:43,345 epoch 9 - iter 48/121 - loss 0.01455534 - time (sec): 3.01 - samples/sec: 3302.51 - lr: 0.000009 - momentum: 0.000000
2023-10-13 11:04:44,056 epoch 9 - iter 60/121 - loss 0.01371094 - time (sec): 3.72 - samples/sec: 3308.17 - lr: 0.000009 - momentum: 0.000000
2023-10-13 11:04:44,798 epoch 9 - iter 72/121 - loss 0.01340431 - time (sec): 4.46 - samples/sec: 3294.50 - lr: 0.000008 - momentum: 0.000000
2023-10-13 11:04:45,533 epoch 9 - iter 84/121 - loss 0.01186871 - time (sec): 5.19 - samples/sec: 3326.75 - lr: 0.000008 - momentum: 0.000000
2023-10-13 11:04:46,323 epoch 9 - iter 96/121 - loss 0.01170438 - time (sec): 5.98 - samples/sec: 3327.64 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:04:47,058 epoch 9 - iter 108/121 - loss 0.01104981 - time (sec): 6.72 - samples/sec: 3297.10 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:04:47,812 epoch 9 - iter 120/121 - loss 0.01116478 - time (sec): 7.47 - samples/sec: 3286.29 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:04:47,859 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:47,860 EPOCH 9 done: loss 0.0111 - lr: 0.000006
2023-10-13 11:04:48,752 DEV : loss 0.20006036758422852 - f1-score (micro avg) 0.8228
2023-10-13 11:04:48,757 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:49,506 epoch 10 - iter 12/121 - loss 0.00156941 - time (sec): 0.75 - samples/sec: 3029.36 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:04:50,233 epoch 10 - iter 24/121 - loss 0.00411536 - time (sec): 1.47 - samples/sec: 3239.17 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:04:50,940 epoch 10 - iter 36/121 - loss 0.00549516 - time (sec): 2.18 - samples/sec: 3370.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 11:04:51,724 epoch 10 - iter 48/121 - loss 0.01535190 - time (sec): 2.97 - samples/sec: 3319.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 11:04:52,428 epoch 10 - iter 60/121 - loss 0.01412996 - time (sec): 3.67 - samples/sec: 3289.15 - lr: 0.000003 - momentum: 0.000000
2023-10-13 11:04:53,103 epoch 10 - iter 72/121 - loss 0.01279995 - time (sec): 4.35 - samples/sec: 3237.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 11:04:53,899 epoch 10 - iter 84/121 - loss 0.01105238 - time (sec): 5.14 - samples/sec: 3248.43 - lr: 0.000002 - momentum: 0.000000
2023-10-13 11:04:54,617 epoch 10 - iter 96/121 - loss 0.00990444 - time (sec): 5.86 - samples/sec: 3266.37 - lr: 0.000001 - momentum: 0.000000
2023-10-13 11:04:55,356 epoch 10 - iter 108/121 - loss 0.00917855 - time (sec): 6.60 - samples/sec: 3298.33 - lr: 0.000001 - momentum: 0.000000
2023-10-13 11:04:56,185 epoch 10 - iter 120/121 - loss 0.00866098 - time (sec): 7.43 - samples/sec: 3305.97 - lr: 0.000000 - momentum: 0.000000
2023-10-13 11:04:56,235 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:56,235 EPOCH 10 done: loss 0.0086 - lr: 0.000000
2023-10-13 11:04:57,056 DEV : loss 0.20496968924999237 - f1-score (micro avg) 0.8247
2023-10-13 11:04:57,439 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:57,441 Loading model from best epoch ...
2023-10-13 11:04:58,867 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 11:04:59,761 Results:
- F-score (micro) 0.8193
- F-score (macro) 0.5823
- Accuracy 0.7116

By class:
              precision    recall  f1-score   support

        pers     0.8212    0.8921    0.8552       139
       scope     0.8286    0.8992    0.8625       129
        work     0.6848    0.7875    0.7326        80
         loc     0.7500    0.3333    0.4615         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7907    0.8500    0.8193       360
   macro avg     0.6169    0.5824    0.5823       360
weighted avg     0.7849    0.8500    0.8136       360

2023-10-13 11:04:59,761 ----------------------------------------------------------------------------------------------------
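
Note on the summary rows: the averaged figures in the "By class" table can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the Flair output. It assumes the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, F1 = harmonic mean of precision and recall); the names `ROWS`, `macro_average`, `weighted_average`, and `f1_from` are hypothetical helpers introduced here for illustration. Because the inputs are the 4-decimal values printed in the log, the recomputed numbers should only be expected to agree to about three decimals.

```python
# Per-class rows copied from the "By class" table in the log:
# label -> (precision, recall, f1, support)
ROWS = {
    "pers": (0.8212, 0.8921, 0.8552, 139),
    "scope": (0.8286, 0.8992, 0.8625, 129),
    "work": (0.6848, 0.7875, 0.7326, 80),
    "loc": (0.7500, 0.3333, 0.4615, 9),
    "date": (0.0000, 0.0000, 0.0000, 3),
}

SUPPORT = 3  # index of the support column in each row


def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in rows.values()) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[SUPPORT] for row in rows.values())
    return sum(row[column] * row[SUPPORT] for row in rows.values()) / total


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[SUPPORT] for row in ROWS.values())
macro_f1 = macro_average(ROWS, 2)
weighted_f1 = weighted_average(ROWS, 2)
# Micro F1 recomputed from the micro-averaged precision and recall in the log.
micro_f1 = f1_from(0.7907, 0.8500)

print(f"support={total_support} macro_f1={macro_f1:.4f} "
      f"weighted_f1={weighted_f1:.4f} micro_f1={micro_f1:.4f}")
```

The gap between micro F1 (0.8193) and macro F1 (0.5823) comes from the two low-support classes: `loc` (9 spans) and `date` (3 spans) count as much as `pers` and `scope` in the macro mean but contribute little to the micro figure.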