2023-10-13 08:45:51,247 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:45:51,248 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:45:51,248 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 Train: 1100 sentences
2023-10-13 08:45:51,248 (train_with_dev=False, train_with_test=False)
2023-10-13 08:45:51,248 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 Training Params:
2023-10-13 08:45:51,248 - learning_rate: "3e-05"
2023-10-13 08:45:51,248 - mini_batch_size: "8"
2023-10-13 08:45:51,248 - max_epochs: "10"
2023-10-13 08:45:51,248 - shuffle: "True"
2023-10-13 08:45:51,248 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 Plugins:
2023-10-13 08:45:51,248 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:45:51,248 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:45:51,248 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:45:51,248 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,248 Computation:
2023-10-13 08:45:51,249 - compute on device: cuda:0
2023-10-13 08:45:51,249 - embedding storage: none
2023-10-13 08:45:51,249 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,249 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 08:45:51,249 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:51,249 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:52,026 epoch 1 - iter 13/138 - loss 3.42478686 - time (sec): 0.78 - samples/sec: 2613.29 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:45:52,831 epoch 1 - iter 26/138 - loss 3.25697591 - time (sec): 1.58 - samples/sec: 2702.85 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:45:53,633 epoch 1 - iter 39/138 - loss 2.92896232 - time (sec): 2.38 - samples/sec: 2678.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:45:54,404 epoch 1 - iter 52/138 - loss 2.46822794 - time (sec): 3.15 - samples/sec: 2690.25 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:45:55,222 epoch 1 - iter 65/138 - loss 2.15028122 - time (sec): 3.97 - samples/sec: 2689.10 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:45:55,992 epoch 1 - iter 78/138 - loss 1.93536841 - time (sec): 4.74 - samples/sec: 2693.64 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:45:56,792 epoch 1 - iter 91/138 - loss 1.74412999 - time (sec): 5.54 - samples/sec: 2732.53 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:45:57,576 epoch 1 - iter 104/138 - loss 1.58222199 - time (sec): 6.33 - samples/sec: 2745.11 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:45:58,334 epoch 1 - iter 117/138 - loss 1.45813361 - time (sec): 7.08 - samples/sec: 2741.53 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:45:59,090 epoch 1 - iter 130/138 - loss 1.35021726 - time (sec): 7.84 - samples/sec: 2749.77 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:45:59,560 ----------------------------------------------------------------------------------------------------
2023-10-13 08:45:59,560 EPOCH 1 done: loss 1.2954 - lr: 0.000028
2023-10-13 08:46:00,307 DEV : loss 0.3012372851371765 - f1-score (micro avg) 0.6066
2023-10-13 08:46:00,313 saving best model
2023-10-13 08:46:00,682 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:01,409 epoch 2 - iter 13/138 - loss 0.32651210 - time (sec): 0.73 - samples/sec: 2747.35 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:46:02,186 epoch 2 - iter 26/138 - loss 0.27645443 - time (sec): 1.50 - samples/sec: 2795.41 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:46:02,955 epoch 2 - iter 39/138 - loss 0.27978688 - time (sec): 2.27 - samples/sec: 2768.62 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:46:03,754 epoch 2 - iter 52/138 - loss 0.26376866 - time (sec): 3.07 - samples/sec: 2769.56 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:46:04,518 epoch 2 - iter 65/138 - loss 0.24497349 - time (sec): 3.84 - samples/sec: 2789.53 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:46:05,305 epoch 2 - iter 78/138 - loss 0.24382508 - time (sec): 4.62 - samples/sec: 2805.76 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:46:06,080 epoch 2 - iter 91/138 - loss 0.23545494 - time (sec): 5.40 - samples/sec: 2803.92 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:46:06,876 epoch 2 - iter 104/138 - loss 0.22685556 - time (sec): 6.19 - samples/sec: 2806.32 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:46:07,655 epoch 2 - iter 117/138 - loss 0.22301589 - time (sec): 6.97 - samples/sec: 2803.41 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:46:08,451 epoch 2 - iter 130/138 - loss 0.21917924 - time (sec): 7.77 - samples/sec: 2775.45 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:46:08,931 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:08,932 EPOCH 2 done: loss 0.2203 - lr: 0.000027
2023-10-13 08:46:09,600 DEV : loss 0.15246804058551788 - f1-score (micro avg) 0.7931
2023-10-13 08:46:09,606 saving best model
2023-10-13 08:46:10,072 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:10,838 epoch 3 - iter 13/138 - loss 0.11403792 - time (sec): 0.76 - samples/sec: 2885.88 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:46:11,686 epoch 3 - iter 26/138 - loss 0.12036127 - time (sec): 1.61 - samples/sec: 2709.68 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:46:12,500 epoch 3 - iter 39/138 - loss 0.12816251 - time (sec): 2.43 - samples/sec: 2650.50 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:46:13,344 epoch 3 - iter 52/138 - loss 0.12673381 - time (sec): 3.27 - samples/sec: 2628.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:46:14,126 epoch 3 - iter 65/138 - loss 0.12713805 - time (sec): 4.05 - samples/sec: 2676.32 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:46:14,877 epoch 3 - iter 78/138 - loss 0.12679176 - time (sec): 4.80 - samples/sec: 2701.29 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:46:15,698 epoch 3 - iter 91/138 - loss 0.12344778 - time (sec): 5.62 - samples/sec: 2721.66 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:46:16,447 epoch 3 - iter 104/138 - loss 0.11980624 - time (sec): 6.37 - samples/sec: 2747.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:46:17,198 epoch 3 - iter 117/138 - loss 0.11466405 - time (sec): 7.12 - samples/sec: 2750.58 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:46:17,981 epoch 3 - iter 130/138 - loss 0.11116083 - time (sec): 7.91 - samples/sec: 2756.19 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:46:18,414 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:18,414 EPOCH 3 done: loss 0.1131 - lr: 0.000024
2023-10-13 08:46:19,095 DEV : loss 0.12917353212833405 - f1-score (micro avg) 0.8363
2023-10-13 08:46:19,100 saving best model
2023-10-13 08:46:19,550 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:20,337 epoch 4 - iter 13/138 - loss 0.05993807 - time (sec): 0.78 - samples/sec: 2718.93 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:46:21,099 epoch 4 - iter 26/138 - loss 0.06306592 - time (sec): 1.55 - samples/sec: 2684.08 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:46:21,878 epoch 4 - iter 39/138 - loss 0.07823448 - time (sec): 2.32 - samples/sec: 2666.72 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:46:22,683 epoch 4 - iter 52/138 - loss 0.08537937 - time (sec): 3.13 - samples/sec: 2606.09 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:46:23,444 epoch 4 - iter 65/138 - loss 0.08099344 - time (sec): 3.89 - samples/sec: 2616.84 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:46:24,243 epoch 4 - iter 78/138 - loss 0.07823728 - time (sec): 4.69 - samples/sec: 2670.18 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:46:24,958 epoch 4 - iter 91/138 - loss 0.07524355 - time (sec): 5.40 - samples/sec: 2655.08 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:46:25,662 epoch 4 - iter 104/138 - loss 0.07383065 - time (sec): 6.11 - samples/sec: 2705.14 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:46:26,454 epoch 4 - iter 117/138 - loss 0.07842970 - time (sec): 6.90 - samples/sec: 2738.62 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:46:27,288 epoch 4 - iter 130/138 - loss 0.07677490 - time (sec): 7.73 - samples/sec: 2775.36 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:46:27,742 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:27,742 EPOCH 4 done: loss 0.0747 - lr: 0.000020
2023-10-13 08:46:28,403 DEV : loss 0.13316737115383148 - f1-score (micro avg) 0.8571
2023-10-13 08:46:28,408 saving best model
2023-10-13 08:46:28,900 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:29,722 epoch 5 - iter 13/138 - loss 0.05668714 - time (sec): 0.81 - samples/sec: 2608.90 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:46:30,558 epoch 5 - iter 26/138 - loss 0.05073595 - time (sec): 1.65 - samples/sec: 2682.00 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:46:31,332 epoch 5 - iter 39/138 - loss 0.04877345 - time (sec): 2.42 - samples/sec: 2701.23 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:46:32,090 epoch 5 - iter 52/138 - loss 0.05670311 - time (sec): 3.18 - samples/sec: 2769.38 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:46:32,875 epoch 5 - iter 65/138 - loss 0.05551536 - time (sec): 3.97 - samples/sec: 2774.21 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:46:33,614 epoch 5 - iter 78/138 - loss 0.05335137 - time (sec): 4.71 - samples/sec: 2797.57 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:46:34,309 epoch 5 - iter 91/138 - loss 0.05219670 - time (sec): 5.40 - samples/sec: 2758.25 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:46:35,077 epoch 5 - iter 104/138 - loss 0.05411803 - time (sec): 6.17 - samples/sec: 2771.03 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:46:35,826 epoch 5 - iter 117/138 - loss 0.05510158 - time (sec): 6.92 - samples/sec: 2777.94 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:46:36,545 epoch 5 - iter 130/138 - loss 0.05336093 - time (sec): 7.64 - samples/sec: 2798.85 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:46:37,003 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:37,003 EPOCH 5 done: loss 0.0538 - lr: 0.000017
2023-10-13 08:46:37,711 DEV : loss 0.14850230515003204 - f1-score (micro avg) 0.8528
2023-10-13 08:46:37,717 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:38,500 epoch 6 - iter 13/138 - loss 0.00779610 - time (sec): 0.78 - samples/sec: 2672.02 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:46:39,310 epoch 6 - iter 26/138 - loss 0.02041797 - time (sec): 1.59 - samples/sec: 2769.61 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:46:40,119 epoch 6 - iter 39/138 - loss 0.03794121 - time (sec): 2.40 - samples/sec: 2781.08 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:46:40,846 epoch 6 - iter 52/138 - loss 0.03830060 - time (sec): 3.13 - samples/sec: 2742.92 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:46:41,612 epoch 6 - iter 65/138 - loss 0.03985147 - time (sec): 3.89 - samples/sec: 2764.55 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:46:42,415 epoch 6 - iter 78/138 - loss 0.03508542 - time (sec): 4.70 - samples/sec: 2757.98 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:46:43,244 epoch 6 - iter 91/138 - loss 0.03779644 - time (sec): 5.53 - samples/sec: 2755.05 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:46:43,983 epoch 6 - iter 104/138 - loss 0.03970915 - time (sec): 6.26 - samples/sec: 2752.57 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:46:44,755 epoch 6 - iter 117/138 - loss 0.04009455 - time (sec): 7.04 - samples/sec: 2786.24 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:46:45,551 epoch 6 - iter 130/138 - loss 0.03795431 - time (sec): 7.83 - samples/sec: 2774.63 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:46:46,004 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:46,004 EPOCH 6 done: loss 0.0367 - lr: 0.000014
2023-10-13 08:46:46,658 DEV : loss 0.15156938135623932 - f1-score (micro avg) 0.8659
2023-10-13 08:46:46,663 saving best model
2023-10-13 08:46:47,138 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:47,922 epoch 7 - iter 13/138 - loss 0.03061480 - time (sec): 0.78 - samples/sec: 2343.11 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:46:48,694 epoch 7 - iter 26/138 - loss 0.04366932 - time (sec): 1.55 - samples/sec: 2524.16 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:46:49,455 epoch 7 - iter 39/138 - loss 0.04163857 - time (sec): 2.32 - samples/sec: 2702.38 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:46:50,232 epoch 7 - iter 52/138 - loss 0.03973254 - time (sec): 3.09 - samples/sec: 2749.00 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:46:51,024 epoch 7 - iter 65/138 - loss 0.03769308 - time (sec): 3.88 - samples/sec: 2748.30 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:46:51,736 epoch 7 - iter 78/138 - loss 0.03971687 - time (sec): 4.60 - samples/sec: 2783.51 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:46:52,530 epoch 7 - iter 91/138 - loss 0.03645958 - time (sec): 5.39 - samples/sec: 2796.20 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:46:53,342 epoch 7 - iter 104/138 - loss 0.03421300 - time (sec): 6.20 - samples/sec: 2810.30 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:46:54,134 epoch 7 - iter 117/138 - loss 0.03374440 - time (sec): 6.99 - samples/sec: 2792.50 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:46:54,858 epoch 7 - iter 130/138 - loss 0.03172062 - time (sec): 7.72 - samples/sec: 2788.64 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:46:55,303 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:55,303 EPOCH 7 done: loss 0.0313 - lr: 0.000010
2023-10-13 08:46:55,993 DEV : loss 0.16039469838142395 - f1-score (micro avg) 0.8625
2023-10-13 08:46:55,999 ----------------------------------------------------------------------------------------------------
2023-10-13 08:46:56,743 epoch 8 - iter 13/138 - loss 0.01939940 - time (sec): 0.74 - samples/sec: 2741.27 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:46:57,483 epoch 8 - iter 26/138 - loss 0.01610771 - time (sec): 1.48 - samples/sec: 2759.33 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:46:58,298 epoch 8 - iter 39/138 - loss 0.01296939 - time (sec): 2.30 - samples/sec: 2663.91 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:46:59,109 epoch 8 - iter 52/138 - loss 0.02739881 - time (sec): 3.11 - samples/sec: 2715.86 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:46:59,851 epoch 8 - iter 65/138 - loss 0.02796483 - time (sec): 3.85 - samples/sec: 2723.50 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:47:00,614 epoch 8 - iter 78/138 - loss 0.02826291 - time (sec): 4.61 - samples/sec: 2716.07 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:47:01,366 epoch 8 - iter 91/138 - loss 0.02654856 - time (sec): 5.37 - samples/sec: 2751.15 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:47:02,204 epoch 8 - iter 104/138 - loss 0.02640491 - time (sec): 6.20 - samples/sec: 2753.61 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:47:03,043 epoch 8 - iter 117/138 - loss 0.02653671 - time (sec): 7.04 - samples/sec: 2741.03 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:47:03,755 epoch 8 - iter 130/138 - loss 0.02503400 - time (sec): 7.75 - samples/sec: 2763.58 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:47:04,226 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:04,227 EPOCH 8 done: loss 0.0272 - lr: 0.000007
2023-10-13 08:47:04,908 DEV : loss 0.15766263008117676 - f1-score (micro avg) 0.8729
2023-10-13 08:47:04,915 saving best model
2023-10-13 08:47:05,502 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:06,200 epoch 9 - iter 13/138 - loss 0.00078101 - time (sec): 0.69 - samples/sec: 2794.47 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:47:06,892 epoch 9 - iter 26/138 - loss 0.03214127 - time (sec): 1.39 - samples/sec: 2858.04 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:47:07,612 epoch 9 - iter 39/138 - loss 0.03065369 - time (sec): 2.11 - samples/sec: 2917.93 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:47:08,381 epoch 9 - iter 52/138 - loss 0.03039047 - time (sec): 2.88 - samples/sec: 2938.00 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:47:09,082 epoch 9 - iter 65/138 - loss 0.02685816 - time (sec): 3.58 - samples/sec: 2979.84 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:47:09,879 epoch 9 - iter 78/138 - loss 0.02302219 - time (sec): 4.37 - samples/sec: 2943.70 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:47:10,625 epoch 9 - iter 91/138 - loss 0.02039722 - time (sec): 5.12 - samples/sec: 2898.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:47:11,413 epoch 9 - iter 104/138 - loss 0.02021704 - time (sec): 5.91 - samples/sec: 2902.62 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:47:12,361 epoch 9 - iter 117/138 - loss 0.02018099 - time (sec): 6.86 - samples/sec: 2834.51 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:47:13,136 epoch 9 - iter 130/138 - loss 0.02039863 - time (sec): 7.63 - samples/sec: 2829.20 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:47:13,624 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:13,624 EPOCH 9 done: loss 0.0204 - lr: 0.000004
2023-10-13 08:47:14,284 DEV : loss 0.1462574005126953 - f1-score (micro avg) 0.8932
2023-10-13 08:47:14,289 saving best model
2023-10-13 08:47:14,747 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:15,508 epoch 10 - iter 13/138 - loss 0.00149311 - time (sec): 0.75 - samples/sec: 2621.90 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:47:16,285 epoch 10 - iter 26/138 - loss 0.01028203 - time (sec): 1.53 - samples/sec: 2753.93 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:47:17,047 epoch 10 - iter 39/138 - loss 0.01276969 - time (sec): 2.29 - samples/sec: 2737.87 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:47:17,814 epoch 10 - iter 52/138 - loss 0.01643193 - time (sec): 3.06 - samples/sec: 2749.39 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:47:18,611 epoch 10 - iter 65/138 - loss 0.01821951 - time (sec): 3.86 - samples/sec: 2757.10 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:47:19,411 epoch 10 - iter 78/138 - loss 0.01906226 - time (sec): 4.66 - samples/sec: 2765.54 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:47:20,157 epoch 10 - iter 91/138 - loss 0.01808510 - time (sec): 5.40 - samples/sec: 2788.91 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:47:20,927 epoch 10 - iter 104/138 - loss 0.02012840 - time (sec): 6.17 - samples/sec: 2783.85 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:47:21,683 epoch 10 - iter 117/138 - loss 0.01893462 - time (sec): 6.93 - samples/sec: 2784.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:47:22,501 epoch 10 - iter 130/138 - loss 0.01782572 - time (sec): 7.75 - samples/sec: 2779.88 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:47:22,971 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:22,971 EPOCH 10 done: loss 0.0188 - lr: 0.000000
2023-10-13 08:47:23,630 DEV : loss 0.1496874839067459 - f1-score (micro avg) 0.8854
2023-10-13 08:47:23,990 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:23,991 Loading model from best epoch ...
2023-10-13 08:47:25,655 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:47:26,339 Results:
- F-score (micro) 0.9027
- F-score (macro) 0.6411
- Accuracy 0.8365

By class:
              precision    recall  f1-score   support

       scope     0.8757    0.9205    0.8975       176
        pers     0.9528    0.9453    0.9490       128
        work     0.8533    0.8649    0.8591        74
         loc     0.5000    0.5000    0.5000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.8946    0.9110    0.9027       382
   macro avg     0.6364    0.6461    0.6411       382
weighted avg     0.8906    0.9110    0.9005       382

2023-10-13 08:47:26,339 ----------------------------------------------------------------------------------------------------