2023-10-16 22:46:49,399 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Train: 6183 sentences
2023-10-16 22:46:49,400 (train_with_dev=False, train_with_test=False)
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Training Params:
2023-10-16 22:46:49,400  - learning_rate: "5e-05"
2023-10-16 22:46:49,400  - mini_batch_size: "8"
2023-10-16 22:46:49,400  - max_epochs: "10"
2023-10-16 22:46:49,400  - shuffle: "True"
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Plugins:
2023-10-16 22:46:49,400  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 22:46:49,401  - metric: "('micro avg', 'f1-score')"
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 Computation:
2023-10-16 22:46:49,401  - compute on device: cuda:0
2023-10-16 22:46:49,401  - embedding storage: none
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:54,117 epoch 1 - iter 77/773 - loss 1.85204228 - time (sec): 4.72 - samples/sec: 2772.74 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:46:58,843 epoch 1 - iter 154/773 - loss 1.09260658 - time (sec): 9.44 - samples/sec: 2663.29 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:47:03,506 epoch 1 - iter 231/773 - loss 0.78200160 - time (sec): 14.10 - samples/sec: 2710.45 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:47:07,811 epoch 1 - iter 308/773 - loss 0.62435683 - time (sec): 18.41 - samples/sec: 2733.35 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:47:12,365 epoch 1 - iter 385/773 - loss 0.52479921 - time (sec): 22.96 - samples/sec: 2728.68 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:47:16,890 epoch 1 - iter 462/773 - loss 0.45944559 - time (sec): 27.49 - samples/sec: 2705.16 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:47:21,384 epoch 1 - iter 539/773 - loss 0.40938835 - time (sec): 31.98 - samples/sec: 2709.25 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:47:25,730 epoch 1 - iter 616/773 - loss 0.37131140 - time (sec): 36.33 - samples/sec: 2724.14 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:47:30,373 epoch 1 - iter 693/773 - loss 0.34140050 - time (sec): 40.97 - samples/sec: 2722.91 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:47:34,837 epoch 1 - iter 770/773 - loss 0.31723684 - time (sec): 45.43 - samples/sec: 2726.82 - lr: 0.000050 - momentum: 0.000000
2023-10-16 22:47:34,988 ----------------------------------------------------------------------------------------------------
2023-10-16 22:47:34,988 EPOCH 1 done: loss 0.3166 - lr: 0.000050
2023-10-16 22:47:37,086 DEV : loss 0.05812298133969307 - f1-score (micro avg) 0.7137
2023-10-16 22:47:37,102 saving best model
2023-10-16 22:47:37,457 ----------------------------------------------------------------------------------------------------
2023-10-16 22:47:41,895 epoch 2 - iter 77/773 - loss 0.09219451 - time (sec): 4.44 - samples/sec: 2766.63 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:47:46,460 epoch 2 - iter 154/773 - loss 0.08570322 - time (sec): 9.00 - samples/sec: 2825.73 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:47:50,870 epoch 2 - iter 231/773 - loss 0.08707063 - time (sec): 13.41 - samples/sec: 2761.19 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:47:55,496 epoch 2 - iter 308/773 - loss 0.08465278 - time (sec): 18.04 - samples/sec: 2736.45 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:47:59,913 epoch 2 - iter 385/773 - loss 0.08603529 - time (sec): 22.45 - samples/sec: 2728.10 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:48:04,824 epoch 2 - iter 462/773 - loss 0.08248188 - time (sec): 27.37 - samples/sec: 2732.13 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:48:09,328 epoch 2 - iter 539/773 - loss 0.08143942 - time (sec): 31.87 - samples/sec: 2723.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:48:13,974 epoch 2 - iter 616/773 - loss 0.08180843 - time (sec): 36.52 - samples/sec: 2708.15 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:48:18,340 epoch 2 - iter 693/773 - loss 0.07966322 - time (sec): 40.88 - samples/sec: 2712.23 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:48:23,004 epoch 2 - iter 770/773 - loss 0.07901639 - time (sec): 45.55 - samples/sec: 2719.62 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:48:23,163 ----------------------------------------------------------------------------------------------------
2023-10-16 22:48:23,163 EPOCH 2 done: loss 0.0788 - lr: 0.000044
2023-10-16 22:48:25,238 DEV : loss 0.04929770901799202 - f1-score (micro avg) 0.7824
2023-10-16 22:48:25,251 saving best model
2023-10-16 22:48:25,719 ----------------------------------------------------------------------------------------------------
2023-10-16 22:48:30,138 epoch 3 - iter 77/773 - loss 0.05283669 - time (sec): 4.42 - samples/sec: 2848.46 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:48:34,806 epoch 3 - iter 154/773 - loss 0.05183408 - time (sec): 9.09 - samples/sec: 2757.45 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:48:39,342 epoch 3 - iter 231/773 - loss 0.05153972 - time (sec): 13.62 - samples/sec: 2737.19 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:48:43,865 epoch 3 - iter 308/773 - loss 0.04919031 - time (sec): 18.14 - samples/sec: 2713.46 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:48:48,279 epoch 3 - iter 385/773 - loss 0.05182767 - time (sec): 22.56 - samples/sec: 2699.47 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:48:52,627 epoch 3 - iter 462/773 - loss 0.05174989 - time (sec): 26.91 - samples/sec: 2694.28 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:48:57,120 epoch 3 - iter 539/773 - loss 0.05137726 - time (sec): 31.40 - samples/sec: 2702.93 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:49:01,721 epoch 3 - iter 616/773 - loss 0.05160722 - time (sec): 36.00 - samples/sec: 2710.49 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:49:06,370 epoch 3 - iter 693/773 - loss 0.05267520 - time (sec): 40.65 - samples/sec: 2713.63 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:49:11,167 epoch 3 - iter 770/773 - loss 0.05271309 - time (sec): 45.45 - samples/sec: 2725.20 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:49:11,328 ----------------------------------------------------------------------------------------------------
2023-10-16 22:49:11,328 EPOCH 3 done: loss 0.0526 - lr: 0.000039
2023-10-16 22:49:13,790 DEV : loss 0.06217540055513382 - f1-score (micro avg) 0.765
2023-10-16 22:49:13,803 ----------------------------------------------------------------------------------------------------
2023-10-16 22:49:18,337 epoch 4 - iter 77/773 - loss 0.03307915 - time (sec): 4.53 - samples/sec: 2597.75 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:49:23,144 epoch 4 - iter 154/773 - loss 0.03502772 - time (sec): 9.34 - samples/sec: 2708.64 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:49:27,501 epoch 4 - iter 231/773 - loss 0.03841144 - time (sec): 13.70 - samples/sec: 2704.05 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:49:31,990 epoch 4 - iter 308/773 - loss 0.03694235 - time (sec): 18.19 - samples/sec: 2696.06 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:49:36,598 epoch 4 - iter 385/773 - loss 0.03453724 - time (sec): 22.79 - samples/sec: 2720.71 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:49:41,295 epoch 4 - iter 462/773 - loss 0.03493765 - time (sec): 27.49 - samples/sec: 2732.97 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:49:45,665 epoch 4 - iter 539/773 - loss 0.03520193 - time (sec): 31.86 - samples/sec: 2712.03 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:49:50,371 epoch 4 - iter 616/773 - loss 0.03666515 - time (sec): 36.57 - samples/sec: 2713.85 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:49:54,733 epoch 4 - iter 693/773 - loss 0.03600062 - time (sec): 40.93 - samples/sec: 2716.65 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:49:59,478 epoch 4 - iter 770/773 - loss 0.03587709 - time (sec): 45.67 - samples/sec: 2713.34 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:49:59,640 ----------------------------------------------------------------------------------------------------
2023-10-16 22:49:59,640 EPOCH 4 done: loss 0.0358 - lr: 0.000033
2023-10-16 22:50:01,833 DEV : loss 0.09437456727027893 - f1-score (micro avg) 0.7548
2023-10-16 22:50:01,848 ----------------------------------------------------------------------------------------------------
2023-10-16 22:50:06,576 epoch 5 - iter 77/773 - loss 0.02673306 - time (sec): 4.73 - samples/sec: 2600.08 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:50:11,310 epoch 5 - iter 154/773 - loss 0.02324661 - time (sec): 9.46 - samples/sec: 2654.40 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:50:15,933 epoch 5 - iter 231/773 - loss 0.02222597 - time (sec): 14.08 - samples/sec: 2662.75 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:50:20,768 epoch 5 - iter 308/773 - loss 0.02255704 - time (sec): 18.92 - samples/sec: 2690.90 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:50:25,371 epoch 5 - iter 385/773 - loss 0.02407859 - time (sec): 23.52 - samples/sec: 2700.14 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:50:29,740 epoch 5 - iter 462/773 - loss 0.02477722 - time (sec): 27.89 - samples/sec: 2727.07 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:50:34,296 epoch 5 - iter 539/773 - loss 0.02515718 - time (sec): 32.45 - samples/sec: 2745.77 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:50:38,637 epoch 5 - iter 616/773 - loss 0.02459067 - time (sec): 36.79 - samples/sec: 2745.62 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:50:42,904 epoch 5 - iter 693/773 - loss 0.02490799 - time (sec): 41.05 - samples/sec: 2740.82 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:50:47,140 epoch 5 - iter 770/773 - loss 0.02419246 - time (sec): 45.29 - samples/sec: 2737.34 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:50:47,290 ----------------------------------------------------------------------------------------------------
2023-10-16 22:50:47,290 EPOCH 5 done: loss 0.0241 - lr: 0.000028
2023-10-16 22:50:49,400 DEV : loss 0.10803093016147614 - f1-score (micro avg) 0.7821
2023-10-16 22:50:49,414 ----------------------------------------------------------------------------------------------------
2023-10-16 22:50:53,937 epoch 6 - iter 77/773 - loss 0.01084402 - time (sec): 4.52 - samples/sec: 2709.74 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:50:58,588 epoch 6 - iter 154/773 - loss 0.01565568 - time (sec): 9.17 - samples/sec: 2682.37 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:51:03,088 epoch 6 - iter 231/773 - loss 0.01799942 - time (sec): 13.67 - samples/sec: 2671.24 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:51:07,640 epoch 6 - iter 308/773 - loss 0.01771408 - time (sec): 18.22 - samples/sec: 2707.41 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:51:11,960 epoch 6 - iter 385/773 - loss 0.01871910 - time (sec): 22.54 - samples/sec: 2739.49 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:51:16,757 epoch 6 - iter 462/773 - loss 0.01928403 - time (sec): 27.34 - samples/sec: 2747.99 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:51:21,399 epoch 6 - iter 539/773 - loss 0.01888480 - time (sec): 31.98 - samples/sec: 2721.82 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:51:25,908 epoch 6 - iter 616/773 - loss 0.02050580 - time (sec): 36.49 - samples/sec: 2709.59 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:51:30,516 epoch 6 - iter 693/773 - loss 0.01943211 - time (sec): 41.10 - samples/sec: 2701.94 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:51:35,177 epoch 6 - iter 770/773 - loss 0.01958240 - time (sec): 45.76 - samples/sec: 2708.62 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:51:35,336 ----------------------------------------------------------------------------------------------------
2023-10-16 22:51:35,337 EPOCH 6 done: loss 0.0196 - lr: 0.000022
2023-10-16 22:51:37,400 DEV : loss 0.1094818264245987 - f1-score (micro avg) 0.7815
2023-10-16 22:51:37,415 ----------------------------------------------------------------------------------------------------
2023-10-16 22:51:41,880 epoch 7 - iter 77/773 - loss 0.00528052 - time (sec): 4.46 - samples/sec: 2697.30 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:51:46,322 epoch 7 - iter 154/773 - loss 0.00899972 - time (sec): 8.91 - samples/sec: 2684.24 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:51:50,741 epoch 7 - iter 231/773 - loss 0.01093330 - time (sec): 13.33 - samples/sec: 2729.67 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:51:55,196 epoch 7 - iter 308/773 - loss 0.01394532 - time (sec): 17.78 - samples/sec: 2733.90 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:51:59,811 epoch 7 - iter 385/773 - loss 0.01538091 - time (sec): 22.39 - samples/sec: 2741.05 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:52:04,408 epoch 7 - iter 462/773 - loss 0.01469576 - time (sec): 26.99 - samples/sec: 2754.87 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:52:09,001 epoch 7 - iter 539/773 - loss 0.01528889 - time (sec): 31.58 - samples/sec: 2752.79 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:52:13,667 epoch 7 - iter 616/773 - loss 0.01471401 - time (sec): 36.25 - samples/sec: 2731.80 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:52:18,014 epoch 7 - iter 693/773 - loss 0.01434424 - time (sec): 40.60 - samples/sec: 2741.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:52:22,595 epoch 7 - iter 770/773 - loss 0.01514214 - time (sec): 45.18 - samples/sec: 2744.16 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:52:22,749 ----------------------------------------------------------------------------------------------------
2023-10-16 22:52:22,749 EPOCH 7 done: loss 0.0151 - lr: 0.000017
2023-10-16 22:52:24,858 DEV : loss 0.10991214215755463 - f1-score (micro avg) 0.7794
2023-10-16 22:52:24,871 ----------------------------------------------------------------------------------------------------
2023-10-16 22:52:29,258 epoch 8 - iter 77/773 - loss 0.00333898 - time (sec): 4.39 - samples/sec: 2595.49 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:52:34,059 epoch 8 - iter 154/773 - loss 0.00641512 - time (sec): 9.19 - samples/sec: 2693.65 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:52:39,164 epoch 8 - iter 231/773 - loss 0.00625228 - time (sec): 14.29 - samples/sec: 2647.99 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:52:43,962 epoch 8 - iter 308/773 - loss 0.00659826 - time (sec): 19.09 - samples/sec: 2672.30 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:52:48,443 epoch 8 - iter 385/773 - loss 0.00711413 - time (sec): 23.57 - samples/sec: 2675.76 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:52:52,773 epoch 8 - iter 462/773 - loss 0.00795173 - time (sec): 27.90 - samples/sec: 2677.65 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:52:57,154 epoch 8 - iter 539/773 - loss 0.00841455 - time (sec): 32.28 - samples/sec: 2698.48 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:53:01,874 epoch 8 - iter 616/773 - loss 0.00849543 - time (sec): 37.00 - samples/sec: 2695.17 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:53:06,448 epoch 8 - iter 693/773 - loss 0.00897091 - time (sec): 41.58 - samples/sec: 2698.10 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:53:10,765 epoch 8 - iter 770/773 - loss 0.00880887 - time (sec): 45.89 - samples/sec: 2696.82 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:53:10,931 ----------------------------------------------------------------------------------------------------
2023-10-16 22:53:10,931 EPOCH 8 done: loss 0.0088 - lr: 0.000011
2023-10-16 22:53:13,015 DEV : loss 0.11286821216344833 - f1-score (micro avg) 0.7934
2023-10-16 22:53:13,029 saving best model
2023-10-16 22:53:13,480 ----------------------------------------------------------------------------------------------------
2023-10-16 22:53:18,090 epoch 9 - iter 77/773 - loss 0.01038545 - time (sec): 4.60 - samples/sec: 2556.72 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:53:22,785 epoch 9 - iter 154/773 - loss 0.00653013 - time (sec): 9.29 - samples/sec: 2558.56 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:53:27,393 epoch 9 - iter 231/773 - loss 0.00606729 - time (sec): 13.90 - samples/sec: 2674.63 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:53:31,904 epoch 9 - iter 308/773 - loss 0.00657523 - time (sec): 18.41 - samples/sec: 2669.14 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:53:36,500 epoch 9 - iter 385/773 - loss 0.00675286 - time (sec): 23.01 - samples/sec: 2703.69 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:53:40,928 epoch 9 - iter 462/773 - loss 0.00630020 - time (sec): 27.44 - samples/sec: 2707.69 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:53:45,461 epoch 9 - iter 539/773 - loss 0.00593099 - time (sec): 31.97 - samples/sec: 2721.81 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:53:49,850 epoch 9 - iter 616/773 - loss 0.00631036 - time (sec): 36.36 - samples/sec: 2723.02 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:53:54,199 epoch 9 - iter 693/773 - loss 0.00613278 - time (sec): 40.71 - samples/sec: 2731.95 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:53:58,725 epoch 9 - iter 770/773 - loss 0.00617994 - time (sec): 45.23 - samples/sec: 2740.57 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:53:58,874 ----------------------------------------------------------------------------------------------------
2023-10-16 22:53:58,874 EPOCH 9 done: loss 0.0062 - lr: 0.000006
2023-10-16 22:54:00,998 DEV : loss 0.11028449237346649 - f1-score (micro avg) 0.8092
2023-10-16 22:54:01,011 saving best model
2023-10-16 22:54:01,454 ----------------------------------------------------------------------------------------------------
2023-10-16 22:54:05,936 epoch 10 - iter 77/773 - loss 0.00135028 - time (sec): 4.48 - samples/sec: 2727.49 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:54:10,426 epoch 10 - iter 154/773 - loss 0.00310208 - time (sec): 8.97 - samples/sec: 2771.61 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:54:14,892 epoch 10 - iter 231/773 - loss 0.00391362 - time (sec): 13.43 - samples/sec: 2768.89 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:54:19,512 epoch 10 - iter 308/773 - loss 0.00329852 - time (sec): 18.05 - samples/sec: 2754.06 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:54:23,936 epoch 10 - iter 385/773 - loss 0.00380931 - time (sec): 22.48 - samples/sec: 2768.94 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:54:28,526 epoch 10 - iter 462/773 - loss 0.00368967 - time (sec): 27.07 - samples/sec: 2765.33 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:54:33,179 epoch 10 - iter 539/773 - loss 0.00347435 - time (sec): 31.72 - samples/sec: 2740.59 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:54:37,718 epoch 10 - iter 616/773 - loss 0.00375376 - time (sec): 36.26 - samples/sec: 2740.49 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:54:42,218 epoch 10 - iter 693/773 - loss 0.00367638 - time (sec): 40.76 - samples/sec: 2737.65 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:54:46,786 epoch 10 - iter 770/773 - loss 0.00361332 - time (sec): 45.33 - samples/sec: 2734.62 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:54:46,935 ----------------------------------------------------------------------------------------------------
2023-10-16 22:54:46,935 EPOCH 10 done: loss 0.0036 - lr: 0.000000
2023-10-16 22:54:49,087 DEV : loss 0.11204110831022263 - f1-score (micro avg) 0.7975
2023-10-16 22:54:49,470 ----------------------------------------------------------------------------------------------------
2023-10-16 22:54:49,471 Loading model from best epoch ...
2023-10-16 22:54:51,168 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 22:54:57,525
Results:
- F-score (micro) 0.8227
- F-score (macro) 0.7472
- Accuracy 0.7223

By class:
              precision    recall  f1-score   support

         LOC     0.8625    0.8552    0.8588       946
    BUILDING     0.6836    0.6541    0.6685       185
      STREET     0.7143    0.7143    0.7143        56

   micro avg     0.8284    0.8172    0.8227      1187
   macro avg     0.7535    0.7412    0.7472      1187
weighted avg     0.8276    0.8172    0.8223      1187

2023-10-16 22:54:57,525 ----------------------------------------------------------------------------------------------------