2023-10-16 22:46:49,399 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Train: 6183 sentences
2023-10-16 22:46:49,400 (train_with_dev=False, train_with_test=False)
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Training Params:
2023-10-16 22:46:49,400 - learning_rate: "5e-05"
2023-10-16 22:46:49,400 - mini_batch_size: "8"
2023-10-16 22:46:49,400 - max_epochs: "10"
2023-10-16 22:46:49,400 - shuffle: "True"
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,400 Plugins:
2023-10-16 22:46:49,400 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 22:46:49,400 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 22:46:49,401 - metric: "('micro avg', 'f1-score')"
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 Computation:
2023-10-16 22:46:49,401 - compute on device: cuda:0
2023-10-16 22:46:49,401 - embedding storage: none
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:49,401 ----------------------------------------------------------------------------------------------------
2023-10-16 22:46:54,117 epoch 1 - iter 77/773 - loss 1.85204228 - time (sec): 4.72 - samples/sec: 2772.74 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:46:58,843 epoch 1 - iter 154/773 - loss 1.09260658 - time (sec): 9.44 - samples/sec: 2663.29 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:47:03,506 epoch 1 - iter 231/773 - loss 0.78200160 - time (sec): 14.10 - samples/sec: 2710.45 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:47:07,811 epoch 1 - iter 308/773 - loss 0.62435683 - time (sec): 18.41 - samples/sec: 2733.35 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:47:12,365 epoch 1 - iter 385/773 - loss 0.52479921 - time (sec): 22.96 - samples/sec: 2728.68 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:47:16,890 epoch 1 - iter 462/773 - loss 0.45944559 - time (sec): 27.49 - samples/sec: 2705.16 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:47:21,384 epoch 1 - iter 539/773 - loss 0.40938835 - time (sec): 31.98 - samples/sec: 2709.25 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:47:25,730 epoch 1 - iter 616/773 - loss 0.37131140 - time (sec): 36.33 - samples/sec: 2724.14 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:47:30,373 epoch 1 - iter 693/773 - loss 0.34140050 - time (sec): 40.97 - samples/sec: 2722.91 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:47:34,837 epoch 1 - iter 770/773 - loss 0.31723684 - time (sec): 45.43 - samples/sec: 2726.82 - lr: 0.000050 - momentum: 0.000000
2023-10-16 22:47:34,988 ----------------------------------------------------------------------------------------------------
2023-10-16 22:47:34,988 EPOCH 1 done: loss 0.3166 - lr: 0.000050
2023-10-16 22:47:37,086 DEV : loss 0.05812298133969307 - f1-score (micro avg) 0.7137
2023-10-16 22:47:37,102 saving best model
2023-10-16 22:47:37,457 ----------------------------------------------------------------------------------------------------
2023-10-16 22:47:41,895 epoch 2 - iter 77/773 - loss 0.09219451 - time (sec): 4.44 - samples/sec: 2766.63 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:47:46,460 epoch 2 - iter 154/773 - loss 0.08570322 - time (sec): 9.00 - samples/sec: 2825.73 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:47:50,870 epoch 2 - iter 231/773 - loss 0.08707063 - time (sec): 13.41 - samples/sec: 2761.19 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:47:55,496 epoch 2 - iter 308/773 - loss 0.08465278 - time (sec): 18.04 - samples/sec: 2736.45 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:47:59,913 epoch 2 - iter 385/773 - loss 0.08603529 - time (sec): 22.45 - samples/sec: 2728.10 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:48:04,824 epoch 2 - iter 462/773 - loss 0.08248188 - time (sec): 27.37 - samples/sec: 2732.13 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:48:09,328 epoch 2 - iter 539/773 - loss 0.08143942 - time (sec): 31.87 - samples/sec: 2723.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:48:13,974 epoch 2 - iter 616/773 - loss 0.08180843 - time (sec): 36.52 - samples/sec: 2708.15 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:48:18,340 epoch 2 - iter 693/773 - loss 0.07966322 - time (sec): 40.88 - samples/sec: 2712.23 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:48:23,004 epoch 2 - iter 770/773 - loss 0.07901639 - time (sec): 45.55 - samples/sec: 2719.62 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:48:23,163 ----------------------------------------------------------------------------------------------------
2023-10-16 22:48:23,163 EPOCH 2 done: loss 0.0788 - lr: 0.000044
2023-10-16 22:48:25,238 DEV : loss 0.04929770901799202 - f1-score (micro avg) 0.7824
2023-10-16 22:48:25,251 saving best model
2023-10-16 22:48:25,719 ----------------------------------------------------------------------------------------------------
2023-10-16 22:48:30,138 epoch 3 - iter 77/773 - loss 0.05283669 - time (sec): 4.42 - samples/sec: 2848.46 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:48:34,806 epoch 3 - iter 154/773 - loss 0.05183408 - time (sec): 9.09 - samples/sec: 2757.45 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:48:39,342 epoch 3 - iter 231/773 - loss 0.05153972 - time (sec): 13.62 - samples/sec: 2737.19 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:48:43,865 epoch 3 - iter 308/773 - loss 0.04919031 - time (sec): 18.14 - samples/sec: 2713.46 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:48:48,279 epoch 3 - iter 385/773 - loss 0.05182767 - time (sec): 22.56 - samples/sec: 2699.47 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:48:52,627 epoch 3 - iter 462/773 - loss 0.05174989 - time (sec): 26.91 - samples/sec: 2694.28 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:48:57,120 epoch 3 - iter 539/773 - loss 0.05137726 - time (sec): 31.40 - samples/sec: 2702.93 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:49:01,721 epoch 3 - iter 616/773 - loss 0.05160722 - time (sec): 36.00 - samples/sec: 2710.49 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:49:06,370 epoch 3 - iter 693/773 - loss 0.05267520 - time (sec): 40.65 - samples/sec: 2713.63 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:49:11,167 epoch 3 - iter 770/773 - loss 0.05271309 - time (sec): 45.45 - samples/sec: 2725.20 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:49:11,328 ----------------------------------------------------------------------------------------------------
2023-10-16 22:49:11,328 EPOCH 3 done: loss 0.0526 - lr: 0.000039
2023-10-16 22:49:13,790 DEV : loss 0.06217540055513382 - f1-score (micro avg) 0.765
2023-10-16 22:49:13,803 ----------------------------------------------------------------------------------------------------
2023-10-16 22:49:18,337 epoch 4 - iter 77/773 - loss 0.03307915 - time (sec): 4.53 - samples/sec: 2597.75 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:49:23,144 epoch 4 - iter 154/773 - loss 0.03502772 - time (sec): 9.34 - samples/sec: 2708.64 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:49:27,501 epoch 4 - iter 231/773 - loss 0.03841144 - time (sec): 13.70 - samples/sec: 2704.05 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:49:31,990 epoch 4 - iter 308/773 - loss 0.03694235 - time (sec): 18.19 - samples/sec: 2696.06 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:49:36,598 epoch 4 - iter 385/773 - loss 0.03453724 - time (sec): 22.79 - samples/sec: 2720.71 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:49:41,295 epoch 4 - iter 462/773 - loss 0.03493765 - time (sec): 27.49 - samples/sec: 2732.97 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:49:45,665 epoch 4 - iter 539/773 - loss 0.03520193 - time (sec): 31.86 - samples/sec: 2712.03 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:49:50,371 epoch 4 - iter 616/773 - loss 0.03666515 - time (sec): 36.57 - samples/sec: 2713.85 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:49:54,733 epoch 4 - iter 693/773 - loss 0.03600062 - time (sec): 40.93 - samples/sec: 2716.65 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:49:59,478 epoch 4 - iter 770/773 - loss 0.03587709 - time (sec): 45.67 - samples/sec: 2713.34 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:49:59,640 ----------------------------------------------------------------------------------------------------
2023-10-16 22:49:59,640 EPOCH 4 done: loss 0.0358 - lr: 0.000033
2023-10-16 22:50:01,833 DEV : loss 0.09437456727027893 - f1-score (micro avg) 0.7548
2023-10-16 22:50:01,848 ----------------------------------------------------------------------------------------------------
2023-10-16 22:50:06,576 epoch 5 - iter 77/773 - loss 0.02673306 - time (sec): 4.73 - samples/sec: 2600.08 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:50:11,310 epoch 5 - iter 154/773 - loss 0.02324661 - time (sec): 9.46 - samples/sec: 2654.40 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:50:15,933 epoch 5 - iter 231/773 - loss 0.02222597 - time (sec): 14.08 - samples/sec: 2662.75 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:50:20,768 epoch 5 - iter 308/773 - loss 0.02255704 - time (sec): 18.92 - samples/sec: 2690.90 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:50:25,371 epoch 5 - iter 385/773 - loss 0.02407859 - time (sec): 23.52 - samples/sec: 2700.14 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:50:29,740 epoch 5 - iter 462/773 - loss 0.02477722 - time (sec): 27.89 - samples/sec: 2727.07 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:50:34,296 epoch 5 - iter 539/773 - loss 0.02515718 - time (sec): 32.45 - samples/sec: 2745.77 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:50:38,637 epoch 5 - iter 616/773 - loss 0.02459067 - time (sec): 36.79 - samples/sec: 2745.62 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:50:42,904 epoch 5 - iter 693/773 - loss 0.02490799 - time (sec): 41.05 - samples/sec: 2740.82 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:50:47,140 epoch 5 - iter 770/773 - loss 0.02419246 - time (sec): 45.29 - samples/sec: 2737.34 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:50:47,290 ----------------------------------------------------------------------------------------------------
2023-10-16 22:50:47,290 EPOCH 5 done: loss 0.0241 - lr: 0.000028
2023-10-16 22:50:49,400 DEV : loss 0.10803093016147614 - f1-score (micro avg) 0.7821
2023-10-16 22:50:49,414 ----------------------------------------------------------------------------------------------------
2023-10-16 22:50:53,937 epoch 6 - iter 77/773 - loss 0.01084402 - time (sec): 4.52 - samples/sec: 2709.74 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:50:58,588 epoch 6 - iter 154/773 - loss 0.01565568 - time (sec): 9.17 - samples/sec: 2682.37 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:51:03,088 epoch 6 - iter 231/773 - loss 0.01799942 - time (sec): 13.67 - samples/sec: 2671.24 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:51:07,640 epoch 6 - iter 308/773 - loss 0.01771408 - time (sec): 18.22 - samples/sec: 2707.41 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:51:11,960 epoch 6 - iter 385/773 - loss 0.01871910 - time (sec): 22.54 - samples/sec: 2739.49 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:51:16,757 epoch 6 - iter 462/773 - loss 0.01928403 - time (sec): 27.34 - samples/sec: 2747.99 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:51:21,399 epoch 6 - iter 539/773 - loss 0.01888480 - time (sec): 31.98 - samples/sec: 2721.82 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:51:25,908 epoch 6 - iter 616/773 - loss 0.02050580 - time (sec): 36.49 - samples/sec: 2709.59 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:51:30,516 epoch 6 - iter 693/773 - loss 0.01943211 - time (sec): 41.10 - samples/sec: 2701.94 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:51:35,177 epoch 6 - iter 770/773 - loss 0.01958240 - time (sec): 45.76 - samples/sec: 2708.62 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:51:35,336 ----------------------------------------------------------------------------------------------------
2023-10-16 22:51:35,337 EPOCH 6 done: loss 0.0196 - lr: 0.000022
2023-10-16 22:51:37,400 DEV : loss 0.1094818264245987 - f1-score (micro avg) 0.7815
2023-10-16 22:51:37,415 ----------------------------------------------------------------------------------------------------
2023-10-16 22:51:41,880 epoch 7 - iter 77/773 - loss 0.00528052 - time (sec): 4.46 - samples/sec: 2697.30 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:51:46,322 epoch 7 - iter 154/773 - loss 0.00899972 - time (sec): 8.91 - samples/sec: 2684.24 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:51:50,741 epoch 7 - iter 231/773 - loss 0.01093330 - time (sec): 13.33 - samples/sec: 2729.67 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:51:55,196 epoch 7 - iter 308/773 - loss 0.01394532 - time (sec): 17.78 - samples/sec: 2733.90 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:51:59,811 epoch 7 - iter 385/773 - loss 0.01538091 - time (sec): 22.39 - samples/sec: 2741.05 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:52:04,408 epoch 7 - iter 462/773 - loss 0.01469576 - time (sec): 26.99 - samples/sec: 2754.87 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:52:09,001 epoch 7 - iter 539/773 - loss 0.01528889 - time (sec): 31.58 - samples/sec: 2752.79 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:52:13,667 epoch 7 - iter 616/773 - loss 0.01471401 - time (sec): 36.25 - samples/sec: 2731.80 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:52:18,014 epoch 7 - iter 693/773 - loss 0.01434424 - time (sec): 40.60 - samples/sec: 2741.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:52:22,595 epoch 7 - iter 770/773 - loss 0.01514214 - time (sec): 45.18 - samples/sec: 2744.16 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:52:22,749 ----------------------------------------------------------------------------------------------------
2023-10-16 22:52:22,749 EPOCH 7 done: loss 0.0151 - lr: 0.000017
2023-10-16 22:52:24,858 DEV : loss 0.10991214215755463 - f1-score (micro avg) 0.7794
2023-10-16 22:52:24,871 ----------------------------------------------------------------------------------------------------
2023-10-16 22:52:29,258 epoch 8 - iter 77/773 - loss 0.00333898 - time (sec): 4.39 - samples/sec: 2595.49 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:52:34,059 epoch 8 - iter 154/773 - loss 0.00641512 - time (sec): 9.19 - samples/sec: 2693.65 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:52:39,164 epoch 8 - iter 231/773 - loss 0.00625228 - time (sec): 14.29 - samples/sec: 2647.99 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:52:43,962 epoch 8 - iter 308/773 - loss 0.00659826 - time (sec): 19.09 - samples/sec: 2672.30 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:52:48,443 epoch 8 - iter 385/773 - loss 0.00711413 - time (sec): 23.57 - samples/sec: 2675.76 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:52:52,773 epoch 8 - iter 462/773 - loss 0.00795173 - time (sec): 27.90 - samples/sec: 2677.65 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:52:57,154 epoch 8 - iter 539/773 - loss 0.00841455 - time (sec): 32.28 - samples/sec: 2698.48 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:53:01,874 epoch 8 - iter 616/773 - loss 0.00849543 - time (sec): 37.00 - samples/sec: 2695.17 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:53:06,448 epoch 8 - iter 693/773 - loss 0.00897091 - time (sec): 41.58 - samples/sec: 2698.10 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:53:10,765 epoch 8 - iter 770/773 - loss 0.00880887 - time (sec): 45.89 - samples/sec: 2696.82 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:53:10,931 ----------------------------------------------------------------------------------------------------
2023-10-16 22:53:10,931 EPOCH 8 done: loss 0.0088 - lr: 0.000011
2023-10-16 22:53:13,015 DEV : loss 0.11286821216344833 - f1-score (micro avg) 0.7934
2023-10-16 22:53:13,029 saving best model
2023-10-16 22:53:13,480 ----------------------------------------------------------------------------------------------------
2023-10-16 22:53:18,090 epoch 9 - iter 77/773 - loss 0.01038545 - time (sec): 4.60 - samples/sec: 2556.72 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:53:22,785 epoch 9 - iter 154/773 - loss 0.00653013 - time (sec): 9.29 - samples/sec: 2558.56 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:53:27,393 epoch 9 - iter 231/773 - loss 0.00606729 - time (sec): 13.90 - samples/sec: 2674.63 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:53:31,904 epoch 9 - iter 308/773 - loss 0.00657523 - time (sec): 18.41 - samples/sec: 2669.14 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:53:36,500 epoch 9 - iter 385/773 - loss 0.00675286 - time (sec): 23.01 - samples/sec: 2703.69 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:53:40,928 epoch 9 - iter 462/773 - loss 0.00630020 - time (sec): 27.44 - samples/sec: 2707.69 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:53:45,461 epoch 9 - iter 539/773 - loss 0.00593099 - time (sec): 31.97 - samples/sec: 2721.81 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:53:49,850 epoch 9 - iter 616/773 - loss 0.00631036 - time (sec): 36.36 - samples/sec: 2723.02 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:53:54,199 epoch 9 - iter 693/773 - loss 0.00613278 - time (sec): 40.71 - samples/sec: 2731.95 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:53:58,725 epoch 9 - iter 770/773 - loss 0.00617994 - time (sec): 45.23 - samples/sec: 2740.57 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:53:58,874 ----------------------------------------------------------------------------------------------------
2023-10-16 22:53:58,874 EPOCH 9 done: loss 0.0062 - lr: 0.000006
2023-10-16 22:54:00,998 DEV : loss 0.11028449237346649 - f1-score (micro avg) 0.8092
2023-10-16 22:54:01,011 saving best model
2023-10-16 22:54:01,454 ----------------------------------------------------------------------------------------------------
2023-10-16 22:54:05,936 epoch 10 - iter 77/773 - loss 0.00135028 - time (sec): 4.48 - samples/sec: 2727.49 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:54:10,426 epoch 10 - iter 154/773 - loss 0.00310208 - time (sec): 8.97 - samples/sec: 2771.61 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:54:14,892 epoch 10 - iter 231/773 - loss 0.00391362 - time (sec): 13.43 - samples/sec: 2768.89 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:54:19,512 epoch 10 - iter 308/773 - loss 0.00329852 - time (sec): 18.05 - samples/sec: 2754.06 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:54:23,936 epoch 10 - iter 385/773 - loss 0.00380931 - time (sec): 22.48 - samples/sec: 2768.94 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:54:28,526 epoch 10 - iter 462/773 - loss 0.00368967 - time (sec): 27.07 - samples/sec: 2765.33 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:54:33,179 epoch 10 - iter 539/773 - loss 0.00347435 - time (sec): 31.72 - samples/sec: 2740.59 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:54:37,718 epoch 10 - iter 616/773 - loss 0.00375376 - time (sec): 36.26 - samples/sec: 2740.49 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:54:42,218 epoch 10 - iter 693/773 - loss 0.00367638 - time (sec): 40.76 - samples/sec: 2737.65 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:54:46,786 epoch 10 - iter 770/773 - loss 0.00361332 - time (sec): 45.33 - samples/sec: 2734.62 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:54:46,935 ----------------------------------------------------------------------------------------------------
2023-10-16 22:54:46,935 EPOCH 10 done: loss 0.0036 - lr: 0.000000
2023-10-16 22:54:49,087 DEV : loss 0.11204110831022263 - f1-score (micro avg) 0.7975
2023-10-16 22:54:49,470 ----------------------------------------------------------------------------------------------------
2023-10-16 22:54:49,471 Loading model from best epoch ...
2023-10-16 22:54:51,168 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 22:54:57,525
Results:
- F-score (micro) 0.8227
- F-score (macro) 0.7472
- Accuracy 0.7223
By class:
              precision    recall  f1-score   support

         LOC     0.8625    0.8552    0.8588       946
    BUILDING     0.6836    0.6541    0.6685       185
      STREET     0.7143    0.7143    0.7143        56

   micro avg     0.8284    0.8172    0.8227      1187
   macro avg     0.7535    0.7412    0.7472      1187
weighted avg     0.8276    0.8172    0.8223      1187
2023-10-16 22:54:57,525 ----------------------------------------------------------------------------------------------------