2023-10-14 11:19:51,579 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,580 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 11:19:51,580 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 Train: 5777 sentences
2023-10-14 11:19:51,581 (train_with_dev=False, train_with_test=False)
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 Training Params:
2023-10-14 11:19:51,581 - learning_rate: "3e-05"
2023-10-14 11:19:51,581 - mini_batch_size: "4"
2023-10-14 11:19:51,581 - max_epochs: "10"
2023-10-14 11:19:51,581 - shuffle: "True"
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 Plugins:
2023-10-14 11:19:51,581 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 11:19:51,581 - metric: "('micro avg', 'f1-score')"
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 Computation:
2023-10-14 11:19:51,581 - compute on device: cuda:0
2023-10-14 11:19:51,581 - embedding storage: none
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:51,581 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:58,951 epoch 1 - iter 144/1445 - loss 1.75565661 - time (sec): 7.37 - samples/sec: 2517.03 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:20:06,337 epoch 1 - iter 288/1445 - loss 1.03252900 - time (sec): 14.75 - samples/sec: 2434.39 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:20:13,778 epoch 1 - iter 432/1445 - loss 0.76745105 - time (sec): 22.20 - samples/sec: 2402.87 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:20:21,198 epoch 1 - iter 576/1445 - loss 0.62697231 - time (sec): 29.62 - samples/sec: 2409.91 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:20:28,524 epoch 1 - iter 720/1445 - loss 0.53763328 - time (sec): 36.94 - samples/sec: 2400.64 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:20:35,915 epoch 1 - iter 864/1445 - loss 0.47592185 - time (sec): 44.33 - samples/sec: 2396.57 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:20:43,308 epoch 1 - iter 1008/1445 - loss 0.43188433 - time (sec): 51.73 - samples/sec: 2396.84 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:20:50,476 epoch 1 - iter 1152/1445 - loss 0.39731712 - time (sec): 58.89 - samples/sec: 2381.34 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:20:57,640 epoch 1 - iter 1296/1445 - loss 0.36675744 - time (sec): 66.06 - samples/sec: 2389.98 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:21:04,925 epoch 1 - iter 1440/1445 - loss 0.34335442 - time (sec): 73.34 - samples/sec: 2394.82 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:21:05,157 ----------------------------------------------------------------------------------------------------
2023-10-14 11:21:05,158 EPOCH 1 done: loss 0.3429 - lr: 0.000030
2023-10-14 11:21:08,099 DEV : loss 0.12569260597229004 - f1-score (micro avg) 0.6845
2023-10-14 11:21:08,115 saving best model
2023-10-14 11:21:08,480 ----------------------------------------------------------------------------------------------------
2023-10-14 11:21:15,744 epoch 2 - iter 144/1445 - loss 0.11445694 - time (sec): 7.26 - samples/sec: 2355.69 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:21:22,917 epoch 2 - iter 288/1445 - loss 0.11933495 - time (sec): 14.44 - samples/sec: 2365.17 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:21:30,379 epoch 2 - iter 432/1445 - loss 0.11449909 - time (sec): 21.90 - samples/sec: 2393.79 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:21:37,825 epoch 2 - iter 576/1445 - loss 0.11286041 - time (sec): 29.34 - samples/sec: 2416.56 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:21:45,105 epoch 2 - iter 720/1445 - loss 0.11064148 - time (sec): 36.62 - samples/sec: 2402.08 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:21:52,191 epoch 2 - iter 864/1445 - loss 0.10759736 - time (sec): 43.71 - samples/sec: 2395.87 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:21:59,736 epoch 2 - iter 1008/1445 - loss 0.10670874 - time (sec): 51.25 - samples/sec: 2386.85 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:22:07,129 epoch 2 - iter 1152/1445 - loss 0.10492693 - time (sec): 58.65 - samples/sec: 2390.49 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:22:14,431 epoch 2 - iter 1296/1445 - loss 0.10543454 - time (sec): 65.95 - samples/sec: 2398.76 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:22:21,625 epoch 2 - iter 1440/1445 - loss 0.10438782 - time (sec): 73.14 - samples/sec: 2403.24 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:22:21,848 ----------------------------------------------------------------------------------------------------
2023-10-14 11:22:21,848 EPOCH 2 done: loss 0.1043 - lr: 0.000027
2023-10-14 11:22:25,893 DEV : loss 0.07940532267093658 - f1-score (micro avg) 0.8117
2023-10-14 11:22:25,913 saving best model
2023-10-14 11:22:26,438 ----------------------------------------------------------------------------------------------------
2023-10-14 11:22:33,672 epoch 3 - iter 144/1445 - loss 0.07386524 - time (sec): 7.23 - samples/sec: 2347.39 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:22:40,885 epoch 3 - iter 288/1445 - loss 0.07483343 - time (sec): 14.44 - samples/sec: 2359.93 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:22:48,224 epoch 3 - iter 432/1445 - loss 0.07412461 - time (sec): 21.78 - samples/sec: 2349.80 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:22:55,420 epoch 3 - iter 576/1445 - loss 0.07525880 - time (sec): 28.98 - samples/sec: 2344.61 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:23:02,802 epoch 3 - iter 720/1445 - loss 0.07266541 - time (sec): 36.36 - samples/sec: 2318.85 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:23:10,303 epoch 3 - iter 864/1445 - loss 0.07006827 - time (sec): 43.86 - samples/sec: 2354.04 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:23:17,543 epoch 3 - iter 1008/1445 - loss 0.07173992 - time (sec): 51.10 - samples/sec: 2356.57 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:23:25,013 epoch 3 - iter 1152/1445 - loss 0.07507578 - time (sec): 58.57 - samples/sec: 2375.48 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:23:32,575 epoch 3 - iter 1296/1445 - loss 0.07342122 - time (sec): 66.13 - samples/sec: 2372.37 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:23:40,030 epoch 3 - iter 1440/1445 - loss 0.07228349 - time (sec): 73.59 - samples/sec: 2388.87 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:23:40,254 ----------------------------------------------------------------------------------------------------
2023-10-14 11:23:40,255 EPOCH 3 done: loss 0.0724 - lr: 0.000023
2023-10-14 11:23:43,807 DEV : loss 0.0913621187210083 - f1-score (micro avg) 0.8132
2023-10-14 11:23:43,824 saving best model
2023-10-14 11:23:44,309 ----------------------------------------------------------------------------------------------------
2023-10-14 11:23:51,639 epoch 4 - iter 144/1445 - loss 0.04124556 - time (sec): 7.33 - samples/sec: 2313.25 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:23:59,115 epoch 4 - iter 288/1445 - loss 0.04481502 - time (sec): 14.80 - samples/sec: 2421.60 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:24:06,305 epoch 4 - iter 432/1445 - loss 0.04399488 - time (sec): 21.99 - samples/sec: 2410.15 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:24:14,200 epoch 4 - iter 576/1445 - loss 0.05066883 - time (sec): 29.89 - samples/sec: 2364.56 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:24:21,411 epoch 4 - iter 720/1445 - loss 0.05193458 - time (sec): 37.10 - samples/sec: 2356.32 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:24:28,873 epoch 4 - iter 864/1445 - loss 0.05279794 - time (sec): 44.56 - samples/sec: 2357.40 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:24:36,264 epoch 4 - iter 1008/1445 - loss 0.05274805 - time (sec): 51.95 - samples/sec: 2368.95 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:24:43,432 epoch 4 - iter 1152/1445 - loss 0.05186328 - time (sec): 59.12 - samples/sec: 2363.42 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:24:50,727 epoch 4 - iter 1296/1445 - loss 0.05122179 - time (sec): 66.42 - samples/sec: 2375.97 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:24:58,244 epoch 4 - iter 1440/1445 - loss 0.05165454 - time (sec): 73.93 - samples/sec: 2377.78 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:24:58,467 ----------------------------------------------------------------------------------------------------
2023-10-14 11:24:58,467 EPOCH 4 done: loss 0.0516 - lr: 0.000020
2023-10-14 11:25:02,030 DEV : loss 0.1312413215637207 - f1-score (micro avg) 0.7873
2023-10-14 11:25:02,052 ----------------------------------------------------------------------------------------------------
2023-10-14 11:25:10,200 epoch 5 - iter 144/1445 - loss 0.03172829 - time (sec): 8.15 - samples/sec: 2041.29 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:25:17,599 epoch 5 - iter 288/1445 - loss 0.03587651 - time (sec): 15.54 - samples/sec: 2149.38 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:25:24,807 epoch 5 - iter 432/1445 - loss 0.03852227 - time (sec): 22.75 - samples/sec: 2222.67 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:25:32,180 epoch 5 - iter 576/1445 - loss 0.03797228 - time (sec): 30.13 - samples/sec: 2295.90 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:25:39,666 epoch 5 - iter 720/1445 - loss 0.04007370 - time (sec): 37.61 - samples/sec: 2317.85 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:25:47,202 epoch 5 - iter 864/1445 - loss 0.04030841 - time (sec): 45.15 - samples/sec: 2333.39 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:25:54,950 epoch 5 - iter 1008/1445 - loss 0.03895283 - time (sec): 52.90 - samples/sec: 2346.70 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:26:02,363 epoch 5 - iter 1152/1445 - loss 0.03892191 - time (sec): 60.31 - samples/sec: 2335.88 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:26:10,423 epoch 5 - iter 1296/1445 - loss 0.03776827 - time (sec): 68.37 - samples/sec: 2311.82 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:26:18,250 epoch 5 - iter 1440/1445 - loss 0.03917454 - time (sec): 76.20 - samples/sec: 2301.84 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:26:18,546 ----------------------------------------------------------------------------------------------------
2023-10-14 11:26:18,547 EPOCH 5 done: loss 0.0391 - lr: 0.000017
2023-10-14 11:26:22,555 DEV : loss 0.12906955182552338 - f1-score (micro avg) 0.8056
2023-10-14 11:26:22,571 ----------------------------------------------------------------------------------------------------
2023-10-14 11:26:29,760 epoch 6 - iter 144/1445 - loss 0.02668724 - time (sec): 7.19 - samples/sec: 2352.12 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:26:37,402 epoch 6 - iter 288/1445 - loss 0.03043050 - time (sec): 14.83 - samples/sec: 2308.65 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:26:45,621 epoch 6 - iter 432/1445 - loss 0.03237606 - time (sec): 23.05 - samples/sec: 2260.11 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:26:54,021 epoch 6 - iter 576/1445 - loss 0.03461076 - time (sec): 31.45 - samples/sec: 2236.70 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:27:02,112 epoch 6 - iter 720/1445 - loss 0.03444742 - time (sec): 39.54 - samples/sec: 2263.64 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:27:09,978 epoch 6 - iter 864/1445 - loss 0.03339662 - time (sec): 47.41 - samples/sec: 2253.87 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:27:17,058 epoch 6 - iter 1008/1445 - loss 0.03185918 - time (sec): 54.49 - samples/sec: 2269.09 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:27:24,364 epoch 6 - iter 1152/1445 - loss 0.03003098 - time (sec): 61.79 - samples/sec: 2278.81 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:27:31,815 epoch 6 - iter 1296/1445 - loss 0.03093691 - time (sec): 69.24 - samples/sec: 2285.58 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:27:39,005 epoch 6 - iter 1440/1445 - loss 0.02981435 - time (sec): 76.43 - samples/sec: 2298.80 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:27:39,263 ----------------------------------------------------------------------------------------------------
2023-10-14 11:27:39,263 EPOCH 6 done: loss 0.0297 - lr: 0.000013
2023-10-14 11:27:42,895 DEV : loss 0.15421992540359497 - f1-score (micro avg) 0.8116
2023-10-14 11:27:42,912 ----------------------------------------------------------------------------------------------------
2023-10-14 11:27:51,172 epoch 7 - iter 144/1445 - loss 0.01588193 - time (sec): 8.26 - samples/sec: 2061.32 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:27:59,204 epoch 7 - iter 288/1445 - loss 0.01690062 - time (sec): 16.29 - samples/sec: 2101.42 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:28:07,120 epoch 7 - iter 432/1445 - loss 0.01781960 - time (sec): 24.21 - samples/sec: 2174.74 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:28:15,004 epoch 7 - iter 576/1445 - loss 0.02064476 - time (sec): 32.09 - samples/sec: 2182.17 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:28:22,356 epoch 7 - iter 720/1445 - loss 0.01923653 - time (sec): 39.44 - samples/sec: 2233.50 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:28:29,480 epoch 7 - iter 864/1445 - loss 0.02022570 - time (sec): 46.57 - samples/sec: 2263.41 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:28:36,967 epoch 7 - iter 1008/1445 - loss 0.02153322 - time (sec): 54.05 - samples/sec: 2275.18 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:28:44,349 epoch 7 - iter 1152/1445 - loss 0.02052892 - time (sec): 61.44 - samples/sec: 2288.90 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:28:51,513 epoch 7 - iter 1296/1445 - loss 0.02095327 - time (sec): 68.60 - samples/sec: 2290.67 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:28:59,028 epoch 7 - iter 1440/1445 - loss 0.02075394 - time (sec): 76.11 - samples/sec: 2308.21 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:28:59,294 ----------------------------------------------------------------------------------------------------
2023-10-14 11:28:59,294 EPOCH 7 done: loss 0.0208 - lr: 0.000010
2023-10-14 11:29:02,923 DEV : loss 0.20232853293418884 - f1-score (micro avg) 0.8013
2023-10-14 11:29:02,948 ----------------------------------------------------------------------------------------------------
2023-10-14 11:29:10,235 epoch 8 - iter 144/1445 - loss 0.01686935 - time (sec): 7.29 - samples/sec: 2444.86 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:29:17,403 epoch 8 - iter 288/1445 - loss 0.01651959 - time (sec): 14.45 - samples/sec: 2432.31 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:29:24,873 epoch 8 - iter 432/1445 - loss 0.01702677 - time (sec): 21.92 - samples/sec: 2429.48 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:29:31,981 epoch 8 - iter 576/1445 - loss 0.01781295 - time (sec): 29.03 - samples/sec: 2431.84 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:29:39,239 epoch 8 - iter 720/1445 - loss 0.01906418 - time (sec): 36.29 - samples/sec: 2440.50 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:29:46,360 epoch 8 - iter 864/1445 - loss 0.01854455 - time (sec): 43.41 - samples/sec: 2441.24 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:29:53,542 epoch 8 - iter 1008/1445 - loss 0.01734082 - time (sec): 50.59 - samples/sec: 2418.61 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:30:01,355 epoch 8 - iter 1152/1445 - loss 0.01630921 - time (sec): 58.41 - samples/sec: 2403.18 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:30:09,931 epoch 8 - iter 1296/1445 - loss 0.01692418 - time (sec): 66.98 - samples/sec: 2366.73 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:30:17,349 epoch 8 - iter 1440/1445 - loss 0.01674331 - time (sec): 74.40 - samples/sec: 2358.30 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:30:17,642 ----------------------------------------------------------------------------------------------------
2023-10-14 11:30:17,642 EPOCH 8 done: loss 0.0168 - lr: 0.000007
2023-10-14 11:30:21,838 DEV : loss 0.165278822183609 - f1-score (micro avg) 0.8118
2023-10-14 11:30:21,862 ----------------------------------------------------------------------------------------------------
2023-10-14 11:30:29,160 epoch 9 - iter 144/1445 - loss 0.00480124 - time (sec): 7.30 - samples/sec: 2388.87 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:30:36,224 epoch 9 - iter 288/1445 - loss 0.00564082 - time (sec): 14.36 - samples/sec: 2330.43 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:30:44,078 epoch 9 - iter 432/1445 - loss 0.00787483 - time (sec): 22.21 - samples/sec: 2414.52 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:30:51,098 epoch 9 - iter 576/1445 - loss 0.00799249 - time (sec): 29.24 - samples/sec: 2405.67 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:30:58,424 epoch 9 - iter 720/1445 - loss 0.00802746 - time (sec): 36.56 - samples/sec: 2417.76 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:31:05,652 epoch 9 - iter 864/1445 - loss 0.00927689 - time (sec): 43.79 - samples/sec: 2417.57 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:31:12,868 epoch 9 - iter 1008/1445 - loss 0.00857392 - time (sec): 51.00 - samples/sec: 2418.14 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:31:20,330 epoch 9 - iter 1152/1445 - loss 0.00976295 - time (sec): 58.47 - samples/sec: 2401.39 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:31:27,560 epoch 9 - iter 1296/1445 - loss 0.01000819 - time (sec): 65.70 - samples/sec: 2400.94 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:31:34,843 epoch 9 - iter 1440/1445 - loss 0.01022696 - time (sec): 72.98 - samples/sec: 2404.55 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:31:35,106 ----------------------------------------------------------------------------------------------------
2023-10-14 11:31:35,106 EPOCH 9 done: loss 0.0102 - lr: 0.000003
2023-10-14 11:31:38,643 DEV : loss 0.18068867921829224 - f1-score (micro avg) 0.8122
2023-10-14 11:31:38,665 ----------------------------------------------------------------------------------------------------
2023-10-14 11:31:47,034 epoch 10 - iter 144/1445 - loss 0.00660655 - time (sec): 8.37 - samples/sec: 2163.54 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:31:55,098 epoch 10 - iter 288/1445 - loss 0.00706000 - time (sec): 16.43 - samples/sec: 2169.34 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:32:03,016 epoch 10 - iter 432/1445 - loss 0.00915700 - time (sec): 24.35 - samples/sec: 2188.35 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:32:10,893 epoch 10 - iter 576/1445 - loss 0.00816302 - time (sec): 32.23 - samples/sec: 2228.93 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:32:18,020 epoch 10 - iter 720/1445 - loss 0.00731694 - time (sec): 39.35 - samples/sec: 2268.04 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:32:25,157 epoch 10 - iter 864/1445 - loss 0.00729216 - time (sec): 46.49 - samples/sec: 2297.09 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:32:32,546 epoch 10 - iter 1008/1445 - loss 0.00743674 - time (sec): 53.88 - samples/sec: 2307.13 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:32:39,647 epoch 10 - iter 1152/1445 - loss 0.00774300 - time (sec): 60.98 - samples/sec: 2320.04 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:32:46,774 epoch 10 - iter 1296/1445 - loss 0.00794261 - time (sec): 68.11 - samples/sec: 2320.41 - lr: 0.000000 - momentum: 0.000000
2023-10-14 11:32:53,945 epoch 10 - iter 1440/1445 - loss 0.00789770 - time (sec): 75.28 - samples/sec: 2332.16 - lr: 0.000000 - momentum: 0.000000
2023-10-14 11:32:54,229 ----------------------------------------------------------------------------------------------------
2023-10-14 11:32:54,229 EPOCH 10 done: loss 0.0079 - lr: 0.000000
2023-10-14 11:32:57,810 DEV : loss 0.18952515721321106 - f1-score (micro avg) 0.8098
2023-10-14 11:32:58,296 ----------------------------------------------------------------------------------------------------
2023-10-14 11:32:58,297 Loading model from best epoch ...
2023-10-14 11:33:00,088 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 11:33:03,295 Results:
- F-score (micro) 0.8059
- F-score (macro) 0.7025
- Accuracy 0.6862

By class:
              precision    recall  f1-score   support

         PER     0.8119    0.8237    0.8177       482
         LOC     0.8277    0.8603    0.8437       458
         ORG     0.4754    0.4203    0.4462        69

   micro avg     0.7992    0.8127    0.8059      1009
   macro avg     0.7050    0.7014    0.7025      1009
weighted avg     0.7961    0.8127    0.8041      1009

2023-10-14 11:33:03,296 ----------------------------------------------------------------------------------------------------
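
The summary figures in the final "Results" block can be cross-checked against the per-class rows. The sketch below is not part of the training run; it only recomputes the averages from the numbers printed above, assuming the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of class F1, weighted F1 as the support-weighted mean). The names `ROWS`, `f1`, `macro_f1` and `weighted_f1` are illustrative, and because the logged values are rounded to four decimals the recomputed ones may differ in the last digit.

```python
# Per-class rows copied from the "By class" table above:
# label -> (precision, recall, f1-score, support).
# These are rounded log values, so recomputed figures are approximate.
ROWS = {
    "PER": (0.8119, 0.8237, 0.8177, 482),
    "LOC": (0.8277, 0.8603, 0.8437, 458),
    "ORG": (0.4754, 0.4203, 0.4462, 69),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
MICRO_PRECISION = 0.7992
MICRO_RECALL = 0.8127


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(rows: dict) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = [row[2] for row in rows.values()]
    return sum(scores) / len(scores)


def weighted_f1(rows: dict) -> float:
    """Mean of the per-class F1 scores, weighted by class support."""
    total_support = sum(row[3] for row in rows.values())
    return sum(row[2] * row[3] for row in rows.values()) / total_support


if __name__ == "__main__":
    print(f"micro F1    ~ {f1(MICRO_PRECISION, MICRO_RECALL):.4f}")
    print(f"macro F1    ~ {macro_f1(ROWS):.4f}")
    print(f"weighted F1 ~ {weighted_f1(ROWS):.4f}")
```

The macro score sits well below the micro score because ORG, with only 69 of the 1009 test entities, counts as much as PER and LOC in the unweighted mean.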