2023-10-17 22:03:23,619 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,620 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 22:03:23,620 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,620 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 22:03:23,620 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,620 Train: 5901 sentences
2023-10-17 22:03:23,620 (train_with_dev=False, train_with_test=False)
2023-10-17 22:03:23,620 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,620 Training Params:
2023-10-17 22:03:23,620  - learning_rate: "5e-05"
2023-10-17 22:03:23,621  - mini_batch_size: "4"
2023-10-17 22:03:23,621  - max_epochs: "10"
2023-10-17 22:03:23,621  - shuffle: "True"
2023-10-17 22:03:23,621 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,621 Plugins:
2023-10-17 22:03:23,621  - TensorboardLogger
2023-10-17 22:03:23,621  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 22:03:23,621 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,621 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 22:03:23,621  - metric: "('micro avg', 'f1-score')"
2023-10-17 22:03:23,621 ----------------------------------------------------------------------------------------------------
2023-10-17 22:03:23,621 Computation:
2023-10-17 22:03:23,621  - compute on device: cuda:0
2023-10-17 22:03:23,621  - embedding storage: none
2023-10-17 22:03:23,621 ----------------------------------------------------------------------------------------------------
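Note: the training parameters and plugins above map directly onto Flair's fine-tuning API. The following is a minimal sketch of how a run with these settings could be reproduced; it is not the exact script behind this log, and the backbone identifier is only inferred from the base path recorded below.

# Hedged reproduction sketch (assumptions: backbone id inferred from the base path;
# hidden_size is unused because no RNN/CRF is added, matching "crfFalse" in the path).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")   # 5901/1287/1505 sentences
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed backbone
    layers="-1",                # "layers-1" in the base path
    subtoken_pooling="first",   # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)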
"hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-17 22:03:23,621 ---------------------------------------------------------------------------------------------------- 2023-10-17 22:03:23,621 ---------------------------------------------------------------------------------------------------- 2023-10-17 22:03:23,621 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 22:03:30,978 epoch 1 - iter 147/1476 - loss 2.72620720 - time (sec): 7.36 - samples/sec: 2204.50 - lr: 0.000005 - momentum: 0.000000 2023-10-17 22:03:39,185 epoch 1 - iter 294/1476 - loss 1.51923928 - time (sec): 15.56 - samples/sec: 2314.89 - lr: 0.000010 - momentum: 0.000000 2023-10-17 22:03:46,302 epoch 1 - iter 441/1476 - loss 1.16331188 - time (sec): 22.68 - samples/sec: 2297.76 - lr: 0.000015 - momentum: 0.000000 2023-10-17 22:03:53,159 epoch 1 - iter 588/1476 - loss 0.97493996 - time (sec): 29.54 - samples/sec: 2266.77 - lr: 0.000020 - momentum: 0.000000 2023-10-17 22:04:00,176 epoch 1 - iter 735/1476 - loss 0.83453748 - time (sec): 36.55 - samples/sec: 2272.81 - lr: 0.000025 - momentum: 0.000000 2023-10-17 22:04:07,648 epoch 1 - iter 882/1476 - loss 0.72789880 - time (sec): 44.03 - samples/sec: 2306.09 - lr: 0.000030 - momentum: 0.000000 2023-10-17 22:04:14,681 epoch 1 - iter 1029/1476 - loss 0.65284703 - time (sec): 51.06 - samples/sec: 2297.08 - lr: 0.000035 - momentum: 0.000000 2023-10-17 22:04:21,603 epoch 1 - iter 1176/1476 - loss 0.59851827 - time (sec): 57.98 - samples/sec: 2292.73 - lr: 0.000040 - momentum: 0.000000 2023-10-17 22:04:28,725 epoch 1 - iter 1323/1476 - loss 0.55485901 - time (sec): 65.10 - samples/sec: 2287.25 - lr: 0.000045 - momentum: 0.000000 2023-10-17 22:04:35,954 epoch 1 - iter 1470/1476 - loss 0.51634798 - time (sec): 72.33 - samples/sec: 2289.26 - lr: 0.000050 - momentum: 0.000000 2023-10-17 22:04:36,241 ---------------------------------------------------------------------------------------------------- 2023-10-17 22:04:36,242 EPOCH 1 done: loss 0.5145 - lr: 0.000050 2023-10-17 22:04:42,181 DEV : loss 0.13091953098773956 - f1-score (micro avg) 0.7396 2023-10-17 22:04:42,229 saving best model 2023-10-17 22:04:42,659 ---------------------------------------------------------------------------------------------------- 2023-10-17 22:04:50,639 epoch 2 - iter 147/1476 - loss 0.16756048 - time (sec): 7.98 - samples/sec: 2025.62 - lr: 0.000049 - momentum: 0.000000 2023-10-17 22:04:57,729 epoch 2 - iter 294/1476 - loss 0.15633032 - time (sec): 15.07 - samples/sec: 2293.22 - lr: 0.000049 - momentum: 0.000000 2023-10-17 22:05:04,672 epoch 2 - iter 441/1476 - loss 0.15392163 - time (sec): 22.01 - samples/sec: 2321.72 - lr: 0.000048 - momentum: 0.000000 2023-10-17 22:05:11,534 epoch 2 - iter 588/1476 - loss 0.15203226 - time (sec): 28.87 - samples/sec: 2315.36 - lr: 0.000048 - momentum: 0.000000 2023-10-17 22:05:18,847 epoch 2 - iter 735/1476 - loss 0.14873938 - time (sec): 36.19 - samples/sec: 2325.48 - lr: 0.000047 - momentum: 0.000000 2023-10-17 22:05:26,190 epoch 2 - iter 882/1476 - loss 0.14927048 - time (sec): 43.53 - samples/sec: 2325.02 - lr: 0.000047 - momentum: 0.000000 2023-10-17 22:05:33,418 epoch 2 - iter 1029/1476 - loss 0.14475488 - time (sec): 50.76 - samples/sec: 2334.35 - lr: 0.000046 - momentum: 0.000000 2023-10-17 22:05:40,797 epoch 2 - iter 1176/1476 - loss 0.14162635 - time (sec): 58.14 - samples/sec: 2317.51 - lr: 0.000046 - momentum: 0.000000 2023-10-17 
2023-10-17 22:05:47,784 epoch 2 - iter 1323/1476 - loss 0.14008889 - time (sec): 65.12 - samples/sec: 2313.38 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:05:54,775 epoch 2 - iter 1470/1476 - loss 0.14001292 - time (sec): 72.11 - samples/sec: 2299.75 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:05:55,041 ----------------------------------------------------------------------------------------------------
2023-10-17 22:05:55,042 EPOCH 2 done: loss 0.1401 - lr: 0.000044
2023-10-17 22:06:07,271 DEV : loss 0.16017286479473114 - f1-score (micro avg) 0.7933
2023-10-17 22:06:07,307 saving best model
2023-10-17 22:06:07,844 ----------------------------------------------------------------------------------------------------
2023-10-17 22:06:14,924 epoch 3 - iter 147/1476 - loss 0.09114117 - time (sec): 7.08 - samples/sec: 2298.57 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:06:22,753 epoch 3 - iter 294/1476 - loss 0.10429347 - time (sec): 14.91 - samples/sec: 2202.75 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:06:30,056 epoch 3 - iter 441/1476 - loss 0.09957062 - time (sec): 22.21 - samples/sec: 2230.13 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:06:37,945 epoch 3 - iter 588/1476 - loss 0.10039330 - time (sec): 30.10 - samples/sec: 2207.88 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:06:45,340 epoch 3 - iter 735/1476 - loss 0.09773882 - time (sec): 37.49 - samples/sec: 2244.75 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:06:52,552 epoch 3 - iter 882/1476 - loss 0.09961622 - time (sec): 44.71 - samples/sec: 2253.46 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:06:59,751 epoch 3 - iter 1029/1476 - loss 0.09540786 - time (sec): 51.90 - samples/sec: 2230.63 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:07:07,395 epoch 3 - iter 1176/1476 - loss 0.09478534 - time (sec): 59.55 - samples/sec: 2245.36 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:07:14,713 epoch 3 - iter 1323/1476 - loss 0.09680636 - time (sec): 66.87 - samples/sec: 2244.56 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:07:21,502 epoch 3 - iter 1470/1476 - loss 0.09588855 - time (sec): 73.66 - samples/sec: 2250.75 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:07:21,781 ----------------------------------------------------------------------------------------------------
2023-10-17 22:07:21,782 EPOCH 3 done: loss 0.0958 - lr: 0.000039
2023-10-17 22:07:33,356 DEV : loss 0.17790737748146057 - f1-score (micro avg) 0.801
2023-10-17 22:07:33,388 saving best model
2023-10-17 22:07:33,903 ----------------------------------------------------------------------------------------------------
2023-10-17 22:07:41,011 epoch 4 - iter 147/1476 - loss 0.04114069 - time (sec): 7.11 - samples/sec: 2214.64 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:07:47,987 epoch 4 - iter 294/1476 - loss 0.04947034 - time (sec): 14.08 - samples/sec: 2312.10 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:07:55,401 epoch 4 - iter 441/1476 - loss 0.05042154 - time (sec): 21.49 - samples/sec: 2352.45 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:08:02,333 epoch 4 - iter 588/1476 - loss 0.05340229 - time (sec): 28.43 - samples/sec: 2326.91 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:08:09,142 epoch 4 - iter 735/1476 - loss 0.05891758 - time (sec): 35.24 - samples/sec: 2288.42 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:08:16,344 epoch 4 - iter 882/1476 - loss 0.06294002 - time (sec): 42.44 - samples/sec: 2273.28 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:08:24,176 epoch 4 - iter 1029/1476 - loss 0.06760461 - time (sec): 50.27 - samples/sec: 2313.15 - lr: 0.000035 - momentum: 0.000000
2023-10-17 22:08:31,010 epoch 4 - iter 1176/1476 - loss 0.06719558 - time (sec): 57.10 - samples/sec: 2296.19 - lr: 0.000034 - momentum: 0.000000
2023-10-17 22:08:38,428 epoch 4 - iter 1323/1476 - loss 0.06606142 - time (sec): 64.52 - samples/sec: 2307.56 - lr: 0.000034 - momentum: 0.000000
2023-10-17 22:08:45,562 epoch 4 - iter 1470/1476 - loss 0.06713464 - time (sec): 71.66 - samples/sec: 2314.85 - lr: 0.000033 - momentum: 0.000000
2023-10-17 22:08:45,821 ----------------------------------------------------------------------------------------------------
2023-10-17 22:08:45,821 EPOCH 4 done: loss 0.0672 - lr: 0.000033
2023-10-17 22:08:58,149 DEV : loss 0.17303957045078278 - f1-score (micro avg) 0.8219
2023-10-17 22:08:58,189 saving best model
2023-10-17 22:08:58,771 ----------------------------------------------------------------------------------------------------
2023-10-17 22:09:06,614 epoch 5 - iter 147/1476 - loss 0.04020753 - time (sec): 7.84 - samples/sec: 1984.04 - lr: 0.000033 - momentum: 0.000000
2023-10-17 22:09:13,800 epoch 5 - iter 294/1476 - loss 0.04279187 - time (sec): 15.03 - samples/sec: 2085.00 - lr: 0.000032 - momentum: 0.000000
2023-10-17 22:09:20,910 epoch 5 - iter 441/1476 - loss 0.03954513 - time (sec): 22.14 - samples/sec: 2189.63 - lr: 0.000032 - momentum: 0.000000
2023-10-17 22:09:27,789 epoch 5 - iter 588/1476 - loss 0.04680193 - time (sec): 29.01 - samples/sec: 2186.55 - lr: 0.000031 - momentum: 0.000000
2023-10-17 22:09:34,730 epoch 5 - iter 735/1476 - loss 0.04965938 - time (sec): 35.96 - samples/sec: 2247.37 - lr: 0.000031 - momentum: 0.000000
2023-10-17 22:09:42,257 epoch 5 - iter 882/1476 - loss 0.05118920 - time (sec): 43.48 - samples/sec: 2328.08 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:09:49,062 epoch 5 - iter 1029/1476 - loss 0.05073835 - time (sec): 50.29 - samples/sec: 2313.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:09:56,021 epoch 5 - iter 1176/1476 - loss 0.04908833 - time (sec): 57.25 - samples/sec: 2320.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 22:10:03,436 epoch 5 - iter 1323/1476 - loss 0.05014400 - time (sec): 64.66 - samples/sec: 2321.77 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:10:10,349 epoch 5 - iter 1470/1476 - loss 0.04917217 - time (sec): 71.57 - samples/sec: 2317.74 - lr: 0.000028 - momentum: 0.000000
2023-10-17 22:10:10,640 ----------------------------------------------------------------------------------------------------
2023-10-17 22:10:10,640 EPOCH 5 done: loss 0.0491 - lr: 0.000028
2023-10-17 22:10:22,284 DEV : loss 0.19989599287509918 - f1-score (micro avg) 0.8277
2023-10-17 22:10:22,323 saving best model
2023-10-17 22:10:22,858 ----------------------------------------------------------------------------------------------------
2023-10-17 22:10:30,054 epoch 6 - iter 147/1476 - loss 0.02481939 - time (sec): 7.19 - samples/sec: 2391.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:10:37,109 epoch 6 - iter 294/1476 - loss 0.03067185 - time (sec): 14.25 - samples/sec: 2263.02 - lr: 0.000027 - momentum: 0.000000
2023-10-17 22:10:44,201 epoch 6 - iter 441/1476 - loss 0.03101562 - time (sec): 21.34 - samples/sec: 2264.35 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:10:51,267 epoch 6 - iter 588/1476 - loss 0.02883656 - time (sec): 28.41 - samples/sec: 2241.72 - lr: 0.000026 - momentum: 0.000000
2023-10-17 22:10:58,257 epoch 6 - iter 735/1476 - loss 0.02729639 - time (sec): 35.40 - samples/sec: 2257.56 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:11:05,567 epoch 6 - iter 882/1476 - loss 0.02913147 - time (sec): 42.71 - samples/sec: 2276.37 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:11:12,740 epoch 6 - iter 1029/1476 - loss 0.02996869 - time (sec): 49.88 - samples/sec: 2284.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 22:11:19,397 epoch 6 - iter 1176/1476 - loss 0.03047989 - time (sec): 56.54 - samples/sec: 2305.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:11:26,289 epoch 6 - iter 1323/1476 - loss 0.03075751 - time (sec): 63.43 - samples/sec: 2325.30 - lr: 0.000023 - momentum: 0.000000
2023-10-17 22:11:33,056 epoch 6 - iter 1470/1476 - loss 0.03013028 - time (sec): 70.20 - samples/sec: 2331.69 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:11:33,656 ----------------------------------------------------------------------------------------------------
2023-10-17 22:11:33,656 EPOCH 6 done: loss 0.0306 - lr: 0.000022
2023-10-17 22:11:45,493 DEV : loss 0.19098570942878723 - f1-score (micro avg) 0.8276
2023-10-17 22:11:45,532 ----------------------------------------------------------------------------------------------------
2023-10-17 22:11:52,668 epoch 7 - iter 147/1476 - loss 0.01581578 - time (sec): 7.14 - samples/sec: 2323.02 - lr: 0.000022 - momentum: 0.000000
2023-10-17 22:11:59,684 epoch 7 - iter 294/1476 - loss 0.01468210 - time (sec): 14.15 - samples/sec: 2365.12 - lr: 0.000021 - momentum: 0.000000
2023-10-17 22:12:06,353 epoch 7 - iter 441/1476 - loss 0.01689443 - time (sec): 20.82 - samples/sec: 2355.12 - lr: 0.000021 - momentum: 0.000000
2023-10-17 22:12:13,318 epoch 7 - iter 588/1476 - loss 0.01792445 - time (sec): 27.79 - samples/sec: 2399.05 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:12:20,504 epoch 7 - iter 735/1476 - loss 0.01916804 - time (sec): 34.97 - samples/sec: 2378.26 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:12:27,645 epoch 7 - iter 882/1476 - loss 0.01979662 - time (sec): 42.11 - samples/sec: 2354.95 - lr: 0.000019 - momentum: 0.000000
2023-10-17 22:12:34,785 epoch 7 - iter 1029/1476 - loss 0.01873811 - time (sec): 49.25 - samples/sec: 2361.01 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:12:42,030 epoch 7 - iter 1176/1476 - loss 0.02113036 - time (sec): 56.50 - samples/sec: 2344.67 - lr: 0.000018 - momentum: 0.000000
2023-10-17 22:12:49,458 epoch 7 - iter 1323/1476 - loss 0.02067516 - time (sec): 63.92 - samples/sec: 2351.84 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:12:56,605 epoch 7 - iter 1470/1476 - loss 0.02071757 - time (sec): 71.07 - samples/sec: 2330.04 - lr: 0.000017 - momentum: 0.000000
2023-10-17 22:12:56,922 ----------------------------------------------------------------------------------------------------
2023-10-17 22:12:56,922 EPOCH 7 done: loss 0.0206 - lr: 0.000017
2023-10-17 22:13:08,626 DEV : loss 0.22276441752910614 - f1-score (micro avg) 0.8349
2023-10-17 22:13:08,661 saving best model
2023-10-17 22:13:09,234 ----------------------------------------------------------------------------------------------------
2023-10-17 22:13:16,414 epoch 8 - iter 147/1476 - loss 0.01171971 - time (sec): 7.18 - samples/sec: 2343.91 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:13:23,479 epoch 8 - iter 294/1476 - loss 0.01381993 - time (sec): 14.24 - samples/sec: 2401.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 22:13:30,678 epoch 8 - iter 441/1476 - loss 0.01397414 - time (sec): 21.44 - samples/sec: 2415.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:13:37,797 epoch 8 - iter 588/1476 - loss 0.01370204 - time (sec): 28.56 - samples/sec: 2416.24 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:13:44,734 epoch 8 - iter 735/1476 - loss 0.01392521 - time (sec): 35.50 - samples/sec: 2377.32 - lr: 0.000014 - momentum: 0.000000
2023-10-17 22:13:51,767 epoch 8 - iter 882/1476 - loss 0.01346954 - time (sec): 42.53 - samples/sec: 2347.16 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:13:58,882 epoch 8 - iter 1029/1476 - loss 0.01358304 - time (sec): 49.64 - samples/sec: 2350.52 - lr: 0.000013 - momentum: 0.000000
2023-10-17 22:14:06,096 epoch 8 - iter 1176/1476 - loss 0.01302687 - time (sec): 56.86 - samples/sec: 2319.01 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:14:13,203 epoch 8 - iter 1323/1476 - loss 0.01317696 - time (sec): 63.97 - samples/sec: 2306.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 22:14:20,720 epoch 8 - iter 1470/1476 - loss 0.01309322 - time (sec): 71.48 - samples/sec: 2320.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:14:20,994 ----------------------------------------------------------------------------------------------------
2023-10-17 22:14:20,994 EPOCH 8 done: loss 0.0131 - lr: 0.000011
2023-10-17 22:14:32,224 DEV : loss 0.24337786436080933 - f1-score (micro avg) 0.8314
2023-10-17 22:14:32,260 ----------------------------------------------------------------------------------------------------
2023-10-17 22:14:39,150 epoch 9 - iter 147/1476 - loss 0.00511288 - time (sec): 6.89 - samples/sec: 2440.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 22:14:46,539 epoch 9 - iter 294/1476 - loss 0.00538705 - time (sec): 14.28 - samples/sec: 2461.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:14:54,063 epoch 9 - iter 441/1476 - loss 0.00833169 - time (sec): 21.80 - samples/sec: 2464.56 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:15:01,988 epoch 9 - iter 588/1476 - loss 0.00813564 - time (sec): 29.73 - samples/sec: 2423.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 22:15:09,194 epoch 9 - iter 735/1476 - loss 0.00932861 - time (sec): 36.93 - samples/sec: 2392.29 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:15:16,222 epoch 9 - iter 882/1476 - loss 0.00949210 - time (sec): 43.96 - samples/sec: 2368.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 22:15:23,456 epoch 9 - iter 1029/1476 - loss 0.00889160 - time (sec): 51.19 - samples/sec: 2350.66 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:15:30,524 epoch 9 - iter 1176/1476 - loss 0.00863203 - time (sec): 58.26 - samples/sec: 2327.34 - lr: 0.000007 - momentum: 0.000000
2023-10-17 22:15:37,563 epoch 9 - iter 1323/1476 - loss 0.00845918 - time (sec): 65.30 - samples/sec: 2306.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:15:44,389 epoch 9 - iter 1470/1476 - loss 0.00848482 - time (sec): 72.13 - samples/sec: 2299.18 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:15:44,665 ----------------------------------------------------------------------------------------------------
2023-10-17 22:15:44,665 EPOCH 9 done: loss 0.0085 - lr: 0.000006
2023-10-17 22:15:56,000 DEV : loss 0.23888403177261353 - f1-score (micro avg) 0.8385
2023-10-17 22:15:56,032 saving best model
2023-10-17 22:15:56,631 ----------------------------------------------------------------------------------------------------
2023-10-17 22:16:03,873 epoch 10 - iter 147/1476 - loss 0.00295495 - time (sec): 7.24 - samples/sec: 2364.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:16:11,230 epoch 10 - iter 294/1476 - loss 0.00440512 - time (sec): 14.60 - samples/sec: 2356.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:16:18,016 epoch 10 - iter 441/1476 - loss 0.00502621 - time (sec): 21.38 - samples/sec: 2359.26 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:16:25,083 epoch 10 - iter 588/1476 - loss 0.00462300 - time (sec): 28.45 - samples/sec: 2330.18 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:16:32,176 epoch 10 - iter 735/1476 - loss 0.00465598 - time (sec): 35.54 - samples/sec: 2344.84 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:16:39,251 epoch 10 - iter 882/1476 - loss 0.00471816 - time (sec): 42.62 - samples/sec: 2317.07 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:16:46,379 epoch 10 - iter 1029/1476 - loss 0.00416331 - time (sec): 49.75 - samples/sec: 2321.50 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:16:53,313 epoch 10 - iter 1176/1476 - loss 0.00440895 - time (sec): 56.68 - samples/sec: 2310.65 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:17:00,910 epoch 10 - iter 1323/1476 - loss 0.00464593 - time (sec): 64.28 - samples/sec: 2349.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:17:08,205 epoch 10 - iter 1470/1476 - loss 0.00603315 - time (sec): 71.57 - samples/sec: 2316.56 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:17:08,485 ----------------------------------------------------------------------------------------------------
2023-10-17 22:17:08,486 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 22:17:20,141 DEV : loss 0.2451922595500946 - f1-score (micro avg) 0.838
2023-10-17 22:17:20,606 ----------------------------------------------------------------------------------------------------
2023-10-17 22:17:20,608 Loading model from best epoch ...
2023-10-17 22:17:22,322 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 22:17:28,436 Results:
- F-score (micro) 0.8029
- F-score (macro) 0.6891
- Accuracy 0.6921

By class:
              precision    recall  f1-score   support

         loc     0.8642    0.8823    0.8731       858
        pers     0.7617    0.8212    0.7903       537
         org     0.5798    0.5227    0.5498       132
        prod     0.6379    0.6066    0.6218        61
        time     0.5625    0.6667    0.6102        54

   micro avg     0.7901    0.8161    0.8029      1642
   macro avg     0.6812    0.6999    0.6891      1642
weighted avg     0.7895    0.8161    0.8021      1642

2023-10-17 22:17:28,436 ----------------------------------------------------------------------------------------------------
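Note: the final test scores above come from best-model.pt saved under the training base path. As a usage note, a checkpoint produced by a run like this could be loaded for inference roughly as follows; this is a sketch, and the example sentence is a placeholder rather than data from this log.

# Hedged usage sketch: load the saved checkpoint with Flair and tag a French sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

sentence = Sentence("Le Conseil fédéral s'est réuni à Berne .")  # placeholder example
tagger.predict(sentence)

# The 21-tag dictionary logged above is a BIOES encoding of the five entity types
# (loc, pers, org, time, prod) plus O; predicted spans carry the decoded type.
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)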