2023-10-13 15:27:51,374 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,375 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 15:27:51,375 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,375 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 15:27:51,375 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,375 Train: 5901 sentences
2023-10-13 15:27:51,375 (train_with_dev=False, train_with_test=False)
2023-10-13 15:27:51,375 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,375 Training Params:
2023-10-13 15:27:51,375 - learning_rate: "3e-05"
2023-10-13 15:27:51,375 - mini_batch_size: "4"
2023-10-13 15:27:51,375 - max_epochs: "10"
2023-10-13 15:27:51,375 - shuffle: "True"
2023-10-13 15:27:51,375 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,375 Plugins:
2023-10-13 15:27:51,375 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 15:27:51,375 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,376 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 15:27:51,376 - metric: "('micro avg', 'f1-score')"
2023-10-13 15:27:51,376 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,376 Computation:
2023-10-13 15:27:51,376 - compute on device: cuda:0
2023-10-13 15:27:51,376 - embedding storage: none
2023-10-13 15:27:51,376 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,376 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 15:27:51,376 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:51,376 ----------------------------------------------------------------------------------------------------
2023-10-13 15:27:58,575 epoch 1 - iter 147/1476 - loss 2.67128935 - time (sec): 7.20 - samples/sec:
2344.96 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:28:05,440 epoch 1 - iter 294/1476 - loss 1.66437545 - time (sec): 14.06 - samples/sec: 2356.83 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:28:12,862 epoch 1 - iter 441/1476 - loss 1.24057897 - time (sec): 21.49 - samples/sec: 2436.27 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:28:19,562 epoch 1 - iter 588/1476 - loss 1.03733344 - time (sec): 28.19 - samples/sec: 2390.97 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:28:26,431 epoch 1 - iter 735/1476 - loss 0.89839028 - time (sec): 35.05 - samples/sec: 2384.70 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:28:33,323 epoch 1 - iter 882/1476 - loss 0.79874819 - time (sec): 41.95 - samples/sec: 2361.49 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:28:40,075 epoch 1 - iter 1029/1476 - loss 0.72454902 - time (sec): 48.70 - samples/sec: 2347.97 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:28:46,841 epoch 1 - iter 1176/1476 - loss 0.66241525 - time (sec): 55.46 - samples/sec: 2338.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:28:54,229 epoch 1 - iter 1323/1476 - loss 0.59924863 - time (sec): 62.85 - samples/sec: 2372.82 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:29:01,448 epoch 1 - iter 1470/1476 - loss 0.55709201 - time (sec): 70.07 - samples/sec: 2366.27 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:29:01,707 ----------------------------------------------------------------------------------------------------
2023-10-13 15:29:01,708 EPOCH 1 done: loss 0.5556 - lr: 0.000030
2023-10-13 15:29:07,863 DEV : loss 0.14035969972610474 - f1-score (micro avg) 0.7149
2023-10-13 15:29:07,892 saving best model
2023-10-13 15:29:08,465 ----------------------------------------------------------------------------------------------------
2023-10-13 15:29:15,669 epoch 2 - iter 147/1476 - loss 0.15079759 - time (sec): 7.20 - samples/sec: 2360.12 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:29:22,670 epoch 2 - iter 294/1476 - loss
0.14126703 - time (sec): 14.20 - samples/sec: 2200.45 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:29:29,499 epoch 2 - iter 441/1476 - loss 0.14626876 - time (sec): 21.03 - samples/sec: 2222.41 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:29:36,399 epoch 2 - iter 588/1476 - loss 0.14067959 - time (sec): 27.93 - samples/sec: 2257.92 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:29:43,088 epoch 2 - iter 735/1476 - loss 0.13794568 - time (sec): 34.62 - samples/sec: 2263.69 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:29:51,009 epoch 2 - iter 882/1476 - loss 0.13937727 - time (sec): 42.54 - samples/sec: 2355.76 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:29:58,281 epoch 2 - iter 1029/1476 - loss 0.13468420 - time (sec): 49.81 - samples/sec: 2350.07 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:30:05,519 epoch 2 - iter 1176/1476 - loss 0.13405240 - time (sec): 57.05 - samples/sec: 2334.43 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:30:12,368 epoch 2 - iter 1323/1476 - loss 0.13202783 - time (sec): 63.90 - samples/sec: 2345.98 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:30:19,112 epoch 2 - iter 1470/1476 - loss 0.12858037 - time (sec): 70.65 - samples/sec: 2348.15 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:30:19,372 ----------------------------------------------------------------------------------------------------
2023-10-13 15:30:19,372 EPOCH 2 done: loss 0.1285 - lr: 0.000027
2023-10-13 15:30:30,473 DEV : loss 0.1444912701845169 - f1-score (micro avg) 0.764
2023-10-13 15:30:30,501 saving best model
2023-10-13 15:30:31,009 ----------------------------------------------------------------------------------------------------
2023-10-13 15:30:38,496 epoch 3 - iter 147/1476 - loss 0.08029554 - time (sec): 7.48 - samples/sec: 2048.88 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:30:46,516 epoch 3 - iter 294/1476 - loss 0.07686342 - time (sec): 15.50 - samples/sec: 2024.98 - lr: 0.000026 - momentum: 0.000000
2023-10-13
15:30:53,679 epoch 3 - iter 441/1476 - loss 0.07956883 - time (sec): 22.66 - samples/sec: 2142.91 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:31:00,854 epoch 3 - iter 588/1476 - loss 0.08463364 - time (sec): 29.84 - samples/sec: 2205.98 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:31:07,976 epoch 3 - iter 735/1476 - loss 0.08625536 - time (sec): 36.96 - samples/sec: 2256.21 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:31:14,723 epoch 3 - iter 882/1476 - loss 0.08680712 - time (sec): 43.71 - samples/sec: 2253.30 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:31:21,639 epoch 3 - iter 1029/1476 - loss 0.08602669 - time (sec): 50.62 - samples/sec: 2279.35 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:31:28,731 epoch 3 - iter 1176/1476 - loss 0.08666656 - time (sec): 57.72 - samples/sec: 2289.82 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:31:35,604 epoch 3 - iter 1323/1476 - loss 0.08609219 - time (sec): 64.59 - samples/sec: 2302.44 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:31:42,847 epoch 3 - iter 1470/1476 - loss 0.08337165 - time (sec): 71.83 - samples/sec: 2310.10 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:31:43,113 ----------------------------------------------------------------------------------------------------
2023-10-13 15:31:43,114 EPOCH 3 done: loss 0.0834 - lr: 0.000023
2023-10-13 15:31:54,210 DEV : loss 0.161760613322258 - f1-score (micro avg) 0.8036
2023-10-13 15:31:54,239 saving best model
2023-10-13 15:31:55,198 ----------------------------------------------------------------------------------------------------
2023-10-13 15:32:02,152 epoch 4 - iter 147/1476 - loss 0.05876058 - time (sec): 6.95 - samples/sec: 2278.92 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:32:09,259 epoch 4 - iter 294/1476 - loss 0.05569616 - time (sec): 14.06 - samples/sec: 2387.95 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:32:16,572 epoch 4 - iter 441/1476 - loss 0.05615460 - time (sec): 21.37 - samples/sec: 2462.00 - lr:
0.000022 - momentum: 0.000000
2023-10-13 15:32:23,371 epoch 4 - iter 588/1476 - loss 0.05297739 - time (sec): 28.17 - samples/sec: 2406.73 - lr: 0.000022 - momentum: 0.000000
2023-10-13 15:32:29,991 epoch 4 - iter 735/1476 - loss 0.05364302 - time (sec): 34.79 - samples/sec: 2397.48 - lr: 0.000022 - momentum: 0.000000
2023-10-13 15:32:36,439 epoch 4 - iter 882/1476 - loss 0.05261850 - time (sec): 41.24 - samples/sec: 2377.08 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:32:43,387 epoch 4 - iter 1029/1476 - loss 0.05432762 - time (sec): 48.19 - samples/sec: 2408.94 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:32:49,923 epoch 4 - iter 1176/1476 - loss 0.05493788 - time (sec): 54.72 - samples/sec: 2397.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:32:57,077 epoch 4 - iter 1323/1476 - loss 0.05561241 - time (sec): 61.88 - samples/sec: 2410.01 - lr: 0.000020 - momentum: 0.000000
2023-10-13 15:33:04,188 epoch 4 - iter 1470/1476 - loss 0.05782730 - time (sec): 68.99 - samples/sec: 2403.75 - lr: 0.000020 - momentum: 0.000000
2023-10-13 15:33:04,442 ----------------------------------------------------------------------------------------------------
2023-10-13 15:33:04,442 EPOCH 4 done: loss 0.0577 - lr: 0.000020
2023-10-13 15:33:15,617 DEV : loss 0.18805062770843506 - f1-score (micro avg) 0.8023
2023-10-13 15:33:15,646 ----------------------------------------------------------------------------------------------------
2023-10-13 15:33:22,434 epoch 5 - iter 147/1476 - loss 0.04212635 - time (sec): 6.79 - samples/sec: 2269.00 - lr: 0.000020 - momentum: 0.000000
2023-10-13 15:33:29,485 epoch 5 - iter 294/1476 - loss 0.03671755 - time (sec): 13.84 - samples/sec: 2274.14 - lr: 0.000019 - momentum: 0.000000
2023-10-13 15:33:37,084 epoch 5 - iter 441/1476 - loss 0.04111502 - time (sec): 21.44 - samples/sec: 2277.13 - lr: 0.000019 - momentum: 0.000000
2023-10-13 15:33:44,549 epoch 5 - iter 588/1476 - loss 0.03769225 - time (sec): 28.90 - samples/sec: 2236.07 - lr:
0.000019 - momentum: 0.000000
2023-10-13 15:33:51,684 epoch 5 - iter 735/1476 - loss 0.03738604 - time (sec): 36.04 - samples/sec: 2280.93 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:33:58,651 epoch 5 - iter 882/1476 - loss 0.03956318 - time (sec): 43.00 - samples/sec: 2301.19 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:34:05,493 epoch 5 - iter 1029/1476 - loss 0.04047921 - time (sec): 49.85 - samples/sec: 2297.13 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:34:12,761 epoch 5 - iter 1176/1476 - loss 0.04008186 - time (sec): 57.11 - samples/sec: 2326.24 - lr: 0.000017 - momentum: 0.000000
2023-10-13 15:34:19,761 epoch 5 - iter 1323/1476 - loss 0.03983351 - time (sec): 64.11 - samples/sec: 2328.16 - lr: 0.000017 - momentum: 0.000000
2023-10-13 15:34:26,671 epoch 5 - iter 1470/1476 - loss 0.04072459 - time (sec): 71.02 - samples/sec: 2335.98 - lr: 0.000017 - momentum: 0.000000
2023-10-13 15:34:26,963 ----------------------------------------------------------------------------------------------------
2023-10-13 15:34:26,963 EPOCH 5 done: loss 0.0409 - lr: 0.000017
2023-10-13 15:34:38,138 DEV : loss 0.18352609872817993 - f1-score (micro avg) 0.8138
2023-10-13 15:34:38,175 saving best model
2023-10-13 15:34:38,693 ----------------------------------------------------------------------------------------------------
2023-10-13 15:34:45,597 epoch 6 - iter 147/1476 - loss 0.03301350 - time (sec): 6.90 - samples/sec: 2180.03 - lr: 0.000016 - momentum: 0.000000
2023-10-13 15:34:52,473 epoch 6 - iter 294/1476 - loss 0.03217606 - time (sec): 13.77 - samples/sec: 2226.65 - lr: 0.000016 - momentum: 0.000000
2023-10-13 15:34:59,887 epoch 6 - iter 441/1476 - loss 0.03175343 - time (sec): 21.19 - samples/sec: 2327.75 - lr: 0.000016 - momentum: 0.000000
2023-10-13 15:35:07,067 epoch 6 - iter 588/1476 - loss 0.03347640 - time (sec): 28.37 - samples/sec: 2316.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:35:14,095 epoch 6 - iter 735/1476 - loss 0.03623768 - time
(sec): 35.40 - samples/sec: 2321.80 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:35:21,146 epoch 6 - iter 882/1476 - loss 0.03364347 - time (sec): 42.45 - samples/sec: 2347.80 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:35:27,964 epoch 6 - iter 1029/1476 - loss 0.03302726 - time (sec): 49.26 - samples/sec: 2330.39 - lr: 0.000014 - momentum: 0.000000
2023-10-13 15:35:34,863 epoch 6 - iter 1176/1476 - loss 0.03257321 - time (sec): 56.16 - samples/sec: 2333.46 - lr: 0.000014 - momentum: 0.000000
2023-10-13 15:35:42,265 epoch 6 - iter 1323/1476 - loss 0.03242265 - time (sec): 63.57 - samples/sec: 2361.98 - lr: 0.000014 - momentum: 0.000000
2023-10-13 15:35:49,114 epoch 6 - iter 1470/1476 - loss 0.03154020 - time (sec): 70.41 - samples/sec: 2355.93 - lr: 0.000013 - momentum: 0.000000
2023-10-13 15:35:49,382 ----------------------------------------------------------------------------------------------------
2023-10-13 15:35:49,383 EPOCH 6 done: loss 0.0314 - lr: 0.000013
2023-10-13 15:36:00,508 DEV : loss 0.19981108605861664 - f1-score (micro avg) 0.8107
2023-10-13 15:36:00,539 ----------------------------------------------------------------------------------------------------
2023-10-13 15:36:07,373 epoch 7 - iter 147/1476 - loss 0.02210395 - time (sec): 6.83 - samples/sec: 2244.86 - lr: 0.000013 - momentum: 0.000000
2023-10-13 15:36:14,809 epoch 7 - iter 294/1476 - loss 0.02469555 - time (sec): 14.27 - samples/sec: 2376.50 - lr: 0.000013 - momentum: 0.000000
2023-10-13 15:36:21,619 epoch 7 - iter 441/1476 - loss 0.02162633 - time (sec): 21.08 - samples/sec: 2330.74 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:36:28,788 epoch 7 - iter 588/1476 - loss 0.02184522 - time (sec): 28.25 - samples/sec: 2320.49 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:36:36,140 epoch 7 - iter 735/1476 - loss 0.02211174 - time (sec): 35.60 - samples/sec: 2309.65 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:36:43,577 epoch 7 - iter 882/1476 - loss 0.02359508 - time
(sec): 43.04 - samples/sec: 2338.18 - lr: 0.000011 - momentum: 0.000000
2023-10-13 15:36:50,676 epoch 7 - iter 1029/1476 - loss 0.02284324 - time (sec): 50.14 - samples/sec: 2361.36 - lr: 0.000011 - momentum: 0.000000
2023-10-13 15:36:57,732 epoch 7 - iter 1176/1476 - loss 0.02153323 - time (sec): 57.19 - samples/sec: 2358.44 - lr: 0.000011 - momentum: 0.000000
2023-10-13 15:37:04,473 epoch 7 - iter 1323/1476 - loss 0.02137460 - time (sec): 63.93 - samples/sec: 2343.49 - lr: 0.000010 - momentum: 0.000000
2023-10-13 15:37:11,276 epoch 7 - iter 1470/1476 - loss 0.02141477 - time (sec): 70.74 - samples/sec: 2345.11 - lr: 0.000010 - momentum: 0.000000
2023-10-13 15:37:11,571 ----------------------------------------------------------------------------------------------------
2023-10-13 15:37:11,571 EPOCH 7 done: loss 0.0215 - lr: 0.000010
2023-10-13 15:37:22,751 DEV : loss 0.20148473978042603 - f1-score (micro avg) 0.8305
2023-10-13 15:37:22,780 saving best model
2023-10-13 15:37:23,262 ----------------------------------------------------------------------------------------------------
2023-10-13 15:37:30,423 epoch 8 - iter 147/1476 - loss 0.01204328 - time (sec): 7.16 - samples/sec: 2281.23 - lr: 0.000010 - momentum: 0.000000
2023-10-13 15:37:37,180 epoch 8 - iter 294/1476 - loss 0.01038902 - time (sec): 13.91 - samples/sec: 2309.22 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:37:44,247 epoch 8 - iter 441/1476 - loss 0.01128180 - time (sec): 20.98 - samples/sec: 2379.03 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:37:51,183 epoch 8 - iter 588/1476 - loss 0.01249804 - time (sec): 27.92 - samples/sec: 2360.92 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:37:58,038 epoch 8 - iter 735/1476 - loss 0.01344434 - time (sec): 34.77 - samples/sec: 2353.59 - lr: 0.000008 - momentum: 0.000000
2023-10-13 15:38:04,979 epoch 8 - iter 882/1476 - loss 0.01407197 - time (sec): 41.71 - samples/sec: 2342.83 - lr: 0.000008 - momentum: 0.000000
2023-10-13 15:38:11,924 epoch 8
- iter 1029/1476 - loss 0.01397679 - time (sec): 48.66 - samples/sec: 2334.81 - lr: 0.000008 - momentum: 0.000000
2023-10-13 15:38:19,217 epoch 8 - iter 1176/1476 - loss 0.01440414 - time (sec): 55.95 - samples/sec: 2355.21 - lr: 0.000007 - momentum: 0.000000
2023-10-13 15:38:26,144 epoch 8 - iter 1323/1476 - loss 0.01412839 - time (sec): 62.88 - samples/sec: 2359.61 - lr: 0.000007 - momentum: 0.000000
2023-10-13 15:38:33,233 epoch 8 - iter 1470/1476 - loss 0.01337390 - time (sec): 69.97 - samples/sec: 2371.84 - lr: 0.000007 - momentum: 0.000000
2023-10-13 15:38:33,482 ----------------------------------------------------------------------------------------------------
2023-10-13 15:38:33,482 EPOCH 8 done: loss 0.0136 - lr: 0.000007
2023-10-13 15:38:44,536 DEV : loss 0.20017683506011963 - f1-score (micro avg) 0.8306
2023-10-13 15:38:44,565 saving best model
2023-10-13 15:38:45,121 ----------------------------------------------------------------------------------------------------
2023-10-13 15:38:52,432 epoch 9 - iter 147/1476 - loss 0.00826798 - time (sec): 7.31 - samples/sec: 2380.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:38:59,260 epoch 9 - iter 294/1476 - loss 0.00873267 - time (sec): 14.14 - samples/sec: 2393.03 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:39:06,139 epoch 9 - iter 441/1476 - loss 0.01108423 - time (sec): 21.02 - samples/sec: 2357.23 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:39:13,148 epoch 9 - iter 588/1476 - loss 0.00932900 - time (sec): 28.03 - samples/sec: 2351.45 - lr: 0.000005 - momentum: 0.000000
2023-10-13 15:39:20,508 epoch 9 - iter 735/1476 - loss 0.00942840 - time (sec): 35.39 - samples/sec: 2319.05 - lr: 0.000005 - momentum: 0.000000
2023-10-13 15:39:27,548 epoch 9 - iter 882/1476 - loss 0.00936287 - time (sec): 42.43 - samples/sec: 2299.14 - lr: 0.000005 - momentum: 0.000000
2023-10-13 15:39:34,580 epoch 9 - iter 1029/1476 - loss 0.00958077 - time (sec): 49.46 - samples/sec: 2324.33 - lr: 0.000004 - momentum:
0.000000
2023-10-13 15:39:41,934 epoch 9 - iter 1176/1476 - loss 0.00917407 - time (sec): 56.81 - samples/sec: 2329.00 - lr: 0.000004 - momentum: 0.000000
2023-10-13 15:39:48,949 epoch 9 - iter 1323/1476 - loss 0.00951637 - time (sec): 63.83 - samples/sec: 2333.50 - lr: 0.000004 - momentum: 0.000000
2023-10-13 15:39:55,997 epoch 9 - iter 1470/1476 - loss 0.00949339 - time (sec): 70.87 - samples/sec: 2341.48 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:39:56,255 ----------------------------------------------------------------------------------------------------
2023-10-13 15:39:56,255 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-13 15:40:07,445 DEV : loss 0.21117821335792542 - f1-score (micro avg) 0.8314
2023-10-13 15:40:07,474 saving best model
2023-10-13 15:40:08,082 ----------------------------------------------------------------------------------------------------
2023-10-13 15:40:14,953 epoch 10 - iter 147/1476 - loss 0.00930289 - time (sec): 6.86 - samples/sec: 2352.47 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:40:24,876 epoch 10 - iter 294/1476 - loss 0.00642253 - time (sec): 16.79 - samples/sec: 2124.43 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:40:31,972 epoch 10 - iter 441/1476 - loss 0.00573633 - time (sec): 23.88 - samples/sec: 2164.66 - lr: 0.000002 - momentum: 0.000000
2023-10-13 15:40:38,711 epoch 10 - iter 588/1476 - loss 0.00534887 - time (sec): 30.62 - samples/sec: 2199.43 - lr: 0.000002 - momentum: 0.000000
2023-10-13 15:40:45,486 epoch 10 - iter 735/1476 - loss 0.00493565 - time (sec): 37.40 - samples/sec: 2211.92 - lr: 0.000002 - momentum: 0.000000
2023-10-13 15:40:52,210 epoch 10 - iter 882/1476 - loss 0.00519681 - time (sec): 44.12 - samples/sec: 2220.79 - lr: 0.000001 - momentum: 0.000000
2023-10-13 15:40:59,407 epoch 10 - iter 1029/1476 - loss 0.00566250 - time (sec): 51.32 - samples/sec: 2248.33 - lr: 0.000001 - momentum: 0.000000
2023-10-13 15:41:06,385 epoch 10 - iter 1176/1476 - loss 0.00599606 - time (sec): 58.30
- samples/sec: 2261.89 - lr: 0.000001 - momentum: 0.000000
2023-10-13 15:41:13,308 epoch 10 - iter 1323/1476 - loss 0.00567523 - time (sec): 65.22 - samples/sec: 2267.40 - lr: 0.000000 - momentum: 0.000000
2023-10-13 15:41:21,239 epoch 10 - iter 1470/1476 - loss 0.00589442 - time (sec): 73.15 - samples/sec: 2269.93 - lr: 0.000000 - momentum: 0.000000
2023-10-13 15:41:21,549 ----------------------------------------------------------------------------------------------------
2023-10-13 15:41:21,549 EPOCH 10 done: loss 0.0059 - lr: 0.000000
2023-10-13 15:41:33,106 DEV : loss 0.2184198498725891 - f1-score (micro avg) 0.8284
2023-10-13 15:41:33,638 ----------------------------------------------------------------------------------------------------
2023-10-13 15:41:33,640 Loading model from best epoch ...
2023-10-13 15:41:35,214 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 15:41:41,298
Results:
- F-score (micro) 0.7927
- F-score (macro) 0.6875
- Accuracy 0.6802

By class:
              precision    recall  f1-score   support

         loc     0.8710    0.8660    0.8685       858
        pers     0.7522    0.7970    0.7740       537
         org     0.5127    0.6136    0.5586       132
        prod     0.6724    0.6393    0.6555        61
        time     0.5397    0.6296    0.5812        54

   micro avg     0.7790    0.8069    0.7927      1642
   macro avg     0.6696    0.7091    0.6875      1642
weighted avg     0.7851    0.8069    0.7953      1642

2023-10-13 15:41:41,299 ----------------------------------------------------------------------------------------------------