2023-10-17 21:48:44,320 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,321 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 21:48:44,321 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,321 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 21:48:44,321 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,321 Train: 5901 sentences
2023-10-17 21:48:44,321 (train_with_dev=False, train_with_test=False)
2023-10-17 21:48:44,321 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,322 Training Params:
2023-10-17 21:48:44,322  - learning_rate: "3e-05"
2023-10-17 21:48:44,322  - mini_batch_size: "4"
2023-10-17 21:48:44,322  - max_epochs: "10"
2023-10-17 21:48:44,322  - shuffle: "True"
2023-10-17 21:48:44,322 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,322 Plugins:
2023-10-17 21:48:44,322  - TensorboardLogger
2023-10-17 21:48:44,322  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 21:48:44,322 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,322 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 21:48:44,322  - metric: "('micro avg', 'f1-score')"
2023-10-17 21:48:44,322 ----------------------------------------------------------------------------------------------------
2023-10-17 21:48:44,322 Computation:
2023-10-17 21:48:44,322  - compute on device: cuda:0
2023-10-17 21:48:44,322  - embedding storage: none
2023-10-17 21:48:44,322 ----------------------------------------------------------------------------------------------------
"hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-17 21:48:44,322 ---------------------------------------------------------------------------------------------------- 2023-10-17 21:48:44,322 ---------------------------------------------------------------------------------------------------- 2023-10-17 21:48:44,322 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 21:48:52,215 epoch 1 - iter 147/1476 - loss 3.21553891 - time (sec): 7.89 - samples/sec: 2054.82 - lr: 0.000003 - momentum: 0.000000 2023-10-17 21:49:00,069 epoch 1 - iter 294/1476 - loss 1.82928163 - time (sec): 15.75 - samples/sec: 2288.06 - lr: 0.000006 - momentum: 0.000000 2023-10-17 21:49:07,291 epoch 1 - iter 441/1476 - loss 1.40421229 - time (sec): 22.97 - samples/sec: 2268.93 - lr: 0.000009 - momentum: 0.000000 2023-10-17 21:49:14,183 epoch 1 - iter 588/1476 - loss 1.17671724 - time (sec): 29.86 - samples/sec: 2242.25 - lr: 0.000012 - momentum: 0.000000 2023-10-17 21:49:21,418 epoch 1 - iter 735/1476 - loss 1.00461301 - time (sec): 37.09 - samples/sec: 2239.65 - lr: 0.000015 - momentum: 0.000000 2023-10-17 21:49:29,052 epoch 1 - iter 882/1476 - loss 0.87623716 - time (sec): 44.73 - samples/sec: 2269.88 - lr: 0.000018 - momentum: 0.000000 2023-10-17 21:49:36,057 epoch 1 - iter 1029/1476 - loss 0.78397459 - time (sec): 51.73 - samples/sec: 2267.14 - lr: 0.000021 - momentum: 0.000000 2023-10-17 21:49:43,075 epoch 1 - iter 1176/1476 - loss 0.71570404 - time (sec): 58.75 - samples/sec: 2262.64 - lr: 0.000024 - momentum: 0.000000 2023-10-17 21:49:49,813 epoch 1 - iter 1323/1476 - loss 0.65939395 - time (sec): 65.49 - samples/sec: 2273.76 - lr: 0.000027 - momentum: 0.000000 2023-10-17 21:49:56,543 epoch 1 - iter 1470/1476 - loss 0.61090665 - time (sec): 72.22 - samples/sec: 2292.81 - lr: 0.000030 - momentum: 0.000000 2023-10-17 21:49:56,815 ---------------------------------------------------------------------------------------------------- 2023-10-17 21:49:56,815 EPOCH 1 done: loss 0.6087 - lr: 0.000030 2023-10-17 21:50:03,163 DEV : loss 0.11746251583099365 - f1-score (micro avg) 0.7497 2023-10-17 21:50:03,193 saving best model 2023-10-17 21:50:03,639 ---------------------------------------------------------------------------------------------------- 2023-10-17 21:50:10,801 epoch 2 - iter 147/1476 - loss 0.15813511 - time (sec): 7.16 - samples/sec: 2256.78 - lr: 0.000030 - momentum: 0.000000 2023-10-17 21:50:18,227 epoch 2 - iter 294/1476 - loss 0.15426334 - time (sec): 14.59 - samples/sec: 2368.85 - lr: 0.000029 - momentum: 0.000000 2023-10-17 21:50:25,596 epoch 2 - iter 441/1476 - loss 0.14926787 - time (sec): 21.95 - samples/sec: 2327.63 - lr: 0.000029 - momentum: 0.000000 2023-10-17 21:50:32,544 epoch 2 - iter 588/1476 - loss 0.14462305 - time (sec): 28.90 - samples/sec: 2312.96 - lr: 0.000029 - momentum: 0.000000 2023-10-17 21:50:39,870 epoch 2 - iter 735/1476 - loss 0.13899207 - time (sec): 36.23 - samples/sec: 2322.65 - lr: 0.000028 - momentum: 0.000000 2023-10-17 21:50:47,131 epoch 2 - iter 882/1476 - loss 0.13780348 - time (sec): 43.49 - samples/sec: 2327.09 - lr: 0.000028 - momentum: 0.000000 2023-10-17 21:50:54,421 epoch 2 - iter 1029/1476 - loss 0.13394971 - time (sec): 50.78 - samples/sec: 2333.26 - lr: 0.000028 - momentum: 0.000000 2023-10-17 21:51:01,785 epoch 2 - iter 1176/1476 - loss 0.13121195 - time (sec): 58.14 - samples/sec: 2317.18 - lr: 0.000027 - momentum: 0.000000 2023-10-17 
2023-10-17 21:51:08,880 epoch 2 - iter 1323/1476 - loss 0.13009309 - time (sec): 65.24 - samples/sec: 2309.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:51:16,070 epoch 2 - iter 1470/1476 - loss 0.12888252 - time (sec): 72.43 - samples/sec: 2289.73 - lr: 0.000027 - momentum: 0.000000
2023-10-17 21:51:16,340 ----------------------------------------------------------------------------------------------------
2023-10-17 21:51:16,341 EPOCH 2 done: loss 0.1290 - lr: 0.000027
2023-10-17 21:51:27,700 DEV : loss 0.13118185102939606 - f1-score (micro avg) 0.8205
2023-10-17 21:51:27,731 saving best model
2023-10-17 21:51:28,302 ----------------------------------------------------------------------------------------------------
2023-10-17 21:51:35,230 epoch 3 - iter 147/1476 - loss 0.08428630 - time (sec): 6.93 - samples/sec: 2348.59 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:51:42,235 epoch 3 - iter 294/1476 - loss 0.09269462 - time (sec): 13.93 - samples/sec: 2356.80 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:51:49,369 epoch 3 - iter 441/1476 - loss 0.09367626 - time (sec): 21.07 - samples/sec: 2351.19 - lr: 0.000026 - momentum: 0.000000
2023-10-17 21:51:56,599 epoch 3 - iter 588/1476 - loss 0.09126280 - time (sec): 28.29 - samples/sec: 2348.58 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:52:03,877 epoch 3 - iter 735/1476 - loss 0.08799231 - time (sec): 35.57 - samples/sec: 2365.87 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:52:11,036 epoch 3 - iter 882/1476 - loss 0.09015570 - time (sec): 42.73 - samples/sec: 2357.46 - lr: 0.000025 - momentum: 0.000000
2023-10-17 21:52:18,050 epoch 3 - iter 1029/1476 - loss 0.08819344 - time (sec): 49.75 - samples/sec: 2327.38 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:52:25,561 epoch 3 - iter 1176/1476 - loss 0.08449557 - time (sec): 57.26 - samples/sec: 2335.19 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:52:32,715 epoch 3 - iter 1323/1476 - loss 0.08531470 - time (sec): 64.41 - samples/sec: 2330.10 - lr: 0.000024 - momentum: 0.000000
2023-10-17 21:52:39,597 epoch 3 - iter 1470/1476 - loss 0.08419234 - time (sec): 71.29 - samples/sec: 2325.31 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:52:39,874 ----------------------------------------------------------------------------------------------------
2023-10-17 21:52:39,874 EPOCH 3 done: loss 0.0843 - lr: 0.000023
2023-10-17 21:52:51,115 DEV : loss 0.16755998134613037 - f1-score (micro avg) 0.8279
2023-10-17 21:52:51,145 saving best model
2023-10-17 21:52:51,681 ----------------------------------------------------------------------------------------------------
2023-10-17 21:52:58,821 epoch 4 - iter 147/1476 - loss 0.04505536 - time (sec): 7.14 - samples/sec: 2204.94 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:53:05,784 epoch 4 - iter 294/1476 - loss 0.05272174 - time (sec): 14.10 - samples/sec: 2309.05 - lr: 0.000023 - momentum: 0.000000
2023-10-17 21:53:13,188 epoch 4 - iter 441/1476 - loss 0.04762974 - time (sec): 21.50 - samples/sec: 2351.47 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:53:20,374 epoch 4 - iter 588/1476 - loss 0.04819030 - time (sec): 28.69 - samples/sec: 2305.59 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:53:27,155 epoch 4 - iter 735/1476 - loss 0.04871004 - time (sec): 35.47 - samples/sec: 2273.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 21:53:34,314 epoch 4 - iter 882/1476 - loss 0.05424007 - time (sec): 42.63 - samples/sec: 2263.05 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:53:42,060 epoch 4 - iter 1029/1476 - loss 0.05865699 - time (sec): 50.38 - samples/sec: 2308.28 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:53:49,002 epoch 4 - iter 1176/1476 - loss 0.05889975 - time (sec): 57.32 - samples/sec: 2287.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 21:53:56,692 epoch 4 - iter 1323/1476 - loss 0.05913293 - time (sec): 65.01 - samples/sec: 2290.30 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:54:03,864 epoch 4 - iter 1470/1476 - loss 0.05991994 - time (sec): 72.18 - samples/sec: 2298.06 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:54:04,127 ----------------------------------------------------------------------------------------------------
2023-10-17 21:54:04,127 EPOCH 4 done: loss 0.0598 - lr: 0.000020
2023-10-17 21:54:15,616 DEV : loss 0.17331284284591675 - f1-score (micro avg) 0.8302
2023-10-17 21:54:15,670 saving best model
2023-10-17 21:54:16,260 ----------------------------------------------------------------------------------------------------
2023-10-17 21:54:23,152 epoch 5 - iter 147/1476 - loss 0.03359008 - time (sec): 6.89 - samples/sec: 2257.85 - lr: 0.000020 - momentum: 0.000000
2023-10-17 21:54:30,014 epoch 5 - iter 294/1476 - loss 0.03632772 - time (sec): 13.75 - samples/sec: 2278.23 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:54:37,186 epoch 5 - iter 441/1476 - loss 0.03568507 - time (sec): 20.92 - samples/sec: 2316.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:54:44,249 epoch 5 - iter 588/1476 - loss 0.03760278 - time (sec): 27.99 - samples/sec: 2266.87 - lr: 0.000019 - momentum: 0.000000
2023-10-17 21:54:51,594 epoch 5 - iter 735/1476 - loss 0.04078622 - time (sec): 35.33 - samples/sec: 2287.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:54:59,589 epoch 5 - iter 882/1476 - loss 0.04255157 - time (sec): 43.33 - samples/sec: 2336.51 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:55:06,590 epoch 5 - iter 1029/1476 - loss 0.04205174 - time (sec): 50.33 - samples/sec: 2312.12 - lr: 0.000018 - momentum: 0.000000
2023-10-17 21:55:13,557 epoch 5 - iter 1176/1476 - loss 0.04118599 - time (sec): 57.29 - samples/sec: 2318.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:55:20,931 epoch 5 - iter 1323/1476 - loss 0.04130013 - time (sec): 64.67 - samples/sec: 2321.54 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:55:27,776 epoch 5 - iter 1470/1476 - loss 0.04011977 - time (sec): 71.51 - samples/sec: 2319.72 - lr: 0.000017 - momentum: 0.000000
2023-10-17 21:55:28,042 ----------------------------------------------------------------------------------------------------
2023-10-17 21:55:28,042 EPOCH 5 done: loss 0.0401 - lr: 0.000017
2023-10-17 21:55:39,436 DEV : loss 0.20299148559570312 - f1-score (micro avg) 0.8369
2023-10-17 21:55:39,472 saving best model
2023-10-17 21:55:40,011 ----------------------------------------------------------------------------------------------------
2023-10-17 21:55:47,241 epoch 6 - iter 147/1476 - loss 0.02601882 - time (sec): 7.22 - samples/sec: 2381.67 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:55:54,128 epoch 6 - iter 294/1476 - loss 0.02904149 - time (sec): 14.11 - samples/sec: 2285.03 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:56:01,426 epoch 6 - iter 441/1476 - loss 0.02853494 - time (sec): 21.41 - samples/sec: 2257.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 21:56:08,396 epoch 6 - iter 588/1476 - loss 0.02496954 - time (sec): 28.38 - samples/sec: 2243.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:56:15,336 epoch 6 - iter 735/1476 - loss 0.02547030 - time (sec): 35.32 - samples/sec: 2262.51 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:56:22,689 epoch 6 - iter 882/1476 - loss 0.02698140 - time (sec): 42.67 - samples/sec: 2278.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 21:56:30,040 epoch 6 - iter 1029/1476 - loss 0.02674758 - time (sec): 50.02 - samples/sec: 2278.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:56:37,097 epoch 6 - iter 1176/1476 - loss 0.02770049 - time (sec): 57.08 - samples/sec: 2283.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:56:44,374 epoch 6 - iter 1323/1476 - loss 0.02898319 - time (sec): 64.36 - samples/sec: 2291.76 - lr: 0.000014 - momentum: 0.000000
2023-10-17 21:56:51,525 epoch 6 - iter 1470/1476 - loss 0.02874026 - time (sec): 71.51 - samples/sec: 2288.88 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:56:52,143 ----------------------------------------------------------------------------------------------------
2023-10-17 21:56:52,144 EPOCH 6 done: loss 0.0305 - lr: 0.000013
2023-10-17 21:57:03,761 DEV : loss 0.20546814799308777 - f1-score (micro avg) 0.8283
2023-10-17 21:57:03,799 ----------------------------------------------------------------------------------------------------
2023-10-17 21:57:11,091 epoch 7 - iter 147/1476 - loss 0.01863009 - time (sec): 7.29 - samples/sec: 2273.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:57:18,293 epoch 7 - iter 294/1476 - loss 0.01506715 - time (sec): 14.49 - samples/sec: 2309.46 - lr: 0.000013 - momentum: 0.000000
2023-10-17 21:57:25,362 epoch 7 - iter 441/1476 - loss 0.01637533 - time (sec): 21.56 - samples/sec: 2274.18 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:57:32,712 epoch 7 - iter 588/1476 - loss 0.01718266 - time (sec): 28.91 - samples/sec: 2305.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:57:39,788 epoch 7 - iter 735/1476 - loss 0.01626465 - time (sec): 35.99 - samples/sec: 2311.05 - lr: 0.000012 - momentum: 0.000000
2023-10-17 21:57:47,342 epoch 7 - iter 882/1476 - loss 0.01727661 - time (sec): 43.54 - samples/sec: 2277.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:57:54,472 epoch 7 - iter 1029/1476 - loss 0.01684576 - time (sec): 50.67 - samples/sec: 2294.87 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:58:01,562 epoch 7 - iter 1176/1476 - loss 0.01900031 - time (sec): 57.76 - samples/sec: 2293.35 - lr: 0.000011 - momentum: 0.000000
2023-10-17 21:58:09,016 epoch 7 - iter 1323/1476 - loss 0.01887086 - time (sec): 65.22 - samples/sec: 2305.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:58:16,034 epoch 7 - iter 1470/1476 - loss 0.01884465 - time (sec): 72.23 - samples/sec: 2292.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:58:16,349 ----------------------------------------------------------------------------------------------------
2023-10-17 21:58:16,349 EPOCH 7 done: loss 0.0188 - lr: 0.000010
2023-10-17 21:58:28,137 DEV : loss 0.20466740429401398 - f1-score (micro avg) 0.8393
2023-10-17 21:58:28,183 saving best model
2023-10-17 21:58:28,790 ----------------------------------------------------------------------------------------------------
2023-10-17 21:58:36,102 epoch 8 - iter 147/1476 - loss 0.01243626 - time (sec): 7.31 - samples/sec: 2301.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 21:58:43,165 epoch 8 - iter 294/1476 - loss 0.01391050 - time (sec): 14.37 - samples/sec: 2379.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:58:50,392 epoch 8 - iter 441/1476 - loss 0.01500665 - time (sec): 21.60 - samples/sec: 2397.81 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:58:57,536 epoch 8 - iter 588/1476 - loss 0.01378980 - time (sec): 28.74 - samples/sec: 2400.67 - lr: 0.000009 - momentum: 0.000000
2023-10-17 21:59:04,538 epoch 8 - iter 735/1476 - loss 0.01391589 - time (sec): 35.75 - samples/sec: 2360.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:59:11,626 epoch 8 - iter 882/1476 - loss 0.01343908 - time (sec): 42.83 - samples/sec: 2330.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:59:18,850 epoch 8 - iter 1029/1476 - loss 0.01351983 - time (sec): 50.06 - samples/sec: 2331.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 21:59:25,750 epoch 8 - iter 1176/1476 - loss 0.01341649 - time (sec): 56.96 - samples/sec: 2314.96 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:59:32,657 epoch 8 - iter 1323/1476 - loss 0.01314697 - time (sec): 63.87 - samples/sec: 2309.97 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:59:40,165 epoch 8 - iter 1470/1476 - loss 0.01308090 - time (sec): 71.37 - samples/sec: 2324.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 21:59:40,440 ----------------------------------------------------------------------------------------------------
2023-10-17 21:59:40,440 EPOCH 8 done: loss 0.0130 - lr: 0.000007
2023-10-17 21:59:52,612 DEV : loss 0.20943744480609894 - f1-score (micro avg) 0.8488
2023-10-17 21:59:52,650 saving best model
2023-10-17 21:59:53,234 ----------------------------------------------------------------------------------------------------
2023-10-17 22:00:00,328 epoch 9 - iter 147/1476 - loss 0.00953557 - time (sec): 7.09 - samples/sec: 2370.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:00:08,201 epoch 9 - iter 294/1476 - loss 0.00763867 - time (sec): 14.96 - samples/sec: 2348.79 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:00:15,710 epoch 9 - iter 441/1476 - loss 0.01052035 - time (sec): 22.47 - samples/sec: 2390.84 - lr: 0.000006 - momentum: 0.000000
2023-10-17 22:00:23,177 epoch 9 - iter 588/1476 - loss 0.01054438 - time (sec): 29.94 - samples/sec: 2405.81 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:00:30,492 epoch 9 - iter 735/1476 - loss 0.01163723 - time (sec): 37.26 - samples/sec: 2371.56 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:00:37,497 epoch 9 - iter 882/1476 - loss 0.01067846 - time (sec): 44.26 - samples/sec: 2352.64 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:00:44,690 epoch 9 - iter 1029/1476 - loss 0.00995912 - time (sec): 51.45 - samples/sec: 2338.83 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:00:52,032 epoch 9 - iter 1176/1476 - loss 0.00976714 - time (sec): 58.79 - samples/sec: 2306.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:00:59,128 epoch 9 - iter 1323/1476 - loss 0.00967883 - time (sec): 65.89 - samples/sec: 2286.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 22:01:05,985 epoch 9 - iter 1470/1476 - loss 0.00946124 - time (sec): 72.75 - samples/sec: 2279.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:01:06,248 ----------------------------------------------------------------------------------------------------
2023-10-17 22:01:06,248 EPOCH 9 done: loss 0.0094 - lr: 0.000003
2023-10-17 22:01:17,704 DEV : loss 0.21988801658153534 - f1-score (micro avg) 0.8454
2023-10-17 22:01:17,736 ----------------------------------------------------------------------------------------------------
2023-10-17 22:01:24,951 epoch 10 - iter 147/1476 - loss 0.00447608 - time (sec): 7.21 - samples/sec: 2374.08 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:01:32,255 epoch 10 - iter 294/1476 - loss 0.00883325 - time (sec): 14.52 - samples/sec: 2369.45 - lr: 0.000003 - momentum: 0.000000
2023-10-17 22:01:39,367 epoch 10 - iter 441/1476 - loss 0.00766936 - time (sec): 21.63 - samples/sec: 2332.40 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:01:46,573 epoch 10 - iter 588/1476 - loss 0.00647105 - time (sec): 28.84 - samples/sec: 2299.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:01:53,832 epoch 10 - iter 735/1476 - loss 0.00639483 - time (sec): 36.10 - samples/sec: 2309.00 - lr: 0.000002 - momentum: 0.000000
2023-10-17 22:02:00,794 epoch 10 - iter 882/1476 - loss 0.00613195 - time (sec): 43.06 - samples/sec: 2293.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:02:07,987 epoch 10 - iter 1029/1476 - loss 0.00564404 - time (sec): 50.25 - samples/sec: 2298.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:02:15,173 epoch 10 - iter 1176/1476 - loss 0.00570421 - time (sec): 57.44 - samples/sec: 2280.26 - lr: 0.000001 - momentum: 0.000000
2023-10-17 22:02:22,909 epoch 10 - iter 1323/1476 - loss 0.00608918 - time (sec): 65.17 - samples/sec: 2317.76 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:02:29,700 epoch 10 - iter 1470/1476 - loss 0.00720383 - time (sec): 71.96 - samples/sec: 2304.00 - lr: 0.000000 - momentum: 0.000000
2023-10-17 22:02:29,971 ----------------------------------------------------------------------------------------------------
2023-10-17 22:02:29,971 EPOCH 10 done: loss 0.0072 - lr: 0.000000
2023-10-17 22:02:41,352 DEV : loss 0.2181602418422699 - f1-score (micro avg) 0.8512
2023-10-17 22:02:41,382 saving best model
2023-10-17 22:02:42,401 ----------------------------------------------------------------------------------------------------
2023-10-17 22:02:42,402 Loading model from best epoch ...
2023-10-17 22:02:43,873 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 22:02:50,897 Results:
- F-score (micro) 0.816
- F-score (macro) 0.7265
- Accuracy 0.7086

By class:
              precision    recall  f1-score   support

         loc     0.8704    0.8846    0.8775       858
        pers     0.7743    0.8305    0.8014       537
         org     0.6000    0.5909    0.5954       132
        prod     0.7377    0.7377    0.7377        61
        time     0.5806    0.6667    0.6207        54

   micro avg     0.8019    0.8307    0.8160      1642
   macro avg     0.7126    0.7421    0.7265      1642
weighted avg     0.8028    0.8307    0.8163      1642

2023-10-17 22:02:50,897 ----------------------------------------------------------------------------------------------------
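The run logged above can be approximated with Flair's fine-tuning API. The sketch below is a minimal reconstruction from the logged configuration only (hipe2020/fr corpus, mini-batch size 4, 10 epochs, learning rate 3e-05, first-subtoken pooling, last transformer layer, no CRF); the embedding checkpoint name "hmteams/teams-base-historic-multilingual-discriminator" is inferred from the training base path rather than logged explicitly, and the exact NER_HIPE_2022 loader arguments may differ between Flair versions.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus: HIPE-2022 release of the hipe2020 French data
# (5901 train / 1287 dev / 1505 test sentences in this log).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Embeddings: ELECTRA-style discriminator, last layer only, first-subtoken pooling,
# fine-tuned end to end (checkpoint name is an assumption inferred from the base path).
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Tagger: plain linear head on the transformer output, no CRF, no RNN,
# matching the logged architecture (LockedDropout(0.5) + Linear(768 -> 21)).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear schedule and 10% warmup by default,
# which corresponds to the logged LinearScheduler plugin (warmup_fraction 0.1).
# The TensorboardLogger plugin from the log is omitted here for brevity.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)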
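Once training finishes, the checkpoint written by the "saving best model" steps above (best-model.pt under the training base path) can be loaded for inference; the 21-tag BIOES dictionary over loc, pers, org, prod and time reported after "Loading model from best epoch" is what predict() emits. A minimal usage sketch, with the path taken from this log and an illustrative sentence that is not from the corpus:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the best checkpoint saved during the run (path assumed from this log).
tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Tag an example French sentence.
sentence = Sentence("Victor Hugo est né à Besançon en 1802 .")
tagger.predict(sentence)

# Print predicted entity spans with their labels and confidence scores.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)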