2023-10-14 20:15:14,574 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,575 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,575 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,575 Train: 14465 sentences
2023-10-14 20:15:14,576 (train_with_dev=False, train_with_test=False)
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Training Params:
2023-10-14 20:15:14,576 - learning_rate: "3e-05"
2023-10-14 20:15:14,576 - mini_batch_size: "4"
2023-10-14 20:15:14,576 - max_epochs: "10"
2023-10-14 20:15:14,576 - shuffle: "True"
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Plugins:
2023-10-14 20:15:14,576 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 20:15:14,576 - metric: "('micro avg', 'f1-score')"
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Computation:
2023-10-14 20:15:14,576 - compute on device: cuda:0
2023-10-14 20:15:14,576 - embedding storage: none
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:15:30,762 epoch 1 - iter 361/3617 - loss 1.47398781 - time (sec): 16.18 - samples/sec: 2325.55 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:15:47,116 epoch 1 - iter 722/3617 - loss 0.83970179 - time (sec): 32.54 - samples/sec: 2319.29 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:16:03,219 epoch 1 - iter 1083/3617 - loss 0.62059022 - time (sec): 48.64 - samples/sec: 2290.84 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:16:19,390 epoch 1 - iter 1444/3617 - loss 0.50223969 - time (sec): 64.81 - samples/sec: 2291.76 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:16:35,953 epoch 1 - iter 1805/3617 - loss 0.42436958 - time (sec): 81.38 - samples/sec: 2313.82 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:16:52,015 epoch 1 - iter 2166/3617 - loss 0.37437638 - time (sec): 97.44 - samples/sec: 2319.20 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:17:07,780 epoch 1 - iter 2527/3617 - loss 0.33741262 - time (sec): 113.20 - samples/sec: 2348.50 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:17:23,796 epoch 1 - iter 2888/3617 - loss 0.31042790 - time (sec): 129.22 - samples/sec: 2349.52 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:17:39,981 epoch 1 - iter 3249/3617 - loss 0.28794148 - time (sec): 145.40 - samples/sec: 2345.55 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:17:56,334 epoch 1 - iter 3610/3617 - loss 0.27061320 - time (sec): 161.76 - samples/sec: 2344.14 - lr: 0.000030 - momentum: 0.000000
2023-10-14 20:17:56,632 ----------------------------------------------------------------------------------------------------
2023-10-14 20:17:56,633 EPOCH 1 done: loss 0.2704 - lr: 0.000030
2023-10-14 20:18:02,040 DEV : loss 0.11999412626028061 - f1-score (micro avg) 0.6234
2023-10-14 20:18:02,080 saving best model
2023-10-14 20:18:02,475 ----------------------------------------------------------------------------------------------------
2023-10-14 20:18:21,558 epoch 2 - iter 361/3617 - loss 0.09782984 - time (sec): 19.08 - samples/sec: 2009.32 - lr: 0.000030 - momentum: 0.000000
2023-10-14 20:18:39,215 epoch 2 - iter 722/3617 - loss 0.09421131 - time (sec): 36.74 - samples/sec: 2056.11 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:18:55,988 epoch 2 - iter 1083/3617 - loss 0.09577052 - time (sec): 53.51 - samples/sec: 2106.61 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:19:13,211 epoch 2 - iter 1444/3617 - loss 0.09595965 - time (sec): 70.73 - samples/sec: 2158.23 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:19:29,358 epoch 2 - iter 1805/3617 - loss 0.09590618 - time (sec): 86.88 - samples/sec: 2185.24 - lr: 0.000028 - momentum: 0.000000
2023-10-14 20:19:46,839 epoch 2 - iter 2166/3617 - loss 0.09403509 - time (sec): 104.36 - samples/sec: 2192.52 - lr: 0.000028 - momentum: 0.000000
2023-10-14 20:20:03,705 epoch 2 - iter 2527/3617 - loss 0.09439687 - time (sec): 121.23 - samples/sec: 2211.93 - lr: 0.000028 - momentum: 0.000000
2023-10-14 20:20:20,276 epoch 2 - iter 2888/3617 - loss 0.09538483 - time (sec): 137.80 - samples/sec: 2216.77 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:20:36,508 epoch 2 - iter 3249/3617 - loss 0.09475508 - time (sec): 154.03 - samples/sec: 2215.51 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:20:55,584 epoch 2 - iter 3610/3617 - loss 0.09527647 - time (sec): 173.11 - samples/sec: 2191.11 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:20:55,956 ----------------------------------------------------------------------------------------------------
2023-10-14 20:20:55,956 EPOCH 2 done: loss 0.0953 - lr: 0.000027
2023-10-14 20:21:02,854 DEV : loss 0.12750780582427979 - f1-score (micro avg) 0.6294
2023-10-14 20:21:02,888 saving best model
2023-10-14 20:21:03,598 ----------------------------------------------------------------------------------------------------
2023-10-14 20:21:22,467 epoch 3 - iter 361/3617 - loss 0.05413118 - time (sec): 18.87 - samples/sec: 1959.23 - lr: 0.000026 - momentum: 0.000000
2023-10-14 20:21:41,460 epoch 3 - iter 722/3617 - loss 0.06427074 - time (sec): 37.86 - samples/sec: 1973.69 - lr: 0.000026 - momentum: 0.000000
2023-10-14 20:21:57,897 epoch 3 - iter 1083/3617 - loss 0.07334315 - time (sec): 54.30 - samples/sec: 2076.57 - lr: 0.000026 - momentum: 0.000000
2023-10-14 20:22:14,087 epoch 3 - iter 1444/3617 - loss 0.07359621 - time (sec): 70.49 - samples/sec: 2131.15 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:22:30,386 epoch 3 - iter 1805/3617 - loss 0.07222582 - time (sec): 86.79 - samples/sec: 2165.52 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:22:46,657 epoch 3 - iter 2166/3617 - loss 0.07167594 - time (sec): 103.06 - samples/sec: 2195.87 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:23:03,231 epoch 3 - iter 2527/3617 - loss 0.07168900 - time (sec): 119.63 - samples/sec: 2219.22 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:23:19,692 epoch 3 - iter 2888/3617 - loss 0.07149544 - time (sec): 136.09 - samples/sec: 2230.31 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:23:36,017 epoch 3 - iter 3249/3617 - loss 0.07253118 - time (sec): 152.42 - samples/sec: 2240.40 - lr: 0.000024 - momentum: 0.000000
2023-10-14 20:23:52,572 epoch 3 - iter 3610/3617 - loss 0.07273065 - time (sec): 168.97 - samples/sec: 2244.48 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:23:52,880 ----------------------------------------------------------------------------------------------------
2023-10-14 20:23:52,880 EPOCH 3 done: loss 0.0727 - lr: 0.000023
2023-10-14 20:23:59,342 DEV : loss 0.23288682103157043 - f1-score (micro avg) 0.6258
2023-10-14 20:23:59,373 ----------------------------------------------------------------------------------------------------
2023-10-14 20:24:15,633 epoch 4 - iter 361/3617 - loss 0.05027835 - time (sec): 16.26 - samples/sec: 2260.02 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:24:32,140 epoch 4 - iter 722/3617 - loss 0.05104940 - time (sec): 32.77 - samples/sec: 2293.15 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:24:48,509 epoch 4 - iter 1083/3617 - loss 0.04972765 - time (sec): 49.13 - samples/sec: 2297.62 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:25:04,840 epoch 4 - iter 1444/3617 - loss 0.04948816 - time (sec): 65.47 - samples/sec: 2300.32 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:25:21,353 epoch 4 - iter 1805/3617 - loss 0.04959231 - time (sec): 81.98 - samples/sec: 2316.09 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:25:37,665 epoch 4 - iter 2166/3617 - loss 0.05039825 - time (sec): 98.29 - samples/sec: 2325.84 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:25:53,888 epoch 4 - iter 2527/3617 - loss 0.05098071 - time (sec): 114.51 - samples/sec: 2324.83 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:26:09,988 epoch 4 - iter 2888/3617 - loss 0.05275888 - time (sec): 130.61 - samples/sec: 2333.25 - lr: 0.000021 - momentum: 0.000000
2023-10-14 20:26:26,056 epoch 4 - iter 3249/3617 - loss 0.05280906 - time (sec): 146.68 - samples/sec: 2333.90 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:26:42,216 epoch 4 - iter 3610/3617 - loss 0.05251226 - time (sec): 162.84 - samples/sec: 2328.47 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:26:42,519 ----------------------------------------------------------------------------------------------------
2023-10-14 20:26:42,519 EPOCH 4 done: loss 0.0524 - lr: 0.000020
2023-10-14 20:26:48,265 DEV : loss 0.29611918330192566 - f1-score (micro avg) 0.6115
2023-10-14 20:26:48,298 ----------------------------------------------------------------------------------------------------
2023-10-14 20:27:05,568 epoch 5 - iter 361/3617 - loss 0.04203166 - time (sec): 17.27 - samples/sec: 2138.47 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:27:21,925 epoch 5 - iter 722/3617 - loss 0.03700812 - time (sec): 33.63 - samples/sec: 2268.87 - lr: 0.000019 - momentum: 0.000000
2023-10-14 20:27:38,355 epoch 5 - iter 1083/3617 - loss 0.03554194 - time (sec): 50.06 - samples/sec: 2282.52 - lr: 0.000019 - momentum: 0.000000
2023-10-14 20:27:54,786 epoch 5 - iter 1444/3617 - loss 0.03570610 - time (sec): 66.49 - samples/sec: 2278.40 - lr: 0.000019 - momentum: 0.000000
2023-10-14 20:28:11,254 epoch 5 - iter 1805/3617 - loss 0.03486095 - time (sec): 82.95 - samples/sec: 2272.79 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:28:27,973 epoch 5 - iter 2166/3617 - loss 0.03550990 - time (sec): 99.67 - samples/sec: 2291.81 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:28:44,336 epoch 5 - iter 2527/3617 - loss 0.03627469 - time (sec): 116.04 - samples/sec: 2296.38 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:29:00,528 epoch 5 - iter 2888/3617 - loss 0.03591614 - time (sec): 132.23 - samples/sec: 2303.16 - lr: 0.000017 - momentum: 0.000000
2023-10-14 20:29:16,672 epoch 5 - iter 3249/3617 - loss 0.03711630 - time (sec): 148.37 - samples/sec: 2304.33 - lr: 0.000017 - momentum: 0.000000
2023-10-14 20:29:32,867 epoch 5 - iter 3610/3617 - loss 0.03676579 - time (sec): 164.57 - samples/sec: 2303.59 - lr: 0.000017 - momentum: 0.000000
2023-10-14 20:29:33,174 ----------------------------------------------------------------------------------------------------
2023-10-14 20:29:33,174 EPOCH 5 done: loss 0.0367 - lr: 0.000017
2023-10-14 20:29:38,931 DEV : loss 0.30095481872558594 - f1-score (micro avg) 0.6255
2023-10-14 20:29:38,964 ----------------------------------------------------------------------------------------------------
2023-10-14 20:29:55,577 epoch 6 - iter 361/3617 - loss 0.02879097 - time (sec): 16.61 - samples/sec: 2327.19 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:30:11,985 epoch 6 - iter 722/3617 - loss 0.02283689 - time (sec): 33.02 - samples/sec: 2300.73 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:30:28,425 epoch 6 - iter 1083/3617 - loss 0.02449134 - time (sec): 49.46 - samples/sec: 2308.91 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:30:44,812 epoch 6 - iter 1444/3617 - loss 0.02459281 - time (sec): 65.85 - samples/sec: 2288.93 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:31:01,172 epoch 6 - iter 1805/3617 - loss 0.02646233 - time (sec): 82.21 - samples/sec: 2285.13 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:31:17,640 epoch 6 - iter 2166/3617 - loss 0.02632674 - time (sec): 98.67 - samples/sec: 2283.10 - lr: 0.000015 - momentum: 0.000000
2023-10-14 20:31:34,002 epoch 6 - iter 2527/3617 - loss 0.02565887 - time (sec): 115.04 - samples/sec: 2284.73 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:31:50,346 epoch 6 - iter 2888/3617 - loss 0.02507191 - time (sec): 131.38 - samples/sec: 2294.77 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:32:06,871 epoch 6 - iter 3249/3617 - loss 0.02538127 - time (sec): 147.91 - samples/sec: 2298.40 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:32:23,470 epoch 6 - iter 3610/3617 - loss 0.02548542 - time (sec): 164.51 - samples/sec: 2305.67 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:32:23,772 ----------------------------------------------------------------------------------------------------
2023-10-14 20:32:23,772 EPOCH 6 done: loss 0.0254 - lr: 0.000013
2023-10-14 20:32:31,112 DEV : loss 0.35236480832099915 - f1-score (micro avg) 0.6282
2023-10-14 20:32:31,150 ----------------------------------------------------------------------------------------------------
2023-10-14 20:32:48,700 epoch 7 - iter 361/3617 - loss 0.01714858 - time (sec): 17.55 - samples/sec: 2204.32 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:33:05,924 epoch 7 - iter 722/3617 - loss 0.01781415 - time (sec): 34.77 - samples/sec: 2200.52 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:33:22,908 epoch 7 - iter 1083/3617 - loss 0.01660375 - time (sec): 51.76 - samples/sec: 2206.15 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:33:38,636 epoch 7 - iter 1444/3617 - loss 0.01611126 - time (sec): 67.49 - samples/sec: 2258.38 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:33:55,037 epoch 7 - iter 1805/3617 - loss 0.01766198 - time (sec): 83.89 - samples/sec: 2270.09 - lr: 0.000012 - momentum: 0.000000
2023-10-14 20:34:11,293 epoch 7 - iter 2166/3617 - loss 0.01772828 - time (sec): 100.14 - samples/sec: 2272.39 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:34:27,657 epoch 7 - iter 2527/3617 - loss 0.01786773 - time (sec): 116.51 - samples/sec: 2275.25 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:34:44,293 epoch 7 - iter 2888/3617 - loss 0.01874021 - time (sec): 133.14 - samples/sec: 2289.48 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:35:00,684 epoch 7 - iter 3249/3617 - loss 0.01801121 - time (sec): 149.53 - samples/sec: 2286.88 - lr: 0.000010 - momentum: 0.000000
2023-10-14 20:35:17,050 epoch 7 - iter 3610/3617 - loss 0.01800545 - time (sec): 165.90 - samples/sec: 2287.35 - lr: 0.000010 - momentum: 0.000000
2023-10-14 20:35:17,351 ----------------------------------------------------------------------------------------------------
2023-10-14 20:35:17,352 EPOCH 7 done: loss 0.0180 - lr: 0.000010
2023-10-14 20:35:23,896 DEV : loss 0.3972169756889343 - f1-score (micro avg) 0.6267
2023-10-14 20:35:23,934 ----------------------------------------------------------------------------------------------------
2023-10-14 20:35:42,130 epoch 8 - iter 361/3617 - loss 0.00969253 - time (sec): 18.19 - samples/sec: 2088.14 - lr: 0.000010 - momentum: 0.000000
2023-10-14 20:35:58,861 epoch 8 - iter 722/3617 - loss 0.00894942 - time (sec): 34.93 - samples/sec: 2194.36 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:36:15,061 epoch 8 - iter 1083/3617 - loss 0.00849593 - time (sec): 51.12 - samples/sec: 2217.24 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:36:30,777 epoch 8 - iter 1444/3617 - loss 0.00914655 - time (sec): 66.84 - samples/sec: 2270.51 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:36:46,752 epoch 8 - iter 1805/3617 - loss 0.00874731 - time (sec): 82.82 - samples/sec: 2299.67 - lr: 0.000008 - momentum: 0.000000
2023-10-14 20:37:03,211 epoch 8 - iter 2166/3617 - loss 0.00988221 - time (sec): 99.28 - samples/sec: 2296.48 - lr: 0.000008 - momentum: 0.000000
2023-10-14 20:37:19,499 epoch 8 - iter 2527/3617 - loss 0.01064127 - time (sec): 115.56 - samples/sec: 2299.50 - lr: 0.000008 - momentum: 0.000000
2023-10-14 20:37:35,851 epoch 8 - iter 2888/3617 - loss 0.01050074 - time (sec): 131.92 - samples/sec: 2306.37 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:37:52,088 epoch 8 - iter 3249/3617 - loss 0.01062696 - time (sec): 148.15 - samples/sec: 2305.83 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:38:08,575 epoch 8 - iter 3610/3617 - loss 0.01078664 - time (sec): 164.64 - samples/sec: 2304.14 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:38:08,880 ----------------------------------------------------------------------------------------------------
2023-10-14 20:38:08,880 EPOCH 8 done: loss 0.0108 - lr: 0.000007
2023-10-14 20:38:16,090 DEV : loss 0.41721734404563904 - f1-score (micro avg) 0.6318
2023-10-14 20:38:16,127 saving best model
2023-10-14 20:38:16,663 ----------------------------------------------------------------------------------------------------
2023-10-14 20:38:35,739 epoch 9 - iter 361/3617 - loss 0.01118969 - time (sec): 19.07 - samples/sec: 1994.33 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:38:54,004 epoch 9 - iter 722/3617 - loss 0.00841180 - time (sec): 37.34 - samples/sec: 2045.19 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:39:10,764 epoch 9 - iter 1083/3617 - loss 0.00775291 - time (sec): 54.10 - samples/sec: 2142.25 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:39:27,106 epoch 9 - iter 1444/3617 - loss 0.00707364 - time (sec): 70.44 - samples/sec: 2169.75 - lr: 0.000005 - momentum: 0.000000
2023-10-14 20:39:43,532 epoch 9 - iter 1805/3617 - loss 0.00759031 - time (sec): 86.87 - samples/sec: 2184.85 - lr: 0.000005 - momentum: 0.000000
2023-10-14 20:39:59,860 epoch 9 - iter 2166/3617 - loss 0.00773152 - time (sec): 103.19 - samples/sec: 2200.69 - lr: 0.000005 - momentum: 0.000000
2023-10-14 20:40:16,264 epoch 9 - iter 2527/3617 - loss 0.00812156 - time (sec): 119.60 - samples/sec: 2214.86 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:40:32,665 epoch 9 - iter 2888/3617 - loss 0.00788977 - time (sec): 136.00 - samples/sec: 2226.94 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:40:49,193 epoch 9 - iter 3249/3617 - loss 0.00758141 - time (sec): 152.53 - samples/sec: 2235.85 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:41:05,613 epoch 9 - iter 3610/3617 - loss 0.00754676 - time (sec): 168.95 - samples/sec: 2245.32 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:41:05,923 ----------------------------------------------------------------------------------------------------
2023-10-14 20:41:05,923 EPOCH 9 done: loss 0.0075 - lr: 0.000003
2023-10-14 20:41:11,659 DEV : loss 0.4234275221824646 - f1-score (micro avg) 0.6324
2023-10-14 20:41:11,697 saving best model
2023-10-14 20:41:12,297 ----------------------------------------------------------------------------------------------------
2023-10-14 20:41:31,527 epoch 10 - iter 361/3617 - loss 0.00680316 - time (sec): 19.23 - samples/sec: 1971.51 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:41:50,552 epoch 10 - iter 722/3617 - loss 0.00602000 - time (sec): 38.25 - samples/sec: 1971.41 - lr: 0.000003 - momentum: 0.000000
2023-10-14 20:42:08,808 epoch 10 - iter 1083/3617 - loss 0.00486451 - time (sec): 56.51 - samples/sec: 2016.08 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:42:25,580 epoch 10 - iter 1444/3617 - loss 0.00458704 - time (sec): 73.28 - samples/sec: 2060.75 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:42:41,997 epoch 10 - iter 1805/3617 - loss 0.00443742 - time (sec): 89.70 - samples/sec: 2113.84 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:42:58,053 epoch 10 - iter 2166/3617 - loss 0.00538436 - time (sec): 105.75 - samples/sec: 2138.96 - lr: 0.000001 - momentum: 0.000000
2023-10-14 20:43:14,025 epoch 10 - iter 2527/3617 - loss 0.00544432 - time (sec): 121.72 - samples/sec: 2168.45 - lr: 0.000001 - momentum: 0.000000
2023-10-14 20:43:30,436 epoch 10 - iter 2888/3617 - loss 0.00544999 - time (sec): 138.14 - samples/sec: 2196.22 - lr: 0.000001 - momentum: 0.000000
2023-10-14 20:43:46,505 epoch 10 - iter 3249/3617 - loss 0.00524360 - time (sec): 154.20 - samples/sec: 2213.00 - lr: 0.000000 - momentum: 0.000000
2023-10-14 20:44:02,713 epoch 10 - iter 3610/3617 - loss 0.00547411 - time (sec): 170.41 - samples/sec: 2225.38 - lr: 0.000000 - momentum: 0.000000
2023-10-14 20:44:03,016 ----------------------------------------------------------------------------------------------------
2023-10-14 20:44:03,016 EPOCH 10 done: loss 0.0055 - lr: 0.000000
2023-10-14 20:44:08,786 DEV : loss 0.43111804127693176 - f1-score (micro avg) 0.6361
2023-10-14 20:44:08,840 saving best model
2023-10-14 20:44:09,784 ----------------------------------------------------------------------------------------------------
2023-10-14 20:44:09,785 Loading model from best epoch ...
2023-10-14 20:44:11,351 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 20:44:20,229 Results:
- F-score (micro) 0.6471
- F-score (macro) 0.5082
- Accuracy 0.4927

By class:
              precision    recall  f1-score   support

         loc     0.6190    0.7834    0.6916       591
        pers     0.5736    0.7423    0.6471       357
         org     0.2400    0.1519    0.1860        79

   micro avg     0.5873    0.7205    0.6471      1027
   macro avg     0.4775    0.5592    0.5082      1027
weighted avg     0.5741    0.7205    0.6372      1027

2023-10-14 20:44:20,230 ----------------------------------------------------------------------------------------------------
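Note (not part of the log): the aggregate F-scores in the final report can be re-derived from the per-class precision/recall/f1 rows printed above — micro-averaged F1 is the harmonic mean of the micro precision and recall, and macro-averaged F1 is the unweighted mean of the per-class F1 scores. A minimal check in Python, using only the numbers copied from the log:

```python
# Per-class (precision, recall, f1) rows, copied from the final report above.
per_class = {
    "loc":  (0.6190, 0.7834, 0.6916),
    "pers": (0.5736, 0.7423, 0.6471),
    "org":  (0.2400, 0.1519, 0.1860),
}

# Micro avg row from the report: F1 = harmonic mean of precision and recall.
micro_p, micro_r = 0.5873, 0.7205
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1 = unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1 in per_class.values()) / len(per_class)

print(round(micro_f1, 4))  # 0.6471 — matches "F-score (micro)" in the log
print(round(macro_f1, 4))  # 0.5082 — matches "F-score (macro)" in the log
```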