|
2023-10-14 20:15:14,574 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,575 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 20:15:14,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,575 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences |
|
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator |
|
2023-10-14 20:15:14,575 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,575 Train: 14465 sentences |
|
2023-10-14 20:15:14,576 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,576 Training Params: |
|
2023-10-14 20:15:14,576 - learning_rate: "3e-05" |
|
2023-10-14 20:15:14,576 - mini_batch_size: "4" |
|
2023-10-14 20:15:14,576 - max_epochs: "10" |
|
2023-10-14 20:15:14,576 - shuffle: "True" |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,576 Plugins: |
|
2023-10-14 20:15:14,576 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
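Note on the learning-rate column: the `lr` values logged below are consistent with a linear warmup over the first 10% of updates followed by a linear decay to zero. The sketch below reproduces that shape from the parameters logged above (14465 training sentences, mini-batch size 4, 10 epochs, peak rate 3e-05, warmup fraction 0.1). The function name `linear_schedule_lr` and the exact step at which the trainer samples the rate are assumptions made for illustration; this is not Flair's own implementation.

```python
import math

def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0.

    Hypothetical helper for illustration only, not part of Flair.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# Values taken from the log header above.
steps_per_epoch = math.ceil(14465 / 4)  # 3617 iterations per epoch
total_steps = steps_per_epoch * 10      # 36170 updates over 10 epochs

# Global steps for epoch 1 iter 361, epoch 1 iter 3610, epoch 2 iter 3610.
for step in (361, 3610, 3617 + 3610):
    lr = linear_schedule_lr(step, 3e-05, total_steps, 0.1)
    print(f"step {step}: lr {lr:.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which fits the logged rate peaking at 0.000030 there and falling from epoch 2 onward.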
|
2023-10-14 20:15:14,576 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 20:15:14,576 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,576 Computation: |
|
2023-10-14 20:15:14,576 - compute on device: cuda:0 |
|
2023-10-14 20:15:14,576 - embedding storage: none |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,576 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:14,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:15:30,762 epoch 1 - iter 361/3617 - loss 1.47398781 - time (sec): 16.18 - samples/sec: 2325.55 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 20:15:47,116 epoch 1 - iter 722/3617 - loss 0.83970179 - time (sec): 32.54 - samples/sec: 2319.29 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 20:16:03,219 epoch 1 - iter 1083/3617 - loss 0.62059022 - time (sec): 48.64 - samples/sec: 2290.84 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 20:16:19,390 epoch 1 - iter 1444/3617 - loss 0.50223969 - time (sec): 64.81 - samples/sec: 2291.76 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 20:16:35,953 epoch 1 - iter 1805/3617 - loss 0.42436958 - time (sec): 81.38 - samples/sec: 2313.82 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 20:16:52,015 epoch 1 - iter 2166/3617 - loss 0.37437638 - time (sec): 97.44 - samples/sec: 2319.20 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 20:17:07,780 epoch 1 - iter 2527/3617 - loss 0.33741262 - time (sec): 113.20 - samples/sec: 2348.50 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 20:17:23,796 epoch 1 - iter 2888/3617 - loss 0.31042790 - time (sec): 129.22 - samples/sec: 2349.52 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 20:17:39,981 epoch 1 - iter 3249/3617 - loss 0.28794148 - time (sec): 145.40 - samples/sec: 2345.55 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:17:56,334 epoch 1 - iter 3610/3617 - loss 0.27061320 - time (sec): 161.76 - samples/sec: 2344.14 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 20:17:56,632 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:17:56,633 EPOCH 1 done: loss 0.2704 - lr: 0.000030 |
|
2023-10-14 20:18:02,040 DEV : loss 0.11999412626028061 - f1-score (micro avg) 0.6234 |
|
2023-10-14 20:18:02,080 saving best model |
|
2023-10-14 20:18:02,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:18:21,558 epoch 2 - iter 361/3617 - loss 0.09782984 - time (sec): 19.08 - samples/sec: 2009.32 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 20:18:39,215 epoch 2 - iter 722/3617 - loss 0.09421131 - time (sec): 36.74 - samples/sec: 2056.11 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 20:18:55,988 epoch 2 - iter 1083/3617 - loss 0.09577052 - time (sec): 53.51 - samples/sec: 2106.61 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 20:19:13,211 epoch 2 - iter 1444/3617 - loss 0.09595965 - time (sec): 70.73 - samples/sec: 2158.23 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 20:19:29,358 epoch 2 - iter 1805/3617 - loss 0.09590618 - time (sec): 86.88 - samples/sec: 2185.24 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 20:19:46,839 epoch 2 - iter 2166/3617 - loss 0.09403509 - time (sec): 104.36 - samples/sec: 2192.52 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 20:20:03,705 epoch 2 - iter 2527/3617 - loss 0.09439687 - time (sec): 121.23 - samples/sec: 2211.93 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 20:20:20,276 epoch 2 - iter 2888/3617 - loss 0.09538483 - time (sec): 137.80 - samples/sec: 2216.77 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:20:36,508 epoch 2 - iter 3249/3617 - loss 0.09475508 - time (sec): 154.03 - samples/sec: 2215.51 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:20:55,584 epoch 2 - iter 3610/3617 - loss 0.09527647 - time (sec): 173.11 - samples/sec: 2191.11 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:20:55,956 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:20:55,956 EPOCH 2 done: loss 0.0953 - lr: 0.000027 |
|
2023-10-14 20:21:02,854 DEV : loss 0.12750780582427979 - f1-score (micro avg) 0.6294 |
|
2023-10-14 20:21:02,888 saving best model |
|
2023-10-14 20:21:03,598 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:21:22,467 epoch 3 - iter 361/3617 - loss 0.05413118 - time (sec): 18.87 - samples/sec: 1959.23 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 20:21:41,460 epoch 3 - iter 722/3617 - loss 0.06427074 - time (sec): 37.86 - samples/sec: 1973.69 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 20:21:57,897 epoch 3 - iter 1083/3617 - loss 0.07334315 - time (sec): 54.30 - samples/sec: 2076.57 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 20:22:14,087 epoch 3 - iter 1444/3617 - loss 0.07359621 - time (sec): 70.49 - samples/sec: 2131.15 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 20:22:30,386 epoch 3 - iter 1805/3617 - loss 0.07222582 - time (sec): 86.79 - samples/sec: 2165.52 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 20:22:46,657 epoch 3 - iter 2166/3617 - loss 0.07167594 - time (sec): 103.06 - samples/sec: 2195.87 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 20:23:03,231 epoch 3 - iter 2527/3617 - loss 0.07168900 - time (sec): 119.63 - samples/sec: 2219.22 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 20:23:19,692 epoch 3 - iter 2888/3617 - loss 0.07149544 - time (sec): 136.09 - samples/sec: 2230.31 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 20:23:36,017 epoch 3 - iter 3249/3617 - loss 0.07253118 - time (sec): 152.42 - samples/sec: 2240.40 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 20:23:52,572 epoch 3 - iter 3610/3617 - loss 0.07273065 - time (sec): 168.97 - samples/sec: 2244.48 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 20:23:52,880 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:23:52,880 EPOCH 3 done: loss 0.0727 - lr: 0.000023 |
|
2023-10-14 20:23:59,342 DEV : loss 0.23288682103157043 - f1-score (micro avg) 0.6258 |
|
2023-10-14 20:23:59,373 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:24:15,633 epoch 4 - iter 361/3617 - loss 0.05027835 - time (sec): 16.26 - samples/sec: 2260.02 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 20:24:32,140 epoch 4 - iter 722/3617 - loss 0.05104940 - time (sec): 32.77 - samples/sec: 2293.15 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 20:24:48,509 epoch 4 - iter 1083/3617 - loss 0.04972765 - time (sec): 49.13 - samples/sec: 2297.62 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 20:25:04,840 epoch 4 - iter 1444/3617 - loss 0.04948816 - time (sec): 65.47 - samples/sec: 2300.32 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 20:25:21,353 epoch 4 - iter 1805/3617 - loss 0.04959231 - time (sec): 81.98 - samples/sec: 2316.09 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 20:25:37,665 epoch 4 - iter 2166/3617 - loss 0.05039825 - time (sec): 98.29 - samples/sec: 2325.84 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 20:25:53,888 epoch 4 - iter 2527/3617 - loss 0.05098071 - time (sec): 114.51 - samples/sec: 2324.83 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 20:26:09,988 epoch 4 - iter 2888/3617 - loss 0.05275888 - time (sec): 130.61 - samples/sec: 2333.25 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 20:26:26,056 epoch 4 - iter 3249/3617 - loss 0.05280906 - time (sec): 146.68 - samples/sec: 2333.90 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 20:26:42,216 epoch 4 - iter 3610/3617 - loss 0.05251226 - time (sec): 162.84 - samples/sec: 2328.47 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 20:26:42,519 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:26:42,519 EPOCH 4 done: loss 0.0524 - lr: 0.000020 |
|
2023-10-14 20:26:48,265 DEV : loss 0.29611918330192566 - f1-score (micro avg) 0.6115 |
|
2023-10-14 20:26:48,298 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:27:05,568 epoch 5 - iter 361/3617 - loss 0.04203166 - time (sec): 17.27 - samples/sec: 2138.47 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 20:27:21,925 epoch 5 - iter 722/3617 - loss 0.03700812 - time (sec): 33.63 - samples/sec: 2268.87 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 20:27:38,355 epoch 5 - iter 1083/3617 - loss 0.03554194 - time (sec): 50.06 - samples/sec: 2282.52 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 20:27:54,786 epoch 5 - iter 1444/3617 - loss 0.03570610 - time (sec): 66.49 - samples/sec: 2278.40 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 20:28:11,254 epoch 5 - iter 1805/3617 - loss 0.03486095 - time (sec): 82.95 - samples/sec: 2272.79 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 20:28:27,973 epoch 5 - iter 2166/3617 - loss 0.03550990 - time (sec): 99.67 - samples/sec: 2291.81 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 20:28:44,336 epoch 5 - iter 2527/3617 - loss 0.03627469 - time (sec): 116.04 - samples/sec: 2296.38 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 20:29:00,528 epoch 5 - iter 2888/3617 - loss 0.03591614 - time (sec): 132.23 - samples/sec: 2303.16 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 20:29:16,672 epoch 5 - iter 3249/3617 - loss 0.03711630 - time (sec): 148.37 - samples/sec: 2304.33 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 20:29:32,867 epoch 5 - iter 3610/3617 - loss 0.03676579 - time (sec): 164.57 - samples/sec: 2303.59 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 20:29:33,174 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:29:33,174 EPOCH 5 done: loss 0.0367 - lr: 0.000017 |
|
2023-10-14 20:29:38,931 DEV : loss 0.30095481872558594 - f1-score (micro avg) 0.6255 |
|
2023-10-14 20:29:38,964 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:29:55,577 epoch 6 - iter 361/3617 - loss 0.02879097 - time (sec): 16.61 - samples/sec: 2327.19 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 20:30:11,985 epoch 6 - iter 722/3617 - loss 0.02283689 - time (sec): 33.02 - samples/sec: 2300.73 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 20:30:28,425 epoch 6 - iter 1083/3617 - loss 0.02449134 - time (sec): 49.46 - samples/sec: 2308.91 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 20:30:44,812 epoch 6 - iter 1444/3617 - loss 0.02459281 - time (sec): 65.85 - samples/sec: 2288.93 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 20:31:01,172 epoch 6 - iter 1805/3617 - loss 0.02646233 - time (sec): 82.21 - samples/sec: 2285.13 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 20:31:17,640 epoch 6 - iter 2166/3617 - loss 0.02632674 - time (sec): 98.67 - samples/sec: 2283.10 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 20:31:34,002 epoch 6 - iter 2527/3617 - loss 0.02565887 - time (sec): 115.04 - samples/sec: 2284.73 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 20:31:50,346 epoch 6 - iter 2888/3617 - loss 0.02507191 - time (sec): 131.38 - samples/sec: 2294.77 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 20:32:06,871 epoch 6 - iter 3249/3617 - loss 0.02538127 - time (sec): 147.91 - samples/sec: 2298.40 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 20:32:23,470 epoch 6 - iter 3610/3617 - loss 0.02548542 - time (sec): 164.51 - samples/sec: 2305.67 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 20:32:23,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:32:23,772 EPOCH 6 done: loss 0.0254 - lr: 0.000013 |
|
2023-10-14 20:32:31,112 DEV : loss 0.35236480832099915 - f1-score (micro avg) 0.6282 |
|
2023-10-14 20:32:31,150 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:32:48,700 epoch 7 - iter 361/3617 - loss 0.01714858 - time (sec): 17.55 - samples/sec: 2204.32 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 20:33:05,924 epoch 7 - iter 722/3617 - loss 0.01781415 - time (sec): 34.77 - samples/sec: 2200.52 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 20:33:22,908 epoch 7 - iter 1083/3617 - loss 0.01660375 - time (sec): 51.76 - samples/sec: 2206.15 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 20:33:38,636 epoch 7 - iter 1444/3617 - loss 0.01611126 - time (sec): 67.49 - samples/sec: 2258.38 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 20:33:55,037 epoch 7 - iter 1805/3617 - loss 0.01766198 - time (sec): 83.89 - samples/sec: 2270.09 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 20:34:11,293 epoch 7 - iter 2166/3617 - loss 0.01772828 - time (sec): 100.14 - samples/sec: 2272.39 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 20:34:27,657 epoch 7 - iter 2527/3617 - loss 0.01786773 - time (sec): 116.51 - samples/sec: 2275.25 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 20:34:44,293 epoch 7 - iter 2888/3617 - loss 0.01874021 - time (sec): 133.14 - samples/sec: 2289.48 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 20:35:00,684 epoch 7 - iter 3249/3617 - loss 0.01801121 - time (sec): 149.53 - samples/sec: 2286.88 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 20:35:17,050 epoch 7 - iter 3610/3617 - loss 0.01800545 - time (sec): 165.90 - samples/sec: 2287.35 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 20:35:17,351 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:35:17,352 EPOCH 7 done: loss 0.0180 - lr: 0.000010 |
|
2023-10-14 20:35:23,896 DEV : loss 0.3972169756889343 - f1-score (micro avg) 0.6267 |
|
2023-10-14 20:35:23,934 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:35:42,130 epoch 8 - iter 361/3617 - loss 0.00969253 - time (sec): 18.19 - samples/sec: 2088.14 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 20:35:58,861 epoch 8 - iter 722/3617 - loss 0.00894942 - time (sec): 34.93 - samples/sec: 2194.36 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 20:36:15,061 epoch 8 - iter 1083/3617 - loss 0.00849593 - time (sec): 51.12 - samples/sec: 2217.24 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 20:36:30,777 epoch 8 - iter 1444/3617 - loss 0.00914655 - time (sec): 66.84 - samples/sec: 2270.51 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 20:36:46,752 epoch 8 - iter 1805/3617 - loss 0.00874731 - time (sec): 82.82 - samples/sec: 2299.67 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 20:37:03,211 epoch 8 - iter 2166/3617 - loss 0.00988221 - time (sec): 99.28 - samples/sec: 2296.48 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 20:37:19,499 epoch 8 - iter 2527/3617 - loss 0.01064127 - time (sec): 115.56 - samples/sec: 2299.50 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 20:37:35,851 epoch 8 - iter 2888/3617 - loss 0.01050074 - time (sec): 131.92 - samples/sec: 2306.37 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 20:37:52,088 epoch 8 - iter 3249/3617 - loss 0.01062696 - time (sec): 148.15 - samples/sec: 2305.83 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 20:38:08,575 epoch 8 - iter 3610/3617 - loss 0.01078664 - time (sec): 164.64 - samples/sec: 2304.14 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 20:38:08,880 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:38:08,880 EPOCH 8 done: loss 0.0108 - lr: 0.000007 |
|
2023-10-14 20:38:16,090 DEV : loss 0.41721734404563904 - f1-score (micro avg) 0.6318 |
|
2023-10-14 20:38:16,127 saving best model |
|
2023-10-14 20:38:16,663 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:38:35,739 epoch 9 - iter 361/3617 - loss 0.01118969 - time (sec): 19.07 - samples/sec: 1994.33 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 20:38:54,004 epoch 9 - iter 722/3617 - loss 0.00841180 - time (sec): 37.34 - samples/sec: 2045.19 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 20:39:10,764 epoch 9 - iter 1083/3617 - loss 0.00775291 - time (sec): 54.10 - samples/sec: 2142.25 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 20:39:27,106 epoch 9 - iter 1444/3617 - loss 0.00707364 - time (sec): 70.44 - samples/sec: 2169.75 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 20:39:43,532 epoch 9 - iter 1805/3617 - loss 0.00759031 - time (sec): 86.87 - samples/sec: 2184.85 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 20:39:59,860 epoch 9 - iter 2166/3617 - loss 0.00773152 - time (sec): 103.19 - samples/sec: 2200.69 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 20:40:16,264 epoch 9 - iter 2527/3617 - loss 0.00812156 - time (sec): 119.60 - samples/sec: 2214.86 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 20:40:32,665 epoch 9 - iter 2888/3617 - loss 0.00788977 - time (sec): 136.00 - samples/sec: 2226.94 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 20:40:49,193 epoch 9 - iter 3249/3617 - loss 0.00758141 - time (sec): 152.53 - samples/sec: 2235.85 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 20:41:05,613 epoch 9 - iter 3610/3617 - loss 0.00754676 - time (sec): 168.95 - samples/sec: 2245.32 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 20:41:05,923 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:41:05,923 EPOCH 9 done: loss 0.0075 - lr: 0.000003 |
|
2023-10-14 20:41:11,659 DEV : loss 0.4234275221824646 - f1-score (micro avg) 0.6324 |
|
2023-10-14 20:41:11,697 saving best model |
|
2023-10-14 20:41:12,297 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:41:31,527 epoch 10 - iter 361/3617 - loss 0.00680316 - time (sec): 19.23 - samples/sec: 1971.51 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 20:41:50,552 epoch 10 - iter 722/3617 - loss 0.00602000 - time (sec): 38.25 - samples/sec: 1971.41 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 20:42:08,808 epoch 10 - iter 1083/3617 - loss 0.00486451 - time (sec): 56.51 - samples/sec: 2016.08 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 20:42:25,580 epoch 10 - iter 1444/3617 - loss 0.00458704 - time (sec): 73.28 - samples/sec: 2060.75 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 20:42:41,997 epoch 10 - iter 1805/3617 - loss 0.00443742 - time (sec): 89.70 - samples/sec: 2113.84 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 20:42:58,053 epoch 10 - iter 2166/3617 - loss 0.00538436 - time (sec): 105.75 - samples/sec: 2138.96 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 20:43:14,025 epoch 10 - iter 2527/3617 - loss 0.00544432 - time (sec): 121.72 - samples/sec: 2168.45 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 20:43:30,436 epoch 10 - iter 2888/3617 - loss 0.00544999 - time (sec): 138.14 - samples/sec: 2196.22 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 20:43:46,505 epoch 10 - iter 3249/3617 - loss 0.00524360 - time (sec): 154.20 - samples/sec: 2213.00 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 20:44:02,713 epoch 10 - iter 3610/3617 - loss 0.00547411 - time (sec): 170.41 - samples/sec: 2225.38 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 20:44:03,016 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:03,016 EPOCH 10 done: loss 0.0055 - lr: 0.000000 |
|
2023-10-14 20:44:08,786 DEV : loss 0.43111804127693176 - f1-score (micro avg) 0.6361 |
|
2023-10-14 20:44:08,840 saving best model |
|
2023-10-14 20:44:09,784 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:09,785 Loading model from best epoch ... |
|
2023-10-14 20:44:11,351 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org |
|
2023-10-14 20:44:20,229 |
|
Results: |
|
- F-score (micro) 0.6471 |
|
- F-score (macro) 0.5082 |
|
- Accuracy 0.4927 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.6190    0.7834    0.6916       591 |
|
        pers     0.5736    0.7423    0.6471       357 |
|
         org     0.2400    0.1519    0.1860        79 |
|
|
|
   micro avg     0.5873    0.7205    0.6471      1027 |
|
   macro avg     0.4775    0.5592    0.5082      1027 |
|
weighted avg     0.5741    0.7205    0.6372      1027 |
|
|
|
2023-10-14 20:44:20,230 ---------------------------------------------------------------------------------------------------- |
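Note on the averaged rows: the micro, macro, and weighted F1 values in the final report can be recomputed from the rounded per-class figures. The sketch below does this arithmetic in plain Python. It assumes the usual definitions (macro = unweighted mean of per-class F1, weighted = support-weighted mean, micro = harmonic mean of the micro precision and recall); the variable names are illustrative and the inputs are the rounded values from the table, so the last digit could differ from a computation on raw counts.

```python
# Per-class (precision, recall, f1, support) copied from the report above.
per_class = {
    "loc":  (0.6190, 0.7834, 0.6916, 591),
    "pers": (0.5736, 0.7423, 0.6471, 357),
    "org":  (0.2400, 0.1519, 0.1860, 79),
}

def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

total_support = sum(s for _, _, _, s in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro F1 from the micro-averaged precision and recall in the report.
micro_f1 = f1_from(0.5873, 0.7205)

print(f"support {total_support}")
print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The gap between the micro (0.6471) and macro (0.5082) scores comes from the `org` class, which has only 79 of the 1027 gold spans but an F1 of 0.1860.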
|
|