2023-10-13 10:50:21,939 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,940 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:50:21,940 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,940 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:50:21,940 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,940 Train: 966 sentences
2023-10-13 10:50:21,941 (train_with_dev=False, train_with_test=False)
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,941 Training Params:
2023-10-13 10:50:21,941 - learning_rate: "5e-05"
2023-10-13 10:50:21,941 - mini_batch_size: "4"
2023-10-13 10:50:21,941 - max_epochs: "10"
2023-10-13 10:50:21,941 - shuffle: "True"
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,941 Plugins:
2023-10-13 10:50:21,941 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,941 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:50:21,941 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,941 Computation:
2023-10-13 10:50:21,941 - compute on device: cuda:0
2023-10-13 10:50:21,941 - embedding storage: none
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,941 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:21,941 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:23,012 epoch 1 - iter 24/242 - loss 3.06282383 - time (sec): 1.07 - samples/sec: 2329.54 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:50:24,074 epoch 1 - iter 48/242 - loss 2.41016119 - time (sec): 2.13 - samples/sec: 2338.10 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:50:25,151 epoch 1 - iter 72/242 - loss 1.86772217 - time (sec): 3.21 - samples/sec: 2307.38 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:50:26,211 epoch 1 - iter 96/242 - loss 1.53419880 - time (sec): 4.27 - samples/sec: 2363.70 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:50:27,259 epoch 1 - iter 120/242 - loss 1.34593945 - time (sec): 5.32 - samples/sec: 2326.36 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:50:28,306 epoch 1 - iter 144/242 - loss 1.20657811 - time (sec): 6.36 - samples/sec: 2296.86 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:50:29,371 epoch 1 - iter 168/242 - loss 1.09501481 - time (sec): 7.43 - samples/sec: 2319.58 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:50:30,440 epoch 1 - iter 192/242 - loss 0.99508537 - time (sec): 8.50 - samples/sec: 2317.02 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:50:31,505 epoch 1 - iter 216/242 - loss 0.91770039 - time (sec): 9.56 - samples/sec: 2296.92 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:50:32,588 epoch 1 - iter 240/242 - loss 0.84389161 - time (sec): 10.65 - samples/sec: 2312.19 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:50:32,668 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:32,668 EPOCH 1 done: loss 0.8414 - lr: 0.000049
2023-10-13 10:50:33,424 DEV : loss 0.22544705867767334 - f1-score (micro avg) 0.5808
2023-10-13 10:50:33,429 saving best model
2023-10-13 10:50:33,828 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:34,870 epoch 2 - iter 24/242 - loss 0.16782529 - time (sec): 1.04 - samples/sec: 2245.12 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:50:35,977 epoch 2 - iter 48/242 - loss 0.19015269 - time (sec): 2.15 - samples/sec: 2388.75 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:50:37,051 epoch 2 - iter 72/242 - loss 0.21197478 - time (sec): 3.22 - samples/sec: 2354.36 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:50:38,118 epoch 2 - iter 96/242 - loss 0.19725258 - time (sec): 4.29 - samples/sec: 2343.05 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:50:39,153 epoch 2 - iter 120/242 - loss 0.19764562 - time (sec): 5.32 - samples/sec: 2324.87 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:50:40,215 epoch 2 - iter 144/242 - loss 0.18993492 - time (sec): 6.39 - samples/sec: 2351.07 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:50:41,304 epoch 2 - iter 168/242 - loss 0.18159121 - time (sec): 7.47 - samples/sec: 2353.41 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:50:42,349 epoch 2 - iter 192/242 - loss 0.18477535 - time (sec): 8.52 - samples/sec: 2314.96 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:50:43,384 epoch 2 - iter 216/242 - loss 0.18262221 - time (sec): 9.55 - samples/sec: 2302.86 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:50:44,437 epoch 2 - iter 240/242 - loss 0.17813184 - time (sec): 10.61 - samples/sec: 2315.70 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:50:44,522 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:44,522 EPOCH 2 done: loss 0.1777 - lr: 0.000045
2023-10-13 10:50:45,402 DEV : loss 0.1380850225687027 - f1-score (micro avg) 0.8193
2023-10-13 10:50:45,413 saving best model
2023-10-13 10:50:45,953 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:47,290 epoch 3 - iter 24/242 - loss 0.10384515 - time (sec): 1.33 - samples/sec: 1874.08 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:50:48,588 epoch 3 - iter 48/242 - loss 0.09686419 - time (sec): 2.63 - samples/sec: 1811.25 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:50:49,861 epoch 3 - iter 72/242 - loss 0.09718335 - time (sec): 3.90 - samples/sec: 1832.05 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:50:51,156 epoch 3 - iter 96/242 - loss 0.11160279 - time (sec): 5.20 - samples/sec: 1835.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:50:52,459 epoch 3 - iter 120/242 - loss 0.11346936 - time (sec): 6.50 - samples/sec: 1833.43 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:50:53,775 epoch 3 - iter 144/242 - loss 0.10886789 - time (sec): 7.82 - samples/sec: 1854.80 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:50:55,025 epoch 3 - iter 168/242 - loss 0.10155377 - time (sec): 9.07 - samples/sec: 1868.02 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:50:56,272 epoch 3 - iter 192/242 - loss 0.10251928 - time (sec): 10.32 - samples/sec: 1899.67 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:50:57,462 epoch 3 - iter 216/242 - loss 0.10384288 - time (sec): 11.51 - samples/sec: 1892.16 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:50:58,711 epoch 3 - iter 240/242 - loss 0.10239479 - time (sec): 12.75 - samples/sec: 1929.57 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:50:58,802 ----------------------------------------------------------------------------------------------------
2023-10-13 10:50:58,802 EPOCH 3 done: loss 0.1023 - lr: 0.000039
2023-10-13 10:50:59,618 DEV : loss 0.13995103538036346 - f1-score (micro avg) 0.8306
2023-10-13 10:50:59,626 saving best model
2023-10-13 10:51:00,104 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:01,343 epoch 4 - iter 24/242 - loss 0.06060215 - time (sec): 1.24 - samples/sec: 2040.03 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:51:02,496 epoch 4 - iter 48/242 - loss 0.07516428 - time (sec): 2.39 - samples/sec: 2023.89 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:51:03,688 epoch 4 - iter 72/242 - loss 0.06405180 - time (sec): 3.58 - samples/sec: 2061.48 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:51:04,930 epoch 4 - iter 96/242 - loss 0.06378318 - time (sec): 4.82 - samples/sec: 2091.33 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:51:06,070 epoch 4 - iter 120/242 - loss 0.07249040 - time (sec): 5.96 - samples/sec: 2094.11 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:51:07,252 epoch 4 - iter 144/242 - loss 0.06753827 - time (sec): 7.15 - samples/sec: 2045.49 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:51:08,435 epoch 4 - iter 168/242 - loss 0.07170026 - time (sec): 8.33 - samples/sec: 2041.66 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:51:09,565 epoch 4 - iter 192/242 - loss 0.07480645 - time (sec): 9.46 - samples/sec: 2067.36 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:51:10,674 epoch 4 - iter 216/242 - loss 0.07750273 - time (sec): 10.57 - samples/sec: 2094.73 - lr: 0.000034 - momentum: 0.000000
2023-10-13 10:51:11,735 epoch 4 - iter 240/242 - loss 0.07723147 - time (sec): 11.63 - samples/sec: 2121.85 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:51:11,818 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:11,818 EPOCH 4 done: loss 0.0770 - lr: 0.000033
2023-10-13 10:51:12,668 DEV : loss 0.15853728353977203 - f1-score (micro avg) 0.8266
2023-10-13 10:51:12,676 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:13,807 epoch 5 - iter 24/242 - loss 0.04444820 - time (sec): 1.13 - samples/sec: 2267.18 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:51:14,890 epoch 5 - iter 48/242 - loss 0.04709655 - time (sec): 2.21 - samples/sec: 2338.53 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:51:16,039 epoch 5 - iter 72/242 - loss 0.05138298 - time (sec): 3.36 - samples/sec: 2276.95 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:51:17,149 epoch 5 - iter 96/242 - loss 0.05103643 - time (sec): 4.47 - samples/sec: 2235.78 - lr: 0.000031 - momentum: 0.000000
2023-10-13 10:51:18,216 epoch 5 - iter 120/242 - loss 0.05302551 - time (sec): 5.54 - samples/sec: 2252.32 - lr: 0.000031 - momentum: 0.000000
2023-10-13 10:51:19,283 epoch 5 - iter 144/242 - loss 0.05299350 - time (sec): 6.61 - samples/sec: 2231.95 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:51:20,366 epoch 5 - iter 168/242 - loss 0.05396753 - time (sec): 7.69 - samples/sec: 2212.05 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:51:21,451 epoch 5 - iter 192/242 - loss 0.05703832 - time (sec): 8.77 - samples/sec: 2218.77 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:51:22,536 epoch 5 - iter 216/242 - loss 0.05423643 - time (sec): 9.86 - samples/sec: 2245.97 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:51:23,631 epoch 5 - iter 240/242 - loss 0.05505043 - time (sec): 10.95 - samples/sec: 2247.72 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:51:23,717 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:23,718 EPOCH 5 done: loss 0.0556 - lr: 0.000028
2023-10-13 10:51:24,521 DEV : loss 0.1569410264492035 - f1-score (micro avg) 0.8475
2023-10-13 10:51:24,526 saving best model
2023-10-13 10:51:25,042 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:26,125 epoch 6 - iter 24/242 - loss 0.03182422 - time (sec): 1.08 - samples/sec: 2184.59 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:51:27,199 epoch 6 - iter 48/242 - loss 0.03746626 - time (sec): 2.15 - samples/sec: 2279.27 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:51:28,268 epoch 6 - iter 72/242 - loss 0.03872337 - time (sec): 3.22 - samples/sec: 2169.32 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:51:29,377 epoch 6 - iter 96/242 - loss 0.03978327 - time (sec): 4.33 - samples/sec: 2277.25 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:51:30,460 epoch 6 - iter 120/242 - loss 0.03668298 - time (sec): 5.41 - samples/sec: 2286.58 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:51:31,547 epoch 6 - iter 144/242 - loss 0.04071379 - time (sec): 6.50 - samples/sec: 2270.35 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:51:32,626 epoch 6 - iter 168/242 - loss 0.03688523 - time (sec): 7.58 - samples/sec: 2266.99 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:51:33,742 epoch 6 - iter 192/242 - loss 0.03549140 - time (sec): 8.69 - samples/sec: 2274.74 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:51:34,890 epoch 6 - iter 216/242 - loss 0.04103311 - time (sec): 9.84 - samples/sec: 2260.49 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:51:36,043 epoch 6 - iter 240/242 - loss 0.04132778 - time (sec): 11.00 - samples/sec: 2235.44 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:51:36,135 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:36,135 EPOCH 6 done: loss 0.0421 - lr: 0.000022
2023-10-13 10:51:37,053 DEV : loss 0.18470272421836853 - f1-score (micro avg) 0.8245
2023-10-13 10:51:37,063 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:38,295 epoch 7 - iter 24/242 - loss 0.02842326 - time (sec): 1.23 - samples/sec: 1945.17 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:51:39,408 epoch 7 - iter 48/242 - loss 0.02527763 - time (sec): 2.34 - samples/sec: 2065.80 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:51:40,528 epoch 7 - iter 72/242 - loss 0.02478897 - time (sec): 3.46 - samples/sec: 2194.62 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:51:41,628 epoch 7 - iter 96/242 - loss 0.03023030 - time (sec): 4.56 - samples/sec: 2218.60 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:51:42,769 epoch 7 - iter 120/242 - loss 0.02850754 - time (sec): 5.70 - samples/sec: 2202.61 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:51:43,923 epoch 7 - iter 144/242 - loss 0.03048371 - time (sec): 6.86 - samples/sec: 2154.59 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:51:45,049 epoch 7 - iter 168/242 - loss 0.03003639 - time (sec): 7.98 - samples/sec: 2119.51 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:51:46,155 epoch 7 - iter 192/242 - loss 0.03007575 - time (sec): 9.09 - samples/sec: 2151.33 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:51:47,221 epoch 7 - iter 216/242 - loss 0.02821354 - time (sec): 10.16 - samples/sec: 2159.29 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:51:48,292 epoch 7 - iter 240/242 - loss 0.02769134 - time (sec): 11.23 - samples/sec: 2184.70 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:51:48,383 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:48,383 EPOCH 7 done: loss 0.0275 - lr: 0.000017
2023-10-13 10:51:49,207 DEV : loss 0.2040817141532898 - f1-score (micro avg) 0.831
2023-10-13 10:51:49,217 ----------------------------------------------------------------------------------------------------
2023-10-13 10:51:50,362 epoch 8 - iter 24/242 - loss 0.01921950 - time (sec): 1.14 - samples/sec: 2132.67 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:51:51,591 epoch 8 - iter 48/242 - loss 0.01863565 - time (sec): 2.37 - samples/sec: 2075.05 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:51:53,313 epoch 8 - iter 72/242 - loss 0.02060445 - time (sec): 4.09 - samples/sec: 1736.59 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:51:54,569 epoch 8 - iter 96/242 - loss 0.01812814 - time (sec): 5.35 - samples/sec: 1761.54 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:51:55,815 epoch 8 - iter 120/242 - loss 0.01571944 - time (sec): 6.60 - samples/sec: 1811.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:51:57,047 epoch 8 - iter 144/242 - loss 0.02171685 - time (sec): 7.83 - samples/sec: 1836.88 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:51:58,267 epoch 8 - iter 168/242 - loss 0.02112671 - time (sec): 9.05 - samples/sec: 1883.52 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:51:59,440 epoch 8 - iter 192/242 - loss 0.02108873 - time (sec): 10.22 - samples/sec: 1907.39 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:52:00,529 epoch 8 - iter 216/242 - loss 0.02081705 - time (sec): 11.31 - samples/sec: 1921.10 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:52:01,621 epoch 8 - iter 240/242 - loss 0.01877028 - time (sec): 12.40 - samples/sec: 1978.84 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:52:01,704 ----------------------------------------------------------------------------------------------------
2023-10-13 10:52:01,704 EPOCH 8 done: loss 0.0186 - lr: 0.000011
2023-10-13 10:52:02,503 DEV : loss 0.21222510933876038 - f1-score (micro avg) 0.8383
2023-10-13 10:52:02,510 ----------------------------------------------------------------------------------------------------
2023-10-13 10:52:03,691 epoch 9 - iter 24/242 - loss 0.01756099 - time (sec): 1.18 - samples/sec: 2110.86 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:52:04,793 epoch 9 - iter 48/242 - loss 0.01518575 - time (sec): 2.28 - samples/sec: 2137.22 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:52:05,909 epoch 9 - iter 72/242 - loss 0.01436921 - time (sec): 3.40 - samples/sec: 2220.55 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:52:07,017 epoch 9 - iter 96/242 - loss 0.01396487 - time (sec): 4.51 - samples/sec: 2269.80 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:52:08,101 epoch 9 - iter 120/242 - loss 0.01865794 - time (sec): 5.59 - samples/sec: 2290.18 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:52:09,184 epoch 9 - iter 144/242 - loss 0.01648648 - time (sec): 6.67 - samples/sec: 2307.38 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:52:10,264 epoch 9 - iter 168/242 - loss 0.01685316 - time (sec): 7.75 - samples/sec: 2310.84 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:52:11,326 epoch 9 - iter 192/242 - loss 0.01595151 - time (sec): 8.81 - samples/sec: 2290.72 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:52:12,403 epoch 9 - iter 216/242 - loss 0.01624248 - time (sec): 9.89 - samples/sec: 2245.34 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:52:13,452 epoch 9 - iter 240/242 - loss 0.01522701 - time (sec): 10.94 - samples/sec: 2241.32 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:52:13,548 ----------------------------------------------------------------------------------------------------
2023-10-13 10:52:13,549 EPOCH 9 done: loss 0.0151 - lr: 0.000006
2023-10-13 10:52:14,316 DEV : loss 0.21548904478549957 - f1-score (micro avg) 0.8348
2023-10-13 10:52:14,321 ----------------------------------------------------------------------------------------------------
2023-10-13 10:52:15,390 epoch 10 - iter 24/242 - loss 0.00181269 - time (sec): 1.07 - samples/sec: 2185.04 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:52:16,465 epoch 10 - iter 48/242 - loss 0.00363681 - time (sec): 2.14 - samples/sec: 2287.64 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:52:17,569 epoch 10 - iter 72/242 - loss 0.00363198 - time (sec): 3.25 - samples/sec: 2339.47 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:52:18,622 epoch 10 - iter 96/242 - loss 0.00363173 - time (sec): 4.30 - samples/sec: 2309.91 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:52:19,702 epoch 10 - iter 120/242 - loss 0.00355786 - time (sec): 5.38 - samples/sec: 2279.61 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:52:20,846 epoch 10 - iter 144/242 - loss 0.00366563 - time (sec): 6.52 - samples/sec: 2253.79 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:52:21,951 epoch 10 - iter 168/242 - loss 0.00712674 - time (sec): 7.63 - samples/sec: 2237.09 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:52:23,005 epoch 10 - iter 192/242 - loss 0.00881386 - time (sec): 8.68 - samples/sec: 2253.37 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:52:24,086 epoch 10 - iter 216/242 - loss 0.00798649 - time (sec): 9.76 - samples/sec: 2252.77 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:52:25,155 epoch 10 - iter 240/242 - loss 0.01012614 - time (sec): 10.83 - samples/sec: 2267.63 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:52:25,238 ----------------------------------------------------------------------------------------------------
2023-10-13 10:52:25,238 EPOCH 10 done: loss 0.0101 - lr: 0.000000
2023-10-13 10:52:26,058 DEV : loss 0.2147977650165558 - f1-score (micro avg) 0.8383
2023-10-13 10:52:26,463 ----------------------------------------------------------------------------------------------------
2023-10-13 10:52:26,464 Loading model from best epoch ...
2023-10-13 10:52:27,974 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 10:52:28,810 
Results:
- F-score (micro) 0.8264
- F-score (macro) 0.4735
- Accuracy 0.7207

By class:
              precision    recall  f1-score   support

        pers     0.8400    0.9065    0.8720       139
       scope     0.8333    0.8915    0.8614       129
        work     0.6957    0.8000    0.7442        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3
      object     0.0000    0.0000    0.0000         0

   micro avg     0.8016    0.8528    0.8264       360
   macro avg     0.5615    0.4700    0.4735       360
weighted avg     0.8025    0.8528    0.8198       360

2023-10-13 10:52:28,810 ----------------------------------------------------------------------------------------------------