2023-10-13 10:39:10,664 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,665 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:39:10,665 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,665 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:39:10,665 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,665 Train: 966 sentences
2023-10-13 10:39:10,665 (train_with_dev=False, train_with_test=False)
2023-10-13 10:39:10,665 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,665 Training Params:
2023-10-13 10:39:10,665 - learning_rate: "3e-05"
2023-10-13 10:39:10,665 - mini_batch_size: "4"
2023-10-13 10:39:10,665 - max_epochs: "10"
2023-10-13 10:39:10,665 - shuffle: "True"
2023-10-13 10:39:10,665 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,665 Plugins:
2023-10-13 10:39:10,665 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:39:10,665 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,665 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:39:10,666 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:39:10,666 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,666 Computation:
2023-10-13 10:39:10,666 - compute on device: cuda:0
2023-10-13 10:39:10,666 - embedding storage: none
2023-10-13 10:39:10,666 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,666 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 10:39:10,666 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:10,666 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:11,896 epoch 1 - iter 24/242 - loss 3.50505954 - time (sec): 1.23 - samples/sec: 1950.43 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:39:13,180 epoch 1 - iter 48/242 - loss 3.14399187 - time (sec): 2.51 - samples/sec: 1969.96 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:39:14,476 epoch 1 - iter 72/242 - loss 2.59604288 - time (sec): 3.81 - samples/sec: 1835.95 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:39:15,776 epoch 1 - iter 96/242 - loss 2.03807786 - time (sec): 5.11 - samples/sec: 1920.10 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:39:17,045 epoch 1 - iter 120/242 - loss 1.74113671 - time (sec): 6.38 - samples/sec: 1928.70 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:39:18,336 epoch 1 - iter 144/242 - loss 1.53224402 - time (sec): 7.67 - samples/sec: 1926.98 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:39:19,606 epoch 1 - iter 168/242 - loss 1.37787216 - time (sec): 8.94 - samples/sec: 1923.62 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:39:20,891 epoch 1 - iter 192/242 - loss 1.25203869 - time (sec): 10.22 - samples/sec: 1934.86 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:39:22,196 epoch 1 - iter 216/242 - loss 1.14976571 - time (sec): 11.53 - samples/sec: 1927.43 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:39:23,489 epoch 1 - iter 240/242 - loss 1.06447731 - time (sec): 12.82 - samples/sec: 1925.63 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:39:23,592 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:23,592 EPOCH 1 done: loss 1.0638 - lr: 0.000030
2023-10-13 10:39:24,211 DEV : loss 0.3004940450191498 - f1-score (micro avg) 0.4417
2023-10-13 10:39:24,217 saving best model
2023-10-13 10:39:24,635 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:25,904 epoch 2 - iter 24/242 - loss 0.25209222 - time (sec): 1.27 - samples/sec: 2046.37 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:39:27,177 epoch 2 - iter 48/242 - loss 0.25771553 - time (sec): 2.54 - samples/sec: 1923.26 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:39:28,598 epoch 2 - iter 72/242 - loss 0.24853386 - time (sec): 3.96 - samples/sec: 1820.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:39:29,891 epoch 2 - iter 96/242 - loss 0.23634609 - time (sec): 5.25 - samples/sec: 1884.57 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:39:31,188 epoch 2 - iter 120/242 - loss 0.23181754 - time (sec): 6.55 - samples/sec: 1900.93 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:39:32,444 epoch 2 - iter 144/242 - loss 0.22630778 - time (sec): 7.81 - samples/sec: 1894.36 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:39:33,675 epoch 2 - iter 168/242 - loss 0.21592368 - time (sec): 9.04 - samples/sec: 1893.44 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:39:34,850 epoch 2 - iter 192/242 - loss 0.21344229 - time (sec): 10.21 - samples/sec: 1906.72 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:39:35,964 epoch 2 - iter 216/242 - loss 0.20666082 - time (sec): 11.33 - samples/sec: 1941.42 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:39:37,104 epoch 2 - iter 240/242 - loss 0.20023393 - time (sec): 12.47 - samples/sec: 1971.16 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:39:37,196 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:37,197 EPOCH 2 done: loss 0.2005 - lr: 0.000027
2023-10-13 10:39:37,974 DEV : loss 0.14757825434207916 - f1-score (micro avg) 0.7767
2023-10-13 10:39:37,981 saving best model
2023-10-13 10:39:38,502 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:39,681 epoch 3 - iter 24/242 - loss 0.13100081 - time (sec): 1.18 - samples/sec: 1942.15 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:39:40,864 epoch 3 - iter 48/242 - loss 0.11968009 - time (sec): 2.36 - samples/sec: 1990.60 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:39:42,049 epoch 3 - iter 72/242 - loss 0.12717258 - time (sec): 3.54 - samples/sec: 2016.34 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:39:43,198 epoch 3 - iter 96/242 - loss 0.12377154 - time (sec): 4.69 - samples/sec: 2023.43 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:39:44,335 epoch 3 - iter 120/242 - loss 0.11871054 - time (sec): 5.83 - samples/sec: 2005.44 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:39:45,434 epoch 3 - iter 144/242 - loss 0.12012669 - time (sec): 6.93 - samples/sec: 2087.31 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:39:46,500 epoch 3 - iter 168/242 - loss 0.11717001 - time (sec): 8.00 - samples/sec: 2097.94 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:39:47,671 epoch 3 - iter 192/242 - loss 0.11359425 - time (sec): 9.17 - samples/sec: 2123.43 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:39:48,843 epoch 3 - iter 216/242 - loss 0.11445605 - time (sec): 10.34 - samples/sec: 2112.99 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:39:49,953 epoch 3 - iter 240/242 - loss 0.11245202 - time (sec): 11.45 - samples/sec: 2144.86 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:39:50,041 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:50,041 EPOCH 3 done: loss 0.1121 - lr: 0.000023
2023-10-13 10:39:50,912 DEV : loss 0.120076484978199 - f1-score (micro avg) 0.8311
2023-10-13 10:39:50,923 saving best model
2023-10-13 10:39:51,541 ----------------------------------------------------------------------------------------------------
2023-10-13 10:39:52,783 epoch 4 - iter 24/242 - loss 0.06349330 - time (sec): 1.24 - samples/sec: 2055.54 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:39:53,983 epoch 4 - iter 48/242 - loss 0.05407166 - time (sec): 2.44 - samples/sec: 2172.17 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:39:55,147 epoch 4 - iter 72/242 - loss 0.06279579 - time (sec): 3.60 - samples/sec: 2128.99 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:39:56,204 epoch 4 - iter 96/242 - loss 0.07193319 - time (sec): 4.66 - samples/sec: 2177.68 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:39:57,273 epoch 4 - iter 120/242 - loss 0.07394057 - time (sec): 5.73 - samples/sec: 2182.89 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:39:58,388 epoch 4 - iter 144/242 - loss 0.07232261 - time (sec): 6.84 - samples/sec: 2158.21 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:39:59,488 epoch 4 - iter 168/242 - loss 0.07504456 - time (sec): 7.94 - samples/sec: 2172.63 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:40:00,584 epoch 4 - iter 192/242 - loss 0.07469348 - time (sec): 9.04 - samples/sec: 2198.75 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:40:01,686 epoch 4 - iter 216/242 - loss 0.07453496 - time (sec): 10.14 - samples/sec: 2187.50 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:40:02,806 epoch 4 - iter 240/242 - loss 0.07654679 - time (sec): 11.26 - samples/sec: 2183.92 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:40:02,902 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:02,903 EPOCH 4 done: loss 0.0761 - lr: 0.000020
2023-10-13 10:40:03,668 DEV : loss 0.12180294096469879 - f1-score (micro avg) 0.8296
2023-10-13 10:40:03,673 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:04,854 epoch 5 - iter 24/242 - loss 0.06702398 - time (sec): 1.18 - samples/sec: 2076.38 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:40:06,069 epoch 5 - iter 48/242 - loss 0.05289707 - time (sec): 2.39 - samples/sec: 2133.43 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:40:07,153 epoch 5 - iter 72/242 - loss 0.05078674 - time (sec): 3.48 - samples/sec: 2160.22 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:40:08,259 epoch 5 - iter 96/242 - loss 0.04895896 - time (sec): 4.58 - samples/sec: 2211.24 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:40:09,303 epoch 5 - iter 120/242 - loss 0.05142016 - time (sec): 5.63 - samples/sec: 2204.15 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:40:10,390 epoch 5 - iter 144/242 - loss 0.05283804 - time (sec): 6.72 - samples/sec: 2209.95 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:40:11,454 epoch 5 - iter 168/242 - loss 0.05403169 - time (sec): 7.78 - samples/sec: 2214.73 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:40:12,550 epoch 5 - iter 192/242 - loss 0.05358624 - time (sec): 8.88 - samples/sec: 2214.36 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:40:13,631 epoch 5 - iter 216/242 - loss 0.05525948 - time (sec): 9.96 - samples/sec: 2240.49 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:40:14,709 epoch 5 - iter 240/242 - loss 0.05640588 - time (sec): 11.03 - samples/sec: 2230.11 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:40:14,797 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:14,797 EPOCH 5 done: loss 0.0561 - lr: 0.000017
2023-10-13 10:40:15,575 DEV : loss 0.13739150762557983 - f1-score (micro avg) 0.8377
2023-10-13 10:40:15,583 saving best model
2023-10-13 10:40:16,110 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:17,271 epoch 6 - iter 24/242 - loss 0.05642204 - time (sec): 1.16 - samples/sec: 1983.22 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:40:18,382 epoch 6 - iter 48/242 - loss 0.04450997 - time (sec): 2.27 - samples/sec: 2062.93 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:40:19,513 epoch 6 - iter 72/242 - loss 0.03984701 - time (sec): 3.40 - samples/sec: 2155.71 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:40:20,590 epoch 6 - iter 96/242 - loss 0.04303982 - time (sec): 4.48 - samples/sec: 2232.00 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:40:21,679 epoch 6 - iter 120/242 - loss 0.04228501 - time (sec): 5.57 - samples/sec: 2188.42 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:40:22,917 epoch 6 - iter 144/242 - loss 0.04297498 - time (sec): 6.81 - samples/sec: 2181.97 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:40:24,024 epoch 6 - iter 168/242 - loss 0.03939743 - time (sec): 7.91 - samples/sec: 2204.75 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:40:25,110 epoch 6 - iter 192/242 - loss 0.03998548 - time (sec): 9.00 - samples/sec: 2215.08 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:40:26,179 epoch 6 - iter 216/242 - loss 0.04177436 - time (sec): 10.07 - samples/sec: 2207.89 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:40:27,254 epoch 6 - iter 240/242 - loss 0.04298808 - time (sec): 11.14 - samples/sec: 2208.46 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:40:27,340 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:27,341 EPOCH 6 done: loss 0.0428 - lr: 0.000013
2023-10-13 10:40:28,148 DEV : loss 0.16435274481773376 - f1-score (micro avg) 0.8268
2023-10-13 10:40:28,154 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:29,255 epoch 7 - iter 24/242 - loss 0.05058947 - time (sec): 1.10 - samples/sec: 2145.04 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:40:30,331 epoch 7 - iter 48/242 - loss 0.03065735 - time (sec): 2.18 - samples/sec: 2224.83 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:40:31,495 epoch 7 - iter 72/242 - loss 0.03161440 - time (sec): 3.34 - samples/sec: 2132.98 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:40:32,793 epoch 7 - iter 96/242 - loss 0.03496312 - time (sec): 4.64 - samples/sec: 2097.09 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:40:34,029 epoch 7 - iter 120/242 - loss 0.03117756 - time (sec): 5.87 - samples/sec: 2127.87 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:40:35,111 epoch 7 - iter 144/242 - loss 0.03038171 - time (sec): 6.96 - samples/sec: 2146.08 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:40:36,226 epoch 7 - iter 168/242 - loss 0.03176321 - time (sec): 8.07 - samples/sec: 2148.23 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:40:37,297 epoch 7 - iter 192/242 - loss 0.03243985 - time (sec): 9.14 - samples/sec: 2152.90 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:40:38,358 epoch 7 - iter 216/242 - loss 0.03071036 - time (sec): 10.20 - samples/sec: 2178.76 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:40:39,427 epoch 7 - iter 240/242 - loss 0.03452261 - time (sec): 11.27 - samples/sec: 2180.68 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:40:39,516 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:39,516 EPOCH 7 done: loss 0.0344 - lr: 0.000010
2023-10-13 10:40:40,330 DEV : loss 0.1909765601158142 - f1-score (micro avg) 0.842
2023-10-13 10:40:40,341 saving best model
2023-10-13 10:40:40,875 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:42,190 epoch 8 - iter 24/242 - loss 0.02967753 - time (sec): 1.31 - samples/sec: 1813.07 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:40:43,442 epoch 8 - iter 48/242 - loss 0.03435432 - time (sec): 2.57 - samples/sec: 1747.34 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:40:44,676 epoch 8 - iter 72/242 - loss 0.03002380 - time (sec): 3.80 - samples/sec: 1824.05 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:40:45,929 epoch 8 - iter 96/242 - loss 0.02860239 - time (sec): 5.05 - samples/sec: 1919.22 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:40:46,989 epoch 8 - iter 120/242 - loss 0.02650884 - time (sec): 6.11 - samples/sec: 1987.37 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:40:48,038 epoch 8 - iter 144/242 - loss 0.02560617 - time (sec): 7.16 - samples/sec: 2048.68 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:40:49,119 epoch 8 - iter 168/242 - loss 0.02917173 - time (sec): 8.24 - samples/sec: 2059.97 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:40:50,196 epoch 8 - iter 192/242 - loss 0.02663108 - time (sec): 9.32 - samples/sec: 2061.82 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:40:51,322 epoch 8 - iter 216/242 - loss 0.02454383 - time (sec): 10.45 - samples/sec: 2106.54 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:40:52,417 epoch 8 - iter 240/242 - loss 0.02333313 - time (sec): 11.54 - samples/sec: 2132.82 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:40:52,508 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:52,508 EPOCH 8 done: loss 0.0232 - lr: 0.000007
2023-10-13 10:40:53,287 DEV : loss 0.18003101646900177 - f1-score (micro avg) 0.8333
2023-10-13 10:40:53,292 ----------------------------------------------------------------------------------------------------
2023-10-13 10:40:54,327 epoch 9 - iter 24/242 - loss 0.00889814 - time (sec): 1.03 - samples/sec: 2356.26 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:40:55,368 epoch 9 - iter 48/242 - loss 0.01381746 - time (sec): 2.08 - samples/sec: 2358.84 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:40:56,455 epoch 9 - iter 72/242 - loss 0.01312606 - time (sec): 3.16 - samples/sec: 2381.07 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:40:57,523 epoch 9 - iter 96/242 - loss 0.01363630 - time (sec): 4.23 - samples/sec: 2413.66 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:40:58,589 epoch 9 - iter 120/242 - loss 0.01636900 - time (sec): 5.30 - samples/sec: 2360.47 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:40:59,678 epoch 9 - iter 144/242 - loss 0.01971543 - time (sec): 6.38 - samples/sec: 2325.95 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:41:00,768 epoch 9 - iter 168/242 - loss 0.01805574 - time (sec): 7.48 - samples/sec: 2332.96 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:41:01,855 epoch 9 - iter 192/242 - loss 0.02089072 - time (sec): 8.56 - samples/sec: 2327.19 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:41:02,922 epoch 9 - iter 216/242 - loss 0.01991797 - time (sec): 9.63 - samples/sec: 2318.75 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:41:04,003 epoch 9 - iter 240/242 - loss 0.01827566 - time (sec): 10.71 - samples/sec: 2300.99 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:41:04,091 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:04,091 EPOCH 9 done: loss 0.0182 - lr: 0.000003
2023-10-13 10:41:04,862 DEV : loss 0.18278612196445465 - f1-score (micro avg) 0.8519
2023-10-13 10:41:04,867 saving best model
2023-10-13 10:41:05,405 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:06,503 epoch 10 - iter 24/242 - loss 0.02556896 - time (sec): 1.10 - samples/sec: 2225.76 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:41:07,598 epoch 10 - iter 48/242 - loss 0.01566657 - time (sec): 2.19 - samples/sec: 2349.00 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:41:08,700 epoch 10 - iter 72/242 - loss 0.01372692 - time (sec): 3.29 - samples/sec: 2251.17 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:41:09,841 epoch 10 - iter 96/242 - loss 0.01474622 - time (sec): 4.43 - samples/sec: 2233.41 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:41:10,936 epoch 10 - iter 120/242 - loss 0.01488522 - time (sec): 5.53 - samples/sec: 2237.68 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:41:12,019 epoch 10 - iter 144/242 - loss 0.01598091 - time (sec): 6.61 - samples/sec: 2254.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:41:13,078 epoch 10 - iter 168/242 - loss 0.01468703 - time (sec): 7.67 - samples/sec: 2243.53 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:41:14,137 epoch 10 - iter 192/242 - loss 0.01641705 - time (sec): 8.73 - samples/sec: 2249.28 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:41:15,209 epoch 10 - iter 216/242 - loss 0.01602209 - time (sec): 9.80 - samples/sec: 2239.47 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:41:16,297 epoch 10 - iter 240/242 - loss 0.01510578 - time (sec): 10.89 - samples/sec: 2259.77 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:41:16,380 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:16,380 EPOCH 10 done: loss 0.0150 - lr: 0.000000
2023-10-13 10:41:17,179 DEV : loss 0.18673823773860931 - f1-score (micro avg) 0.8462
2023-10-13 10:41:17,576 ----------------------------------------------------------------------------------------------------
2023-10-13 10:41:17,578 Loading model from best epoch ...
2023-10-13 10:41:19,469 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 10:41:20,129 Results:
- F-score (micro) 0.8178
- F-score (macro) 0.5511
- Accuracy 0.7113

By class:
              precision    recall  f1-score   support

        pers     0.8356    0.8777    0.8561       139
       scope     0.8603    0.9070    0.8830       129
        work     0.6526    0.7750    0.7086        80
         loc     0.5000    0.2222    0.3077         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7953    0.8417    0.8178       360
   macro avg     0.5697    0.5564    0.5511       360
weighted avg     0.7884    0.8417    0.8121       360

2023-10-13 10:41:20,129 ----------------------------------------------------------------------------------------------------
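The lr column in the per-iteration lines is explained by the LinearScheduler plugin with warmup_fraction 0.1: 966 train sentences at mini_batch_size 4 give 242 batches per epoch, so 2420 steps over 10 epochs, of which the first 242 (exactly epoch 1) ramp linearly up to the 3e-05 peak before a linear decay to zero. A minimal sketch of that schedule (the function name is ours, not Flair's; it follows the usual linear warmup/decay formula):

```python
def linear_warmup_decay(step: int,
                        peak_lr: float = 3e-05,
                        total_steps: int = 2420,    # 242 batches x 10 epochs
                        warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 242 = exactly epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values: ~0.000003 at epoch 1 / iter 24,
# 0.000030 at the end of epoch 1, 0.000000 at the end of epoch 10.
```

The zero momentum in every line is also consistent: Flair's fine-tuning setup uses AdamW, which has no SGD-style momentum to report.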
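The summary scores in the final block can be re-derived from the per-class table: micro F1 is the harmonic mean of the pooled precision (0.7953) and recall (0.8417), while macro F1 is the unweighted mean of the five per-class F1 scores, which is why the date class (support 3, F1 0.0) drags it far below the micro score. A quick sanity check using the table's numbers:

```python
# Per-class f1-scores copied from the final evaluation table.
f1_by_class = {"pers": 0.8561, "scope": 0.8830, "work": 0.7086,
               "loc": 0.3077, "date": 0.0000}

micro_p, micro_r = 0.7953, 0.8417
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)   # harmonic mean
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)  # unweighted mean

print(round(micro_f1, 4))  # reproduces the logged F-score (micro) 0.8178
print(round(macro_f1, 4))  # reproduces the logged F-score (macro) 0.5511
```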