2023-10-13 10:47:53,859 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,860 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:47:53,860 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,860 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:47:53,860 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,860 Train: 966 sentences
2023-10-13 10:47:53,860 (train_with_dev=False, train_with_test=False)
2023-10-13 10:47:53,860 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,860 Training Params:
2023-10-13 10:47:53,860 - learning_rate: "3e-05"
2023-10-13 10:47:53,860 - mini_batch_size: "4"
2023-10-13 10:47:53,861 - max_epochs: "10"
2023-10-13 10:47:53,861 - shuffle: "True"
2023-10-13 10:47:53,861 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,861 Plugins:
2023-10-13 10:47:53,861 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:47:53,861 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,861 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:47:53,861 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:47:53,861 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,861 Computation:
2023-10-13 10:47:53,861 - compute on device: cuda:0
2023-10-13 10:47:53,861 - embedding storage: none
2023-10-13 10:47:53,861 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,861 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 10:47:53,861 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:53,861 ----------------------------------------------------------------------------------------------------
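The header above lists a learning rate of 3e-05, a mini-batch size of 4, 10 epochs over 966 training sentences, and a LinearScheduler with warmup_fraction 0.1. A minimal sketch of a generic linear warmup followed by linear decay is given below. The function name `linear_warmup_lr` is hypothetical, and treating this formula as what the plugin computes is an assumption; it is included because it reproduces the lr column printed in the iteration lines of this log.

```python
import math

def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Generic linear warmup then linear decay (assumed shape, not taken from Flair's source)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Values taken from the log header.
train_sentences = 966
mini_batch_size = 4
max_epochs = 10
peak_lr = 3e-05

steps_per_epoch = math.ceil(train_sentences / mini_batch_size)  # matches "iter N/242"
total_steps = steps_per_epoch * max_epochs

def lr_at(epoch, iteration):
    """Learning rate after `iteration` batches of 1-based `epoch`, formatted like the log."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_warmup_lr(step, peak_lr, total_steps, 0.1):.6f}"

if __name__ == "__main__":
    for epoch, iteration in [(1, 24), (1, 240), (2, 48), (5, 24), (10, 192)]:
        print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {lr_at(epoch, iteration)}")
```

Under these assumptions the warmup spans exactly the first epoch (242 of 2420 steps), which is consistent with the lr peaking at 0.000030 at the end of epoch 1 and decaying to 0.000000 by the end of epoch 10.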
2023-10-13 10:47:55,015 epoch 1 - iter 24/242 - loss 3.13003262 - time (sec): 1.15 - samples/sec: 2161.80 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:47:56,148 epoch 1 - iter 48/242 - loss 2.75856781 - time (sec): 2.29 - samples/sec: 2180.16 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:47:57,299 epoch 1 - iter 72/242 - loss 2.19092998 - time (sec): 3.44 - samples/sec: 2154.66 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:47:58,447 epoch 1 - iter 96/242 - loss 1.77854119 - time (sec): 4.59 - samples/sec: 2200.86 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:47:59,545 epoch 1 - iter 120/242 - loss 1.55395191 - time (sec): 5.68 - samples/sec: 2176.25 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:48:00,641 epoch 1 - iter 144/242 - loss 1.38979327 - time (sec): 6.78 - samples/sec: 2156.02 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:48:01,874 epoch 1 - iter 168/242 - loss 1.25407378 - time (sec): 8.01 - samples/sec: 2150.68 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:48:03,164 epoch 1 - iter 192/242 - loss 1.13701640 - time (sec): 9.30 - samples/sec: 2116.67 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:48:04,450 epoch 1 - iter 216/242 - loss 1.04657703 - time (sec): 10.59 - samples/sec: 2074.49 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:48:05,698 epoch 1 - iter 240/242 - loss 0.96086965 - time (sec): 11.84 - samples/sec: 2079.63 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:48:05,794 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:05,794 EPOCH 1 done: loss 0.9579 - lr: 0.000030
2023-10-13 10:48:06,820 DEV : loss 0.2311297208070755 - f1-score (micro avg) 0.5526
2023-10-13 10:48:06,827 saving best model
2023-10-13 10:48:07,175 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:08,412 epoch 2 - iter 24/242 - loss 0.19252288 - time (sec): 1.24 - samples/sec: 1890.40 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:48:09,585 epoch 2 - iter 48/242 - loss 0.19820910 - time (sec): 2.41 - samples/sec: 2130.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:48:10,665 epoch 2 - iter 72/242 - loss 0.22613094 - time (sec): 3.49 - samples/sec: 2173.82 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:48:11,757 epoch 2 - iter 96/242 - loss 0.21088375 - time (sec): 4.58 - samples/sec: 2193.77 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:48:12,828 epoch 2 - iter 120/242 - loss 0.20564494 - time (sec): 5.65 - samples/sec: 2189.72 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:48:13,904 epoch 2 - iter 144/242 - loss 0.19855435 - time (sec): 6.73 - samples/sec: 2231.35 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:48:14,998 epoch 2 - iter 168/242 - loss 0.19107056 - time (sec): 7.82 - samples/sec: 2248.92 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:48:16,041 epoch 2 - iter 192/242 - loss 0.19275419 - time (sec): 8.86 - samples/sec: 2224.91 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:48:17,056 epoch 2 - iter 216/242 - loss 0.18972065 - time (sec): 9.88 - samples/sec: 2227.16 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:48:18,077 epoch 2 - iter 240/242 - loss 0.18183087 - time (sec): 10.90 - samples/sec: 2253.42 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:48:18,160 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:18,160 EPOCH 2 done: loss 0.1813 - lr: 0.000027
2023-10-13 10:48:18,923 DEV : loss 0.1384759545326233 - f1-score (micro avg) 0.797
2023-10-13 10:48:18,927 saving best model
2023-10-13 10:48:19,386 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:20,511 epoch 3 - iter 24/242 - loss 0.11716937 - time (sec): 1.12 - samples/sec: 2225.32 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:48:21,596 epoch 3 - iter 48/242 - loss 0.10872166 - time (sec): 2.21 - samples/sec: 2158.68 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:48:22,672 epoch 3 - iter 72/242 - loss 0.10620498 - time (sec): 3.28 - samples/sec: 2178.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:48:23,739 epoch 3 - iter 96/242 - loss 0.11615336 - time (sec): 4.35 - samples/sec: 2193.15 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:48:24,794 epoch 3 - iter 120/242 - loss 0.11830967 - time (sec): 5.41 - samples/sec: 2205.38 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:48:25,875 epoch 3 - iter 144/242 - loss 0.11441029 - time (sec): 6.49 - samples/sec: 2235.60 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:48:26,948 epoch 3 - iter 168/242 - loss 0.10605450 - time (sec): 7.56 - samples/sec: 2240.85 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:48:28,024 epoch 3 - iter 192/242 - loss 0.10357313 - time (sec): 8.64 - samples/sec: 2269.02 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:48:29,109 epoch 3 - iter 216/242 - loss 0.10387747 - time (sec): 9.72 - samples/sec: 2239.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:48:30,228 epoch 3 - iter 240/242 - loss 0.10206518 - time (sec): 10.84 - samples/sec: 2270.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:48:30,312 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:30,312 EPOCH 3 done: loss 0.1020 - lr: 0.000023
2023-10-13 10:48:31,102 DEV : loss 0.13693803548812866 - f1-score (micro avg) 0.8128
2023-10-13 10:48:31,107 saving best model
2023-10-13 10:48:31,579 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:32,692 epoch 4 - iter 24/242 - loss 0.05483466 - time (sec): 1.11 - samples/sec: 2277.02 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:48:33,787 epoch 4 - iter 48/242 - loss 0.06636085 - time (sec): 2.20 - samples/sec: 2195.33 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:48:34,899 epoch 4 - iter 72/242 - loss 0.05503188 - time (sec): 3.32 - samples/sec: 2227.16 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:48:35,985 epoch 4 - iter 96/242 - loss 0.06042093 - time (sec): 4.40 - samples/sec: 2291.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:48:37,093 epoch 4 - iter 120/242 - loss 0.07036762 - time (sec): 5.51 - samples/sec: 2266.87 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:48:38,172 epoch 4 - iter 144/242 - loss 0.06852479 - time (sec): 6.59 - samples/sec: 2218.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:48:39,278 epoch 4 - iter 168/242 - loss 0.07130847 - time (sec): 7.69 - samples/sec: 2209.89 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:48:40,350 epoch 4 - iter 192/242 - loss 0.07430387 - time (sec): 8.77 - samples/sec: 2230.60 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:48:41,411 epoch 4 - iter 216/242 - loss 0.07667453 - time (sec): 9.83 - samples/sec: 2252.68 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:48:42,477 epoch 4 - iter 240/242 - loss 0.07606467 - time (sec): 10.89 - samples/sec: 2265.12 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:48:42,561 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:42,561 EPOCH 4 done: loss 0.0759 - lr: 0.000020
2023-10-13 10:48:43,344 DEV : loss 0.14181385934352875 - f1-score (micro avg) 0.8294
2023-10-13 10:48:43,349 saving best model
2023-10-13 10:48:43,819 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:45,012 epoch 5 - iter 24/242 - loss 0.04479677 - time (sec): 1.18 - samples/sec: 2163.10 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:48:46,151 epoch 5 - iter 48/242 - loss 0.04692686 - time (sec): 2.32 - samples/sec: 2227.84 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:48:47,280 epoch 5 - iter 72/242 - loss 0.05314195 - time (sec): 3.45 - samples/sec: 2217.21 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:48:48,436 epoch 5 - iter 96/242 - loss 0.05034553 - time (sec): 4.61 - samples/sec: 2169.73 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:48:49,503 epoch 5 - iter 120/242 - loss 0.05228931 - time (sec): 5.67 - samples/sec: 2198.25 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:48:50,569 epoch 5 - iter 144/242 - loss 0.05391865 - time (sec): 6.74 - samples/sec: 2187.02 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:48:51,665 epoch 5 - iter 168/242 - loss 0.05744337 - time (sec): 7.84 - samples/sec: 2170.23 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:48:52,753 epoch 5 - iter 192/242 - loss 0.05935951 - time (sec): 8.92 - samples/sec: 2181.24 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:48:53,838 epoch 5 - iter 216/242 - loss 0.05699843 - time (sec): 10.01 - samples/sec: 2211.94 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:48:54,907 epoch 5 - iter 240/242 - loss 0.05727682 - time (sec): 11.08 - samples/sec: 2222.44 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:48:54,993 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:54,993 EPOCH 5 done: loss 0.0579 - lr: 0.000017
2023-10-13 10:48:55,769 DEV : loss 0.14726245403289795 - f1-score (micro avg) 0.8434
2023-10-13 10:48:55,774 saving best model
2023-10-13 10:48:56,239 ----------------------------------------------------------------------------------------------------
2023-10-13 10:48:57,364 epoch 6 - iter 24/242 - loss 0.05146366 - time (sec): 1.12 - samples/sec: 2100.14 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:48:58,596 epoch 6 - iter 48/242 - loss 0.05125553 - time (sec): 2.35 - samples/sec: 2083.72 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:48:59,752 epoch 6 - iter 72/242 - loss 0.05035674 - time (sec): 3.51 - samples/sec: 1990.82 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:49:00,948 epoch 6 - iter 96/242 - loss 0.04704035 - time (sec): 4.71 - samples/sec: 2095.76 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:49:02,045 epoch 6 - iter 120/242 - loss 0.04175777 - time (sec): 5.80 - samples/sec: 2132.85 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:49:03,120 epoch 6 - iter 144/242 - loss 0.04166679 - time (sec): 6.88 - samples/sec: 2145.67 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:49:04,186 epoch 6 - iter 168/242 - loss 0.03841489 - time (sec): 7.94 - samples/sec: 2162.85 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:49:05,276 epoch 6 - iter 192/242 - loss 0.03596465 - time (sec): 9.03 - samples/sec: 2189.29 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:49:06,351 epoch 6 - iter 216/242 - loss 0.04193606 - time (sec): 10.11 - samples/sec: 2201.00 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:49:07,421 epoch 6 - iter 240/242 - loss 0.04253577 - time (sec): 11.18 - samples/sec: 2198.84 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:49:07,509 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:07,509 EPOCH 6 done: loss 0.0429 - lr: 0.000013
2023-10-13 10:49:08,299 DEV : loss 0.17159917950630188 - f1-score (micro avg) 0.8356
2023-10-13 10:49:08,306 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:09,399 epoch 7 - iter 24/242 - loss 0.03111648 - time (sec): 1.09 - samples/sec: 2193.15 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:49:10,493 epoch 7 - iter 48/242 - loss 0.02717241 - time (sec): 2.19 - samples/sec: 2215.80 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:49:11,571 epoch 7 - iter 72/242 - loss 0.02752756 - time (sec): 3.26 - samples/sec: 2329.21 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:49:12,685 epoch 7 - iter 96/242 - loss 0.03236364 - time (sec): 4.38 - samples/sec: 2313.01 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:49:13,767 epoch 7 - iter 120/242 - loss 0.02918724 - time (sec): 5.46 - samples/sec: 2301.63 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:49:14,828 epoch 7 - iter 144/242 - loss 0.03104367 - time (sec): 6.52 - samples/sec: 2266.30 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:49:15,885 epoch 7 - iter 168/242 - loss 0.03112205 - time (sec): 7.58 - samples/sec: 2233.37 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:49:16,961 epoch 7 - iter 192/242 - loss 0.03365939 - time (sec): 8.65 - samples/sec: 2260.23 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:49:17,975 epoch 7 - iter 216/242 - loss 0.03144889 - time (sec): 9.67 - samples/sec: 2268.47 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:49:18,991 epoch 7 - iter 240/242 - loss 0.03122881 - time (sec): 10.68 - samples/sec: 2296.18 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:49:19,079 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:19,079 EPOCH 7 done: loss 0.0313 - lr: 0.000010
2023-10-13 10:49:19,885 DEV : loss 0.18848736584186554 - f1-score (micro avg) 0.8409
2023-10-13 10:49:19,891 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:20,932 epoch 8 - iter 24/242 - loss 0.01927510 - time (sec): 1.04 - samples/sec: 2347.61 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:49:22,005 epoch 8 - iter 48/242 - loss 0.02463260 - time (sec): 2.11 - samples/sec: 2330.37 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:49:23,076 epoch 8 - iter 72/242 - loss 0.02726866 - time (sec): 3.18 - samples/sec: 2233.21 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:49:24,239 epoch 8 - iter 96/242 - loss 0.02300393 - time (sec): 4.35 - samples/sec: 2168.54 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:49:25,323 epoch 8 - iter 120/242 - loss 0.02083412 - time (sec): 5.43 - samples/sec: 2200.59 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:49:26,392 epoch 8 - iter 144/242 - loss 0.02701189 - time (sec): 6.50 - samples/sec: 2212.51 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:49:27,474 epoch 8 - iter 168/242 - loss 0.02613039 - time (sec): 7.58 - samples/sec: 2248.12 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:49:28,563 epoch 8 - iter 192/242 - loss 0.02688347 - time (sec): 8.67 - samples/sec: 2248.58 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:49:29,612 epoch 8 - iter 216/242 - loss 0.02593304 - time (sec): 9.72 - samples/sec: 2235.60 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:49:30,710 epoch 8 - iter 240/242 - loss 0.02401868 - time (sec): 10.82 - samples/sec: 2268.81 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:49:30,798 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:30,799 EPOCH 8 done: loss 0.0240 - lr: 0.000007
2023-10-13 10:49:31,591 DEV : loss 0.20379725098609924 - f1-score (micro avg) 0.8309
2023-10-13 10:49:31,603 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:32,940 epoch 9 - iter 24/242 - loss 0.01808826 - time (sec): 1.34 - samples/sec: 1864.21 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:49:34,165 epoch 9 - iter 48/242 - loss 0.01843967 - time (sec): 2.56 - samples/sec: 1904.33 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:49:35,239 epoch 9 - iter 72/242 - loss 0.01893930 - time (sec): 3.63 - samples/sec: 2075.55 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:49:36,320 epoch 9 - iter 96/242 - loss 0.01931122 - time (sec): 4.72 - samples/sec: 2168.66 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:49:37,392 epoch 9 - iter 120/242 - loss 0.02302219 - time (sec): 5.79 - samples/sec: 2212.05 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:49:38,448 epoch 9 - iter 144/242 - loss 0.02045805 - time (sec): 6.84 - samples/sec: 2249.74 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:49:39,497 epoch 9 - iter 168/242 - loss 0.01946463 - time (sec): 7.89 - samples/sec: 2269.78 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:49:40,554 epoch 9 - iter 192/242 - loss 0.01891538 - time (sec): 8.95 - samples/sec: 2256.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:49:41,607 epoch 9 - iter 216/242 - loss 0.01897938 - time (sec): 10.00 - samples/sec: 2220.33 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:49:42,685 epoch 9 - iter 240/242 - loss 0.01792783 - time (sec): 11.08 - samples/sec: 2213.06 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:49:42,784 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:42,784 EPOCH 9 done: loss 0.0178 - lr: 0.000003
2023-10-13 10:49:43,596 DEV : loss 0.21329720318317413 - f1-score (micro avg) 0.8264
2023-10-13 10:49:43,602 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:44,626 epoch 10 - iter 24/242 - loss 0.01256773 - time (sec): 1.02 - samples/sec: 2279.91 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:49:45,674 epoch 10 - iter 48/242 - loss 0.01101680 - time (sec): 2.07 - samples/sec: 2366.36 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:49:46,762 epoch 10 - iter 72/242 - loss 0.01117024 - time (sec): 3.16 - samples/sec: 2404.12 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:49:47,795 epoch 10 - iter 96/242 - loss 0.00960110 - time (sec): 4.19 - samples/sec: 2368.89 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:49:48,839 epoch 10 - iter 120/242 - loss 0.00973317 - time (sec): 5.24 - samples/sec: 2341.92 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:49:49,916 epoch 10 - iter 144/242 - loss 0.00905820 - time (sec): 6.31 - samples/sec: 2329.13 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:49:51,017 epoch 10 - iter 168/242 - loss 0.01208288 - time (sec): 7.41 - samples/sec: 2301.57 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:49:52,155 epoch 10 - iter 192/242 - loss 0.01328828 - time (sec): 8.55 - samples/sec: 2287.93 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:49:53,270 epoch 10 - iter 216/242 - loss 0.01311083 - time (sec): 9.67 - samples/sec: 2275.11 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:49:54,342 epoch 10 - iter 240/242 - loss 0.01454627 - time (sec): 10.74 - samples/sec: 2287.32 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:49:54,429 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:54,429 EPOCH 10 done: loss 0.0144 - lr: 0.000000
2023-10-13 10:49:55,197 DEV : loss 0.21672213077545166 - f1-score (micro avg) 0.8341
2023-10-13 10:49:55,589 ----------------------------------------------------------------------------------------------------
2023-10-13 10:49:55,590 Loading model from best epoch ...
2023-10-13 10:49:57,041 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 10:49:57,941
Results:
- F-score (micro) 0.8136
- F-score (macro) 0.5894
- Accuracy 0.7002
By class:
              precision    recall  f1-score   support

        pers     0.8333    0.8633    0.8481       139
       scope     0.8029    0.8527    0.8271       129
        work     0.7253    0.8250    0.7719        80
         loc     1.0000    0.3333    0.5000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7973    0.8306    0.8136       360
   macro avg     0.6723    0.5749    0.5894       360
weighted avg     0.7956    0.8306    0.8078       360
2023-10-13 10:49:57,942 ----------------------------------------------------------------------------------------------------
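The summary rows of the final report can be re-derived from the per-class rows. The sketch below is not part of the training run; it only copies the precision, recall, and support values printed in the "By class" table and recomputes the F1 scores with the standard harmonic-mean formula. Variable names are illustrative, and small differences in the fourth decimal place are expected because the inputs are already rounded.

```python
# Per-class rows copied from the "By class" table: (precision, recall, support).
rows = {
    "pers": (0.8333, 0.8633, 139),
    "scope": (0.8029, 0.8527, 129),
    "work": (0.7253, 0.8250, 80),
    "loc": (1.0000, 0.3333, 9),
    "date": (0.0000, 0.0000, 3),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

per_class_f1 = {name: f1(p, r) for name, (p, r, _) in rows.items()}
total_support = sum(support for _, _, support in rows.values())

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Weighted F1 weights each class by its support.
weighted_f1 = sum(per_class_f1[name] * support
                  for name, (_, _, support) in rows.items()) / total_support

# Micro F1 is computed from the pooled precision and recall in the "micro avg" row.
micro_f1 = f1(0.7973, 0.8306)

if __name__ == "__main__":
    print(f"micro F1    {micro_f1:.4f}")
    print(f"macro F1    {macro_f1:.4f}")
    print(f"weighted F1 {weighted_f1:.4f}")
```

The gap between the micro and macro scores comes from the two low-support classes: loc (9 mentions) and date (3 mentions) contribute F1 values of 0.5 and 0.0 to the macro average at full weight, while barely affecting the pooled micro score.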