2023-10-13 10:37:09,830 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,831 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 10:37:09,831 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,831 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:37:09,831 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,831 Train: 966 sentences
2023-10-13 10:37:09,831 (train_with_dev=False, train_with_test=False)
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,832 Training Params:
2023-10-13 10:37:09,832 - learning_rate: "5e-05"
2023-10-13 10:37:09,832 - mini_batch_size: "8"
2023-10-13 10:37:09,832 - max_epochs: "10"
2023-10-13 10:37:09,832 - shuffle: "True"
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,832 Plugins:
2023-10-13 10:37:09,832 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,832 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:37:09,832 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,832 Computation:
2023-10-13 10:37:09,832 - compute on device: cuda:0
2023-10-13 10:37:09,832 - embedding storage: none
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,832 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:09,832 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:10,544 epoch 1 - iter 12/121 - loss 3.37368518 - time (sec): 0.71 - samples/sec: 3283.62 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:37:11,390 epoch 1 - iter 24/121 - loss 3.10830298 - time (sec): 1.56 - samples/sec: 3220.70 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:37:12,150 epoch 1 - iter 36/121 - loss 2.59152083 - time (sec): 2.32 - samples/sec: 3304.19 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:37:12,878 epoch 1 - iter 48/121 - loss 2.14443777 - time (sec): 3.05 - samples/sec: 3304.14 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:37:13,546 epoch 1 - iter 60/121 - loss 1.89276798 - time (sec): 3.71 - samples/sec: 3291.30 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:37:14,312 epoch 1 - iter 72/121 - loss 1.66470130 - time (sec): 4.48 - samples/sec: 3319.18 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:37:15,049 epoch 1 - iter 84/121 - loss 1.50708292 - time (sec): 5.22 - samples/sec: 3315.88 - lr: 0.000034 - momentum: 0.000000
2023-10-13 10:37:15,866 epoch 1 - iter 96/121 - loss 1.37611929 - time (sec): 6.03 - samples/sec: 3297.89 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:37:16,566 epoch 1 - iter 108/121 - loss 1.26833368 - time (sec): 6.73 - samples/sec: 3295.61 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:37:17,359 epoch 1 - iter 120/121 - loss 1.17773191 - time (sec): 7.53 - samples/sec: 3262.56 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:37:17,413 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:17,413 EPOCH 1 done: loss 1.1712 - lr: 0.000049
2023-10-13 10:37:18,112 DEV : loss 0.3314380943775177 - f1-score (micro avg) 0.4677
2023-10-13 10:37:18,119 saving best model
2023-10-13 10:37:18,511 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:19,248 epoch 2 - iter 12/121 - loss 0.31576988 - time (sec): 0.73 - samples/sec: 3322.67 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:37:19,977 epoch 2 - iter 24/121 - loss 0.34306470 - time (sec): 1.46 - samples/sec: 3259.10 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:37:21,092 epoch 2 - iter 36/121 - loss 0.31292154 - time (sec): 2.58 - samples/sec: 2863.64 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:37:21,910 epoch 2 - iter 48/121 - loss 0.29648981 - time (sec): 3.40 - samples/sec: 2926.92 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:37:22,674 epoch 2 - iter 60/121 - loss 0.29468175 - time (sec): 4.16 - samples/sec: 2964.10 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:37:23,403 epoch 2 - iter 72/121 - loss 0.28361730 - time (sec): 4.89 - samples/sec: 2972.86 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:37:24,164 epoch 2 - iter 84/121 - loss 0.26817243 - time (sec): 5.65 - samples/sec: 3017.49 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:37:24,958 epoch 2 - iter 96/121 - loss 0.26411978 - time (sec): 6.45 - samples/sec: 3044.43 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:37:25,831 epoch 2 - iter 108/121 - loss 0.25546258 - time (sec): 7.32 - samples/sec: 3074.49 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:37:26,576 epoch 2 - iter 120/121 - loss 0.25199092 - time (sec): 8.06 - samples/sec: 3054.74 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:37:26,632 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:26,633 EPOCH 2 done: loss 0.2515 - lr: 0.000045
2023-10-13 10:37:27,440 DEV : loss 0.17766954004764557 - f1-score (micro avg) 0.6024
2023-10-13 10:37:27,446 saving best model
2023-10-13 10:37:27,979 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:28,755 epoch 3 - iter 12/121 - loss 0.22685084 - time (sec): 0.77 - samples/sec: 3188.83 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:37:29,562 epoch 3 - iter 24/121 - loss 0.18735251 - time (sec): 1.57 - samples/sec: 3168.75 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:37:30,421 epoch 3 - iter 36/121 - loss 0.16249113 - time (sec): 2.43 - samples/sec: 3039.43 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:37:31,178 epoch 3 - iter 48/121 - loss 0.15237792 - time (sec): 3.19 - samples/sec: 3045.43 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:37:31,938 epoch 3 - iter 60/121 - loss 0.14494580 - time (sec): 3.95 - samples/sec: 3086.89 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:37:32,747 epoch 3 - iter 72/121 - loss 0.13571616 - time (sec): 4.76 - samples/sec: 3083.83 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:37:33,491 epoch 3 - iter 84/121 - loss 0.14151344 - time (sec): 5.50 - samples/sec: 3107.08 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:37:34,282 epoch 3 - iter 96/121 - loss 0.13409286 - time (sec): 6.30 - samples/sec: 3127.02 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:37:35,097 epoch 3 - iter 108/121 - loss 0.13146597 - time (sec): 7.11 - samples/sec: 3118.39 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:37:35,888 epoch 3 - iter 120/121 - loss 0.13131231 - time (sec): 7.90 - samples/sec: 3112.81 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:37:35,943 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:35,944 EPOCH 3 done: loss 0.1318 - lr: 0.000039
2023-10-13 10:37:36,724 DEV : loss 0.1347845047712326 - f1-score (micro avg) 0.8042
2023-10-13 10:37:36,730 saving best model
2023-10-13 10:37:37,231 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:38,005 epoch 4 - iter 12/121 - loss 0.11228080 - time (sec): 0.77 - samples/sec: 3270.53 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:37:38,813 epoch 4 - iter 24/121 - loss 0.09026772 - time (sec): 1.58 - samples/sec: 3266.98 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:37:39,584 epoch 4 - iter 36/121 - loss 0.08089175 - time (sec): 2.35 - samples/sec: 3259.77 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:37:40,349 epoch 4 - iter 48/121 - loss 0.08447303 - time (sec): 3.12 - samples/sec: 3133.46 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:37:41,189 epoch 4 - iter 60/121 - loss 0.07925997 - time (sec): 3.96 - samples/sec: 3168.10 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:37:41,940 epoch 4 - iter 72/121 - loss 0.07889240 - time (sec): 4.71 - samples/sec: 3170.78 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:37:42,749 epoch 4 - iter 84/121 - loss 0.07996881 - time (sec): 5.52 - samples/sec: 3149.67 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:37:43,596 epoch 4 - iter 96/121 - loss 0.08050200 - time (sec): 6.36 - samples/sec: 3101.42 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:37:44,352 epoch 4 - iter 108/121 - loss 0.07822267 - time (sec): 7.12 - samples/sec: 3090.86 - lr: 0.000034 - momentum: 0.000000
2023-10-13 10:37:45,161 epoch 4 - iter 120/121 - loss 0.07942352 - time (sec): 7.93 - samples/sec: 3096.60 - lr: 0.000034 - momentum: 0.000000
2023-10-13 10:37:45,220 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:45,221 EPOCH 4 done: loss 0.0791 - lr: 0.000034
2023-10-13 10:37:45,993 DEV : loss 0.13045471906661987 - f1-score (micro avg) 0.828
2023-10-13 10:37:46,000 saving best model
2023-10-13 10:37:46,454 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:47,233 epoch 5 - iter 12/121 - loss 0.05702991 - time (sec): 0.78 - samples/sec: 3259.80 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:37:48,016 epoch 5 - iter 24/121 - loss 0.06750735 - time (sec): 1.56 - samples/sec: 3087.77 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:37:48,793 epoch 5 - iter 36/121 - loss 0.06733838 - time (sec): 2.34 - samples/sec: 3125.37 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:37:49,531 epoch 5 - iter 48/121 - loss 0.06395984 - time (sec): 3.07 - samples/sec: 3133.17 - lr: 0.000031 - momentum: 0.000000
2023-10-13 10:37:50,396 epoch 5 - iter 60/121 - loss 0.06148417 - time (sec): 3.94 - samples/sec: 3135.92 - lr: 0.000031 - momentum: 0.000000
2023-10-13 10:37:51,191 epoch 5 - iter 72/121 - loss 0.06256314 - time (sec): 4.73 - samples/sec: 3188.41 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:37:51,931 epoch 5 - iter 84/121 - loss 0.06116325 - time (sec): 5.47 - samples/sec: 3225.78 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:37:52,723 epoch 5 - iter 96/121 - loss 0.06002168 - time (sec): 6.26 - samples/sec: 3188.00 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:37:53,427 epoch 5 - iter 108/121 - loss 0.05950433 - time (sec): 6.97 - samples/sec: 3157.89 - lr: 0.000029 - momentum: 0.000000
2023-10-13 10:37:54,183 epoch 5 - iter 120/121 - loss 0.06115160 - time (sec): 7.73 - samples/sec: 3172.16 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:37:54,253 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:54,254 EPOCH 5 done: loss 0.0617 - lr: 0.000028
2023-10-13 10:37:55,052 DEV : loss 0.15638893842697144 - f1-score (micro avg) 0.7915
2023-10-13 10:37:55,058 ----------------------------------------------------------------------------------------------------
2023-10-13 10:37:55,839 epoch 6 - iter 12/121 - loss 0.03278663 - time (sec): 0.78 - samples/sec: 3268.01 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:37:56,584 epoch 6 - iter 24/121 - loss 0.04414064 - time (sec): 1.52 - samples/sec: 3061.85 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:37:57,362 epoch 6 - iter 36/121 - loss 0.04465714 - time (sec): 2.30 - samples/sec: 3052.83 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:37:58,115 epoch 6 - iter 48/121 - loss 0.04285887 - time (sec): 3.06 - samples/sec: 3059.45 - lr: 0.000026 - momentum: 0.000000
2023-10-13 10:37:58,922 epoch 6 - iter 60/121 - loss 0.04498895 - time (sec): 3.86 - samples/sec: 3119.11 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:37:59,658 epoch 6 - iter 72/121 - loss 0.04532703 - time (sec): 4.60 - samples/sec: 3159.96 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:38:00,462 epoch 6 - iter 84/121 - loss 0.04301846 - time (sec): 5.40 - samples/sec: 3185.44 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:38:01,218 epoch 6 - iter 96/121 - loss 0.04154085 - time (sec): 6.16 - samples/sec: 3238.35 - lr: 0.000024 - momentum: 0.000000
2023-10-13 10:38:02,003 epoch 6 - iter 108/121 - loss 0.04183211 - time (sec): 6.94 - samples/sec: 3235.49 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:38:02,702 epoch 6 - iter 120/121 - loss 0.04186647 - time (sec): 7.64 - samples/sec: 3221.47 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:38:02,761 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:02,761 EPOCH 6 done: loss 0.0418 - lr: 0.000022
2023-10-13 10:38:03,570 DEV : loss 0.14503213763237 - f1-score (micro avg) 0.824
2023-10-13 10:38:03,575 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:04,319 epoch 7 - iter 12/121 - loss 0.03044972 - time (sec): 0.74 - samples/sec: 2975.10 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:38:05,121 epoch 7 - iter 24/121 - loss 0.02231224 - time (sec): 1.54 - samples/sec: 2988.63 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:38:05,938 epoch 7 - iter 36/121 - loss 0.02247562 - time (sec): 2.36 - samples/sec: 3042.46 - lr: 0.000021 - momentum: 0.000000
2023-10-13 10:38:06,671 epoch 7 - iter 48/121 - loss 0.02698753 - time (sec): 3.10 - samples/sec: 3099.46 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:38:07,407 epoch 7 - iter 60/121 - loss 0.02945475 - time (sec): 3.83 - samples/sec: 3146.80 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:38:08,216 epoch 7 - iter 72/121 - loss 0.02952242 - time (sec): 4.64 - samples/sec: 3161.81 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:38:09,025 epoch 7 - iter 84/121 - loss 0.03063607 - time (sec): 5.45 - samples/sec: 3179.50 - lr: 0.000019 - momentum: 0.000000
2023-10-13 10:38:09,818 epoch 7 - iter 96/121 - loss 0.02801346 - time (sec): 6.24 - samples/sec: 3173.03 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:38:10,551 epoch 7 - iter 108/121 - loss 0.02699956 - time (sec): 6.98 - samples/sec: 3159.19 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:38:11,404 epoch 7 - iter 120/121 - loss 0.02753590 - time (sec): 7.83 - samples/sec: 3152.14 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:38:11,462 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:11,462 EPOCH 7 done: loss 0.0276 - lr: 0.000017
2023-10-13 10:38:12,274 DEV : loss 0.15853901207447052 - f1-score (micro avg) 0.8333
2023-10-13 10:38:12,279 saving best model
2023-10-13 10:38:12,794 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:13,585 epoch 8 - iter 12/121 - loss 0.01639123 - time (sec): 0.79 - samples/sec: 3098.15 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:38:14,399 epoch 8 - iter 24/121 - loss 0.01632867 - time (sec): 1.60 - samples/sec: 2877.12 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:38:15,152 epoch 8 - iter 36/121 - loss 0.01579476 - time (sec): 2.35 - samples/sec: 3001.16 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:38:15,874 epoch 8 - iter 48/121 - loss 0.01445154 - time (sec): 3.08 - samples/sec: 3023.25 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:38:16,727 epoch 8 - iter 60/121 - loss 0.01509986 - time (sec): 3.93 - samples/sec: 3096.65 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:38:17,460 epoch 8 - iter 72/121 - loss 0.01373362 - time (sec): 4.66 - samples/sec: 3139.63 - lr: 0.000014 - momentum: 0.000000
2023-10-13 10:38:18,225 epoch 8 - iter 84/121 - loss 0.01333895 - time (sec): 5.43 - samples/sec: 3122.22 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:38:18,994 epoch 8 - iter 96/121 - loss 0.01410936 - time (sec): 6.20 - samples/sec: 3157.16 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:38:19,748 epoch 8 - iter 108/121 - loss 0.01870261 - time (sec): 6.95 - samples/sec: 3181.03 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:38:20,514 epoch 8 - iter 120/121 - loss 0.01979377 - time (sec): 7.72 - samples/sec: 3189.10 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:38:20,569 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:20,569 EPOCH 8 done: loss 0.0197 - lr: 0.000011
2023-10-13 10:38:21,355 DEV : loss 0.17500457167625427 - f1-score (micro avg) 0.8308
2023-10-13 10:38:21,361 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:22,213 epoch 9 - iter 12/121 - loss 0.00825075 - time (sec): 0.85 - samples/sec: 3192.23 - lr: 0.000011 - momentum: 0.000000
2023-10-13 10:38:23,011 epoch 9 - iter 24/121 - loss 0.00809065 - time (sec): 1.65 - samples/sec: 3203.59 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:38:23,801 epoch 9 - iter 36/121 - loss 0.01100156 - time (sec): 2.44 - samples/sec: 3046.89 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:38:24,572 epoch 9 - iter 48/121 - loss 0.01511375 - time (sec): 3.21 - samples/sec: 3106.47 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:38:25,353 epoch 9 - iter 60/121 - loss 0.01313352 - time (sec): 3.99 - samples/sec: 3130.98 - lr: 0.000009 - momentum: 0.000000
2023-10-13 10:38:26,134 epoch 9 - iter 72/121 - loss 0.01775137 - time (sec): 4.77 - samples/sec: 3196.91 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:38:26,894 epoch 9 - iter 84/121 - loss 0.01634282 - time (sec): 5.53 - samples/sec: 3189.39 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:38:27,684 epoch 9 - iter 96/121 - loss 0.01607601 - time (sec): 6.32 - samples/sec: 3143.75 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:38:28,529 epoch 9 - iter 108/121 - loss 0.01541910 - time (sec): 7.17 - samples/sec: 3141.32 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:38:29,262 epoch 9 - iter 120/121 - loss 0.01452552 - time (sec): 7.90 - samples/sec: 3117.79 - lr: 0.000006 - momentum: 0.000000
2023-10-13 10:38:29,315 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:29,316 EPOCH 9 done: loss 0.0145 - lr: 0.000006
2023-10-13 10:38:30,183 DEV : loss 0.18977651000022888 - f1-score (micro avg) 0.8291
2023-10-13 10:38:30,189 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:30,924 epoch 10 - iter 12/121 - loss 0.02411538 - time (sec): 0.73 - samples/sec: 3081.40 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:38:31,702 epoch 10 - iter 24/121 - loss 0.02367473 - time (sec): 1.51 - samples/sec: 3104.66 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:38:32,515 epoch 10 - iter 36/121 - loss 0.02248721 - time (sec): 2.32 - samples/sec: 3130.56 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:38:33,292 epoch 10 - iter 48/121 - loss 0.01733333 - time (sec): 3.10 - samples/sec: 3192.98 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:38:34,064 epoch 10 - iter 60/121 - loss 0.01485533 - time (sec): 3.87 - samples/sec: 3204.82 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:38:34,826 epoch 10 - iter 72/121 - loss 0.01387192 - time (sec): 4.64 - samples/sec: 3146.95 - lr: 0.000003 - momentum: 0.000000
2023-10-13 10:38:35,572 epoch 10 - iter 84/121 - loss 0.01314430 - time (sec): 5.38 - samples/sec: 3131.59 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:38:36,354 epoch 10 - iter 96/121 - loss 0.01286429 - time (sec): 6.16 - samples/sec: 3102.54 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:38:37,127 epoch 10 - iter 108/121 - loss 0.01159406 - time (sec): 6.94 - samples/sec: 3136.98 - lr: 0.000001 - momentum: 0.000000
2023-10-13 10:38:38,023 epoch 10 - iter 120/121 - loss 0.01076949 - time (sec): 7.83 - samples/sec: 3132.60 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:38:38,078 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:38,079 EPOCH 10 done: loss 0.0107 - lr: 0.000000
2023-10-13 10:38:38,921 DEV : loss 0.1880086213350296 - f1-score (micro avg) 0.8252
2023-10-13 10:38:39,331 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:39,332 Loading model from best epoch ...
2023-10-13 10:38:41,014 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 10:38:42,075
Results:
- F-score (micro) 0.7979
- F-score (macro) 0.4599
- Accuracy 0.6865
By class:
              precision    recall  f1-score   support

        pers     0.8311    0.8849    0.8571       139
       scope     0.7603    0.8605    0.8073       129
        work     0.6737    0.8000    0.7314        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3
      object     0.0000    0.0000    0.0000         0

   micro avg     0.7653    0.8333    0.7979       360
   macro avg     0.5442    0.4613    0.4599       360
weighted avg     0.7680    0.8333    0.7919       360
2023-10-13 10:38:42,076 ----------------------------------------------------------------------------------------------------