|
2023-10-14 20:44:51,075 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,076 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 20:44:51,076 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,076 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences |
|
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator |
|
2023-10-14 20:44:51,076 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,076 Train: 14465 sentences |
|
2023-10-14 20:44:51,076 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 20:44:51,076 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,077 Training Params: |
|
2023-10-14 20:44:51,077 - learning_rate: "5e-05" |
|
2023-10-14 20:44:51,077 - mini_batch_size: "4" |
|
2023-10-14 20:44:51,077 - max_epochs: "10" |
|
2023-10-14 20:44:51,077 - shuffle: "True" |
|
2023-10-14 20:44:51,077 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,077 Plugins: |
|
2023-10-14 20:44:51,077 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 20:44:51,077 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,077 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 20:44:51,077 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 20:44:51,077 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,077 Computation: |
|
2023-10-14 20:44:51,077 - compute on device: cuda:0 |
|
2023-10-14 20:44:51,077 - embedding storage: none |
|
2023-10-14 20:44:51,077 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,077 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-14 20:44:51,077 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:44:51,077 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:45:08,016 epoch 1 - iter 361/3617 - loss 1.21049643 - time (sec): 16.94 - samples/sec: 2222.06 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 20:45:26,819 epoch 1 - iter 722/3617 - loss 0.69887718 - time (sec): 35.74 - samples/sec: 2111.51 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 20:45:43,412 epoch 1 - iter 1083/3617 - loss 0.52197079 - time (sec): 52.33 - samples/sec: 2129.26 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 20:45:59,782 epoch 1 - iter 1444/3617 - loss 0.42681095 - time (sec): 68.70 - samples/sec: 2161.97 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 20:46:16,622 epoch 1 - iter 1805/3617 - loss 0.36380391 - time (sec): 85.54 - samples/sec: 2201.08 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 20:46:32,999 epoch 1 - iter 2166/3617 - loss 0.32380187 - time (sec): 101.92 - samples/sec: 2217.20 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 20:46:49,761 epoch 1 - iter 2527/3617 - loss 0.29392513 - time (sec): 118.68 - samples/sec: 2240.07 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-14 20:47:05,904 epoch 1 - iter 2888/3617 - loss 0.27259528 - time (sec): 134.83 - samples/sec: 2251.82 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-14 20:47:22,100 epoch 1 - iter 3249/3617 - loss 0.25444813 - time (sec): 151.02 - samples/sec: 2258.31 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 20:47:38,309 epoch 1 - iter 3610/3617 - loss 0.24111479 - time (sec): 167.23 - samples/sec: 2267.41 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-14 20:47:38,617 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:47:38,617 EPOCH 1 done: loss 0.2409 - lr: 0.000050 |
|
2023-10-14 20:47:43,951 DEV : loss 0.12323478609323502 - f1-score (micro avg) 0.5865 |
|
2023-10-14 20:47:43,990 saving best model |
|
2023-10-14 20:47:44,376 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:48:00,735 epoch 2 - iter 361/3617 - loss 0.10632803 - time (sec): 16.36 - samples/sec: 2344.09 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-14 20:48:16,934 epoch 2 - iter 722/3617 - loss 0.10322207 - time (sec): 32.56 - samples/sec: 2320.21 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-14 20:48:33,240 epoch 2 - iter 1083/3617 - loss 0.10593850 - time (sec): 48.86 - samples/sec: 2307.09 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 20:48:50,075 epoch 2 - iter 1444/3617 - loss 0.10664188 - time (sec): 65.70 - samples/sec: 2323.73 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 20:49:06,509 epoch 2 - iter 1805/3617 - loss 0.10605572 - time (sec): 82.13 - samples/sec: 2311.63 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-14 20:49:22,832 epoch 2 - iter 2166/3617 - loss 0.10448677 - time (sec): 98.45 - samples/sec: 2324.11 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-14 20:49:39,417 epoch 2 - iter 2527/3617 - loss 0.10397977 - time (sec): 115.04 - samples/sec: 2330.95 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-14 20:49:55,725 epoch 2 - iter 2888/3617 - loss 0.10404474 - time (sec): 131.35 - samples/sec: 2325.67 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-14 20:50:11,903 epoch 2 - iter 3249/3617 - loss 0.10371473 - time (sec): 147.52 - samples/sec: 2313.23 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 20:50:28,258 epoch 2 - iter 3610/3617 - loss 0.10388679 - time (sec): 163.88 - samples/sec: 2314.48 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-14 20:50:28,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:50:28,578 EPOCH 2 done: loss 0.1038 - lr: 0.000044 |
|
2023-10-14 20:50:34,850 DEV : loss 0.13535423576831818 - f1-score (micro avg) 0.6246 |
|
2023-10-14 20:50:34,883 saving best model |
|
2023-10-14 20:50:35,338 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:50:52,031 epoch 3 - iter 361/3617 - loss 0.07083650 - time (sec): 16.69 - samples/sec: 2214.49 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-14 20:51:08,168 epoch 3 - iter 722/3617 - loss 0.07700446 - time (sec): 32.83 - samples/sec: 2276.17 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-14 20:51:24,969 epoch 3 - iter 1083/3617 - loss 0.08564511 - time (sec): 49.63 - samples/sec: 2271.86 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-14 20:51:41,074 epoch 3 - iter 1444/3617 - loss 0.08752690 - time (sec): 65.73 - samples/sec: 2285.24 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-14 20:51:57,079 epoch 3 - iter 1805/3617 - loss 0.08547374 - time (sec): 81.74 - samples/sec: 2299.19 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-14 20:52:13,234 epoch 3 - iter 2166/3617 - loss 0.08336422 - time (sec): 97.89 - samples/sec: 2311.65 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-14 20:52:29,645 epoch 3 - iter 2527/3617 - loss 0.08433411 - time (sec): 114.31 - samples/sec: 2322.61 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-14 20:52:46,055 epoch 3 - iter 2888/3617 - loss 0.08461852 - time (sec): 130.72 - samples/sec: 2322.03 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-14 20:53:04,567 epoch 3 - iter 3249/3617 - loss 0.08613219 - time (sec): 149.23 - samples/sec: 2288.28 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-14 20:53:23,351 epoch 3 - iter 3610/3617 - loss 0.08630119 - time (sec): 168.01 - samples/sec: 2257.30 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-14 20:53:23,704 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:53:23,705 EPOCH 3 done: loss 0.0863 - lr: 0.000039 |
|
2023-10-14 20:53:30,042 DEV : loss 0.20893967151641846 - f1-score (micro avg) 0.6222 |
|
2023-10-14 20:53:30,072 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:53:46,674 epoch 4 - iter 361/3617 - loss 0.05714291 - time (sec): 16.60 - samples/sec: 2213.44 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-14 20:54:02,975 epoch 4 - iter 722/3617 - loss 0.06279731 - time (sec): 32.90 - samples/sec: 2283.66 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-14 20:54:19,182 epoch 4 - iter 1083/3617 - loss 0.06156439 - time (sec): 49.11 - samples/sec: 2298.82 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-14 20:54:35,242 epoch 4 - iter 1444/3617 - loss 0.06231476 - time (sec): 65.17 - samples/sec: 2310.81 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-14 20:54:51,551 epoch 4 - iter 1805/3617 - loss 0.06294749 - time (sec): 81.48 - samples/sec: 2330.34 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-14 20:55:07,654 epoch 4 - iter 2166/3617 - loss 0.06283706 - time (sec): 97.58 - samples/sec: 2342.77 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-14 20:55:23,712 epoch 4 - iter 2527/3617 - loss 0.06347340 - time (sec): 113.64 - samples/sec: 2342.74 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-14 20:55:39,816 epoch 4 - iter 2888/3617 - loss 0.06460083 - time (sec): 129.74 - samples/sec: 2348.91 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-14 20:55:56,184 epoch 4 - iter 3249/3617 - loss 0.06589441 - time (sec): 146.11 - samples/sec: 2343.03 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-14 20:56:12,534 epoch 4 - iter 3610/3617 - loss 0.06532681 - time (sec): 162.46 - samples/sec: 2333.94 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-14 20:56:12,840 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:56:12,840 EPOCH 4 done: loss 0.0652 - lr: 0.000033 |
|
2023-10-14 20:56:20,148 DEV : loss 0.2746644914150238 - f1-score (micro avg) 0.6113 |
|
2023-10-14 20:56:20,191 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:56:37,963 epoch 5 - iter 361/3617 - loss 0.05107162 - time (sec): 17.77 - samples/sec: 2078.16 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-14 20:56:55,752 epoch 5 - iter 722/3617 - loss 0.04825107 - time (sec): 35.56 - samples/sec: 2145.48 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 20:57:12,168 epoch 5 - iter 1083/3617 - loss 0.04805194 - time (sec): 51.97 - samples/sec: 2198.24 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 20:57:28,412 epoch 5 - iter 1444/3617 - loss 0.04873195 - time (sec): 68.22 - samples/sec: 2220.55 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-14 20:57:44,708 epoch 5 - iter 1805/3617 - loss 0.04639778 - time (sec): 84.52 - samples/sec: 2230.82 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-14 20:58:01,016 epoch 5 - iter 2166/3617 - loss 0.04761930 - time (sec): 100.82 - samples/sec: 2265.67 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 20:58:17,191 epoch 5 - iter 2527/3617 - loss 0.04831656 - time (sec): 117.00 - samples/sec: 2277.51 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 20:58:33,296 epoch 5 - iter 2888/3617 - loss 0.04792017 - time (sec): 133.10 - samples/sec: 2288.04 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 20:58:49,211 epoch 5 - iter 3249/3617 - loss 0.04879276 - time (sec): 149.02 - samples/sec: 2294.34 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 20:59:05,207 epoch 5 - iter 3610/3617 - loss 0.04751972 - time (sec): 165.01 - samples/sec: 2297.36 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 20:59:05,509 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:59:05,509 EPOCH 5 done: loss 0.0475 - lr: 0.000028 |
|
2023-10-14 20:59:11,197 DEV : loss 0.2522399425506592 - f1-score (micro avg) 0.615 |
|
2023-10-14 20:59:11,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 20:59:27,227 epoch 6 - iter 361/3617 - loss 0.04060810 - time (sec): 15.99 - samples/sec: 2417.59 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:59:43,109 epoch 6 - iter 722/3617 - loss 0.03470326 - time (sec): 31.87 - samples/sec: 2383.55 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 20:59:59,354 epoch 6 - iter 1083/3617 - loss 0.03448545 - time (sec): 48.12 - samples/sec: 2373.29 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 21:00:15,500 epoch 6 - iter 1444/3617 - loss 0.03572893 - time (sec): 64.26 - samples/sec: 2345.33 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 21:00:32,080 epoch 6 - iter 1805/3617 - loss 0.03790498 - time (sec): 80.84 - samples/sec: 2323.64 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 21:00:48,399 epoch 6 - iter 2166/3617 - loss 0.03756533 - time (sec): 97.16 - samples/sec: 2318.62 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 21:01:04,962 epoch 6 - iter 2527/3617 - loss 0.03637689 - time (sec): 113.73 - samples/sec: 2311.06 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 21:01:21,260 epoch 6 - iter 2888/3617 - loss 0.03628099 - time (sec): 130.02 - samples/sec: 2318.72 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 21:01:39,237 epoch 6 - iter 3249/3617 - loss 0.03563678 - time (sec): 148.00 - samples/sec: 2296.92 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 21:01:55,520 epoch 6 - iter 3610/3617 - loss 0.03644492 - time (sec): 164.28 - samples/sec: 2308.78 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 21:01:55,832 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:01:55,832 EPOCH 6 done: loss 0.0364 - lr: 0.000022 |
|
2023-10-14 21:02:01,379 DEV : loss 0.29494383931159973 - f1-score (micro avg) 0.6292 |
|
2023-10-14 21:02:01,413 saving best model |
|
2023-10-14 21:02:01,869 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:02:18,077 epoch 7 - iter 361/3617 - loss 0.03617598 - time (sec): 16.20 - samples/sec: 2387.11 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 21:02:34,376 epoch 7 - iter 722/3617 - loss 0.03091675 - time (sec): 32.50 - samples/sec: 2354.05 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 21:02:50,448 epoch 7 - iter 1083/3617 - loss 0.02948926 - time (sec): 48.58 - samples/sec: 2350.56 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 21:03:07,558 epoch 7 - iter 1444/3617 - loss 0.02797629 - time (sec): 65.69 - samples/sec: 2320.25 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 21:03:24,994 epoch 7 - iter 1805/3617 - loss 0.02853652 - time (sec): 83.12 - samples/sec: 2290.94 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 21:03:41,182 epoch 7 - iter 2166/3617 - loss 0.02802450 - time (sec): 99.31 - samples/sec: 2291.42 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 21:03:57,540 epoch 7 - iter 2527/3617 - loss 0.02675150 - time (sec): 115.67 - samples/sec: 2291.73 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 21:04:14,018 epoch 7 - iter 2888/3617 - loss 0.02637225 - time (sec): 132.15 - samples/sec: 2306.72 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 21:04:30,290 epoch 7 - iter 3249/3617 - loss 0.02610657 - time (sec): 148.42 - samples/sec: 2304.06 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 21:04:46,584 epoch 7 - iter 3610/3617 - loss 0.02552403 - time (sec): 164.71 - samples/sec: 2303.82 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 21:04:46,886 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:04:46,886 EPOCH 7 done: loss 0.0255 - lr: 0.000017 |
|
2023-10-14 21:04:53,190 DEV : loss 0.3446139395236969 - f1-score (micro avg) 0.615 |
|
2023-10-14 21:04:53,220 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:05:09,555 epoch 8 - iter 361/3617 - loss 0.01650971 - time (sec): 16.33 - samples/sec: 2325.96 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 21:05:26,581 epoch 8 - iter 722/3617 - loss 0.01768386 - time (sec): 33.36 - samples/sec: 2297.34 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 21:05:42,941 epoch 8 - iter 1083/3617 - loss 0.01765876 - time (sec): 49.72 - samples/sec: 2279.93 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 21:05:59,688 epoch 8 - iter 1444/3617 - loss 0.01706874 - time (sec): 66.47 - samples/sec: 2283.33 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 21:06:15,920 epoch 8 - iter 1805/3617 - loss 0.01715211 - time (sec): 82.70 - samples/sec: 2302.95 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 21:06:32,165 epoch 8 - iter 2166/3617 - loss 0.01796261 - time (sec): 98.94 - samples/sec: 2304.19 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 21:06:48,452 epoch 8 - iter 2527/3617 - loss 0.01775176 - time (sec): 115.23 - samples/sec: 2306.13 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 21:07:05,449 epoch 8 - iter 2888/3617 - loss 0.01711009 - time (sec): 132.23 - samples/sec: 2300.93 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 21:07:22,138 epoch 8 - iter 3249/3617 - loss 0.01724597 - time (sec): 148.92 - samples/sec: 2294.00 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 21:07:38,424 epoch 8 - iter 3610/3617 - loss 0.01721940 - time (sec): 165.20 - samples/sec: 2296.29 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 21:07:38,727 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:07:38,727 EPOCH 8 done: loss 0.0172 - lr: 0.000011 |
|
2023-10-14 21:07:46,067 DEV : loss 0.35566461086273193 - f1-score (micro avg) 0.6203 |
|
2023-10-14 21:07:46,100 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:08:02,982 epoch 9 - iter 361/3617 - loss 0.01334246 - time (sec): 16.88 - samples/sec: 2253.22 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 21:08:19,223 epoch 9 - iter 722/3617 - loss 0.01262507 - time (sec): 33.12 - samples/sec: 2305.55 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 21:08:35,886 epoch 9 - iter 1083/3617 - loss 0.01123698 - time (sec): 49.78 - samples/sec: 2327.81 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 21:08:52,230 epoch 9 - iter 1444/3617 - loss 0.01016665 - time (sec): 66.13 - samples/sec: 2311.20 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 21:09:08,463 epoch 9 - iter 1805/3617 - loss 0.01156230 - time (sec): 82.36 - samples/sec: 2304.33 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 21:09:24,577 epoch 9 - iter 2166/3617 - loss 0.01168601 - time (sec): 98.48 - samples/sec: 2306.12 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 21:09:40,951 epoch 9 - iter 2527/3617 - loss 0.01191755 - time (sec): 114.85 - samples/sec: 2306.41 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 21:09:57,560 epoch 9 - iter 2888/3617 - loss 0.01167735 - time (sec): 131.46 - samples/sec: 2303.86 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 21:10:14,092 epoch 9 - iter 3249/3617 - loss 0.01159086 - time (sec): 147.99 - samples/sec: 2304.37 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 21:10:30,396 epoch 9 - iter 3610/3617 - loss 0.01173161 - time (sec): 164.29 - samples/sec: 2308.88 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 21:10:30,693 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:10:30,693 EPOCH 9 done: loss 0.0117 - lr: 0.000006 |
|
2023-10-14 21:10:37,244 DEV : loss 0.36872944235801697 - f1-score (micro avg) 0.6175 |
|
2023-10-14 21:10:37,276 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:10:54,918 epoch 10 - iter 361/3617 - loss 0.00910245 - time (sec): 17.64 - samples/sec: 2148.65 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 21:11:11,348 epoch 10 - iter 722/3617 - loss 0.00776251 - time (sec): 34.07 - samples/sec: 2213.30 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 21:11:28,723 epoch 10 - iter 1083/3617 - loss 0.00798966 - time (sec): 51.45 - samples/sec: 2214.46 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 21:11:45,035 epoch 10 - iter 1444/3617 - loss 0.00742183 - time (sec): 67.76 - samples/sec: 2228.67 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 21:12:01,459 epoch 10 - iter 1805/3617 - loss 0.00832240 - time (sec): 84.18 - samples/sec: 2252.30 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 21:12:17,443 epoch 10 - iter 2166/3617 - loss 0.00906769 - time (sec): 100.17 - samples/sec: 2258.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 21:12:33,753 epoch 10 - iter 2527/3617 - loss 0.00897717 - time (sec): 116.48 - samples/sec: 2266.17 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 21:12:50,112 epoch 10 - iter 2888/3617 - loss 0.00846428 - time (sec): 132.83 - samples/sec: 2283.85 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 21:13:06,556 epoch 10 - iter 3249/3617 - loss 0.00812779 - time (sec): 149.28 - samples/sec: 2286.02 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 21:13:22,906 epoch 10 - iter 3610/3617 - loss 0.00783079 - time (sec): 165.63 - samples/sec: 2289.65 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 21:13:23,217 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:13:23,218 EPOCH 10 done: loss 0.0078 - lr: 0.000000 |
|
2023-10-14 21:13:30,387 DEV : loss 0.3962736129760742 - f1-score (micro avg) 0.628 |
|
2023-10-14 21:13:30,829 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 21:13:30,830 Loading model from best epoch ... |
|
2023-10-14 21:13:32,581 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org |
|
2023-10-14 21:13:40,099 |
|
Results: |
|
- F-score (micro) 0.6398 |
|
- F-score (macro) 0.4724 |
|
- Accuracy 0.4905 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|

         loc     0.6204    0.7631    0.6844       591 |
|
        pers     0.5745    0.7451    0.6488       357 |
|
         org     0.1250    0.0633    0.0840        79 |
|
|

   micro avg     0.5870    0.7030    0.6398      1027 |
|
   macro avg     0.4400    0.5238    0.4724      1027 |
|
weighted avg     0.5663    0.7030    0.6258      1027 |
|
|
|
2023-10-14 21:13:40,099 ---------------------------------------------------------------------------------------------------- |
|
|