2023-10-25 21:15:35,815 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,816 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:15:35,816 ----------------------------------------------------------------------------------------------------
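The printout above is Flair's standard transformer fine-tuning architecture: a fine-tuned BERT encoder (the dbmdz historic multilingual 64k checkpoint), locked dropout, and a single linear projection to the 17 NER tags trained with cross-entropy loss (no CRF, no RNN). A minimal sketch of how such a tagger is typically assembled; the checkpoint name and settings are read off this log and the run name further down, label_dict comes from the corpus sketch after the corpus block, and the rest is an assumption rather than the exact script used for this run:

    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # Checkpoint, layer and pooling choices are taken from the run name in the
    # "Model training base path" line below; treat the rest as assumptions.
    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-64k-td-cased",
        layers="-1",               # last transformer layer only ("layers-1")
        subtoken_pooling="first",  # "poolingfirst" in the run name
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,            # unused with use_rnn=False, kept for the signature
        embeddings=embeddings,
        tag_dictionary=label_dict,  # 17-tag dictionary built from the corpus (see the sketch below)
        tag_type="ner",
        use_crf=False,              # "crfFalse" in the run name
        use_rnn=False,
        reproject_embeddings=False,
    )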
2023-10-25 21:15:35,816 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:15:35,816 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,816 Train: 1166 sentences
2023-10-25 21:15:35,816 (train_with_dev=False, train_with_test=False)
2023-10-25 21:15:35,816 ----------------------------------------------------------------------------------------------------
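The data is the Finnish NewsEye subset of HIPE-2022, which ships with Flair. A hedged sketch of how the corpus and the label dictionary used above could be loaded; the argument names (dataset_name, language, add_document_separator) follow the flair.datasets API as I understand it and are assumptions, not values taken from this log:

    from flair.datasets import NER_HIPE_2022

    # Finnish NewsEye subset, matching the dataset path logged above
    # (.../ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator).
    corpus = NER_HIPE_2022(
        dataset_name="newseye",
        language="fi",
        add_document_separator=True,  # assumption, implied by the "with_doc_seperator" folder
    )

    # 1166 train + 165 dev + 415 test sentences in this run.
    label_dict = corpus.make_label_dictionary(label_type="ner")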
2023-10-25 21:15:35,816 Training Params:
2023-10-25 21:15:35,816 - learning_rate: "3e-05"
2023-10-25 21:15:35,817 - mini_batch_size: "8"
2023-10-25 21:15:35,817 - max_epochs: "10"
2023-10-25 21:15:35,817 - shuffle: "True"
2023-10-25 21:15:35,817 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,817 Plugins:
2023-10-25 21:15:35,817 - TensorboardLogger
2023-10-25 21:15:35,817 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:15:35,817 ----------------------------------------------------------------------------------------------------
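These parameters match Flair's fine-tuning trainer, which defaults to AdamW and a linear learning-rate schedule with warmup: the per-iteration lr values logged below rise over roughly the first 10% of steps and then decay to zero. A hedged sketch of the training call with the logged hyperparameters; tagger and corpus come from the sketches above, and the TensorBoard plugin wiring is omitted because the exact setup is not shown in this log:

    from flair.trainers import ModelTrainer

    trainer = ModelTrainer(tagger, corpus)

    trainer.fine_tune(
        "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
        learning_rate=3e-5,
        mini_batch_size=8,
        max_epochs=10,
        shuffle=True,
        # fine_tune() already applies a linear scheduler with a 0.1 warmup
        # fraction by default; a TensorBoard logger would be attached via the
        # trainer's plugin mechanism (not reproduced here).
    )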
2023-10-25 21:15:35,817 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:15:35,817 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:15:35,817 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,817 Computation:
2023-10-25 21:15:35,817 - compute on device: cuda:0
2023-10-25 21:15:35,817 - embedding storage: none
2023-10-25 21:15:35,817 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,817 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 21:15:35,817 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,817 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:35,817 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:15:36,790 epoch 1 - iter 14/146 - loss 3.31880862 - time (sec): 0.97 - samples/sec: 4680.15 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:15:37,663 epoch 1 - iter 28/146 - loss 2.83940349 - time (sec): 1.85 - samples/sec: 4665.27 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:15:38,450 epoch 1 - iter 42/146 - loss 2.37233835 - time (sec): 2.63 - samples/sec: 4704.10 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:15:39,289 epoch 1 - iter 56/146 - loss 1.92518471 - time (sec): 3.47 - samples/sec: 4831.73 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:15:40,152 epoch 1 - iter 70/146 - loss 1.66955582 - time (sec): 4.33 - samples/sec: 4747.69 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:15:41,232 epoch 1 - iter 84/146 - loss 1.44430740 - time (sec): 5.41 - samples/sec: 4675.85 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:15:42,100 epoch 1 - iter 98/146 - loss 1.29525834 - time (sec): 6.28 - samples/sec: 4722.35 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:15:42,936 epoch 1 - iter 112/146 - loss 1.18172702 - time (sec): 7.12 - samples/sec: 4692.17 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:15:43,881 epoch 1 - iter 126/146 - loss 1.05862291 - time (sec): 8.06 - samples/sec: 4750.93 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:15:44,791 epoch 1 - iter 140/146 - loss 0.97914154 - time (sec): 8.97 - samples/sec: 4704.19 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:15:45,285 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:45,285 EPOCH 1 done: loss 0.9399 - lr: 0.000029
2023-10-25 21:15:45,797 DEV : loss 0.17592628300189972 - f1-score (micro avg) 0.5057
2023-10-25 21:15:45,804 saving best model
2023-10-25 21:15:46,340 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:47,257 epoch 2 - iter 14/146 - loss 0.27459429 - time (sec): 0.92 - samples/sec: 4613.22 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:15:48,209 epoch 2 - iter 28/146 - loss 0.20873210 - time (sec): 1.87 - samples/sec: 4769.99 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:15:49,000 epoch 2 - iter 42/146 - loss 0.19579956 - time (sec): 2.66 - samples/sec: 4908.76 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:15:49,915 epoch 2 - iter 56/146 - loss 0.20388411 - time (sec): 3.57 - samples/sec: 4896.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:15:50,857 epoch 2 - iter 70/146 - loss 0.19597627 - time (sec): 4.52 - samples/sec: 4725.59 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:15:51,643 epoch 2 - iter 84/146 - loss 0.19552040 - time (sec): 5.30 - samples/sec: 4695.35 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:15:52,486 epoch 2 - iter 98/146 - loss 0.19191093 - time (sec): 6.14 - samples/sec: 4722.30 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:15:53,480 epoch 2 - iter 112/146 - loss 0.19199910 - time (sec): 7.14 - samples/sec: 4693.68 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:15:54,441 epoch 2 - iter 126/146 - loss 0.18518667 - time (sec): 8.10 - samples/sec: 4712.56 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:15:55,342 epoch 2 - iter 140/146 - loss 0.18152334 - time (sec): 9.00 - samples/sec: 4749.87 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:15:55,719 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:55,719 EPOCH 2 done: loss 0.1802 - lr: 0.000027
2023-10-25 21:15:56,802 DEV : loss 0.1134551540017128 - f1-score (micro avg) 0.6803
2023-10-25 21:15:56,807 saving best model
2023-10-25 21:15:57,478 ----------------------------------------------------------------------------------------------------
2023-10-25 21:15:58,384 epoch 3 - iter 14/146 - loss 0.09704610 - time (sec): 0.90 - samples/sec: 5208.79 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:15:59,243 epoch 3 - iter 28/146 - loss 0.10318376 - time (sec): 1.76 - samples/sec: 4982.24 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:16:00,029 epoch 3 - iter 42/146 - loss 0.10850531 - time (sec): 2.55 - samples/sec: 4960.64 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:16:01,034 epoch 3 - iter 56/146 - loss 0.12139462 - time (sec): 3.55 - samples/sec: 4812.73 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:16:01,943 epoch 3 - iter 70/146 - loss 0.12819076 - time (sec): 4.46 - samples/sec: 4826.37 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:16:02,713 epoch 3 - iter 84/146 - loss 0.12323830 - time (sec): 5.23 - samples/sec: 4776.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:16:03,519 epoch 3 - iter 98/146 - loss 0.11656376 - time (sec): 6.04 - samples/sec: 4809.24 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:16:04,396 epoch 3 - iter 112/146 - loss 0.11841299 - time (sec): 6.92 - samples/sec: 4791.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:16:05,416 epoch 3 - iter 126/146 - loss 0.12592799 - time (sec): 7.94 - samples/sec: 4751.77 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:16:06,354 epoch 3 - iter 140/146 - loss 0.12063353 - time (sec): 8.87 - samples/sec: 4763.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:16:06,780 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:06,781 EPOCH 3 done: loss 0.1174 - lr: 0.000024
2023-10-25 21:16:07,701 DEV : loss 0.09478442370891571 - f1-score (micro avg) 0.7702
2023-10-25 21:16:07,706 saving best model
2023-10-25 21:16:08,223 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:09,122 epoch 4 - iter 14/146 - loss 0.05203944 - time (sec): 0.90 - samples/sec: 4657.75 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:16:10,078 epoch 4 - iter 28/146 - loss 0.05457589 - time (sec): 1.85 - samples/sec: 4904.91 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:16:11,077 epoch 4 - iter 42/146 - loss 0.06328594 - time (sec): 2.85 - samples/sec: 4862.79 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:16:11,928 epoch 4 - iter 56/146 - loss 0.05976392 - time (sec): 3.70 - samples/sec: 4901.62 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:16:12,800 epoch 4 - iter 70/146 - loss 0.06522269 - time (sec): 4.58 - samples/sec: 4859.69 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:16:13,587 epoch 4 - iter 84/146 - loss 0.07044788 - time (sec): 5.36 - samples/sec: 4829.01 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:16:14,389 epoch 4 - iter 98/146 - loss 0.06828926 - time (sec): 6.16 - samples/sec: 4845.43 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:16:15,306 epoch 4 - iter 112/146 - loss 0.06561645 - time (sec): 7.08 - samples/sec: 4845.63 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:16:16,164 epoch 4 - iter 126/146 - loss 0.06604040 - time (sec): 7.94 - samples/sec: 4842.00 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:16:17,121 epoch 4 - iter 140/146 - loss 0.06471436 - time (sec): 8.90 - samples/sec: 4828.05 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:16:17,538 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:17,539 EPOCH 4 done: loss 0.0647 - lr: 0.000020
2023-10-25 21:16:18,464 DEV : loss 0.10669375211000443 - f1-score (micro avg) 0.7394
2023-10-25 21:16:18,469 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:19,473 epoch 5 - iter 14/146 - loss 0.02805617 - time (sec): 1.00 - samples/sec: 4532.92 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:16:20,406 epoch 5 - iter 28/146 - loss 0.04003122 - time (sec): 1.94 - samples/sec: 4597.96 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:16:21,326 epoch 5 - iter 42/146 - loss 0.03988139 - time (sec): 2.86 - samples/sec: 4730.62 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:16:22,332 epoch 5 - iter 56/146 - loss 0.03736783 - time (sec): 3.86 - samples/sec: 4693.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:16:23,266 epoch 5 - iter 70/146 - loss 0.03674800 - time (sec): 4.80 - samples/sec: 4753.66 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:16:24,118 epoch 5 - iter 84/146 - loss 0.03884696 - time (sec): 5.65 - samples/sec: 4730.77 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:16:24,936 epoch 5 - iter 98/146 - loss 0.03861859 - time (sec): 6.47 - samples/sec: 4694.59 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:16:25,745 epoch 5 - iter 112/146 - loss 0.04109054 - time (sec): 7.28 - samples/sec: 4749.93 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:16:26,576 epoch 5 - iter 126/146 - loss 0.04162815 - time (sec): 8.11 - samples/sec: 4742.72 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:16:27,416 epoch 5 - iter 140/146 - loss 0.04205143 - time (sec): 8.95 - samples/sec: 4769.31 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:16:27,784 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:27,784 EPOCH 5 done: loss 0.0419 - lr: 0.000017
2023-10-25 21:16:28,858 DEV : loss 0.10857772827148438 - f1-score (micro avg) 0.7352
2023-10-25 21:16:28,863 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:29,735 epoch 6 - iter 14/146 - loss 0.03746632 - time (sec): 0.87 - samples/sec: 4769.99 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:16:30,634 epoch 6 - iter 28/146 - loss 0.03397273 - time (sec): 1.77 - samples/sec: 4573.87 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:16:31,568 epoch 6 - iter 42/146 - loss 0.03073910 - time (sec): 2.70 - samples/sec: 4487.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:16:32,391 epoch 6 - iter 56/146 - loss 0.03333079 - time (sec): 3.53 - samples/sec: 4571.12 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:16:33,315 epoch 6 - iter 70/146 - loss 0.02997522 - time (sec): 4.45 - samples/sec: 4610.15 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:16:34,277 epoch 6 - iter 84/146 - loss 0.03191011 - time (sec): 5.41 - samples/sec: 4590.95 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:16:35,167 epoch 6 - iter 98/146 - loss 0.03171068 - time (sec): 6.30 - samples/sec: 4608.29 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:16:36,254 epoch 6 - iter 112/146 - loss 0.03140399 - time (sec): 7.39 - samples/sec: 4678.96 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:16:37,062 epoch 6 - iter 126/146 - loss 0.03108208 - time (sec): 8.20 - samples/sec: 4701.18 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:16:38,016 epoch 6 - iter 140/146 - loss 0.02980445 - time (sec): 9.15 - samples/sec: 4690.75 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:16:38,351 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:38,352 EPOCH 6 done: loss 0.0295 - lr: 0.000014
2023-10-25 21:16:39,271 DEV : loss 0.12395735830068588 - f1-score (micro avg) 0.738
2023-10-25 21:16:39,276 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:40,107 epoch 7 - iter 14/146 - loss 0.01140103 - time (sec): 0.83 - samples/sec: 4409.89 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:16:41,000 epoch 7 - iter 28/146 - loss 0.02379471 - time (sec): 1.72 - samples/sec: 4593.84 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:16:42,030 epoch 7 - iter 42/146 - loss 0.02046241 - time (sec): 2.75 - samples/sec: 4605.88 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:16:42,855 epoch 7 - iter 56/146 - loss 0.02186969 - time (sec): 3.58 - samples/sec: 4603.38 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:16:43,670 epoch 7 - iter 70/146 - loss 0.02110426 - time (sec): 4.39 - samples/sec: 4583.07 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:16:44,694 epoch 7 - iter 84/146 - loss 0.01970689 - time (sec): 5.42 - samples/sec: 4578.54 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:16:45,654 epoch 7 - iter 98/146 - loss 0.02011816 - time (sec): 6.38 - samples/sec: 4712.52 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:16:46,475 epoch 7 - iter 112/146 - loss 0.02094152 - time (sec): 7.20 - samples/sec: 4730.30 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:16:47,406 epoch 7 - iter 126/146 - loss 0.02223192 - time (sec): 8.13 - samples/sec: 4698.01 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:16:48,363 epoch 7 - iter 140/146 - loss 0.02211426 - time (sec): 9.09 - samples/sec: 4691.89 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:16:48,706 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:48,706 EPOCH 7 done: loss 0.0220 - lr: 0.000010
2023-10-25 21:16:49,627 DEV : loss 0.14144070446491241 - f1-score (micro avg) 0.7588
2023-10-25 21:16:49,632 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:50,520 epoch 8 - iter 14/146 - loss 0.01067314 - time (sec): 0.89 - samples/sec: 4530.35 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:16:51,508 epoch 8 - iter 28/146 - loss 0.01377843 - time (sec): 1.87 - samples/sec: 4914.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:16:52,324 epoch 8 - iter 42/146 - loss 0.01796832 - time (sec): 2.69 - samples/sec: 4805.15 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:16:53,172 epoch 8 - iter 56/146 - loss 0.01572929 - time (sec): 3.54 - samples/sec: 4887.26 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:16:54,011 epoch 8 - iter 70/146 - loss 0.01527731 - time (sec): 4.38 - samples/sec: 4881.07 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:16:55,023 epoch 8 - iter 84/146 - loss 0.01649619 - time (sec): 5.39 - samples/sec: 4840.89 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:16:55,942 epoch 8 - iter 98/146 - loss 0.01594214 - time (sec): 6.31 - samples/sec: 4775.98 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:16:56,808 epoch 8 - iter 112/146 - loss 0.01578526 - time (sec): 7.18 - samples/sec: 4732.45 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:16:57,713 epoch 8 - iter 126/146 - loss 0.01606300 - time (sec): 8.08 - samples/sec: 4749.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:16:58,790 epoch 8 - iter 140/146 - loss 0.01643747 - time (sec): 9.16 - samples/sec: 4721.15 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:16:59,104 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:59,104 EPOCH 8 done: loss 0.0161 - lr: 0.000007
2023-10-25 21:17:00,024 DEV : loss 0.15467968583106995 - f1-score (micro avg) 0.7424
2023-10-25 21:17:00,029 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:01,157 epoch 9 - iter 14/146 - loss 0.00288768 - time (sec): 1.13 - samples/sec: 3985.42 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:17:02,071 epoch 9 - iter 28/146 - loss 0.00859361 - time (sec): 2.04 - samples/sec: 4161.83 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:17:02,916 epoch 9 - iter 42/146 - loss 0.00973239 - time (sec): 2.89 - samples/sec: 4379.12 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:17:03,871 epoch 9 - iter 56/146 - loss 0.00984653 - time (sec): 3.84 - samples/sec: 4530.98 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:17:04,712 epoch 9 - iter 70/146 - loss 0.01184400 - time (sec): 4.68 - samples/sec: 4625.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:17:05,616 epoch 9 - iter 84/146 - loss 0.01206385 - time (sec): 5.59 - samples/sec: 4615.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:17:06,444 epoch 9 - iter 98/146 - loss 0.01143554 - time (sec): 6.41 - samples/sec: 4663.53 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:17:07,191 epoch 9 - iter 112/146 - loss 0.01143731 - time (sec): 7.16 - samples/sec: 4613.30 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:17:08,147 epoch 9 - iter 126/146 - loss 0.01086007 - time (sec): 8.12 - samples/sec: 4661.50 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:17:09,170 epoch 9 - iter 140/146 - loss 0.01234488 - time (sec): 9.14 - samples/sec: 4657.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:17:09,545 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:09,545 EPOCH 9 done: loss 0.0121 - lr: 0.000004
2023-10-25 21:17:10,469 DEV : loss 0.1692580282688141 - f1-score (micro avg) 0.7277
2023-10-25 21:17:10,474 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:11,401 epoch 10 - iter 14/146 - loss 0.01200424 - time (sec): 0.93 - samples/sec: 4743.15 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:17:12,233 epoch 10 - iter 28/146 - loss 0.01257013 - time (sec): 1.76 - samples/sec: 4679.71 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:17:13,044 epoch 10 - iter 42/146 - loss 0.01023577 - time (sec): 2.57 - samples/sec: 4679.81 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:17:14,009 epoch 10 - iter 56/146 - loss 0.01250361 - time (sec): 3.53 - samples/sec: 4676.31 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:17:14,859 epoch 10 - iter 70/146 - loss 0.01150593 - time (sec): 4.38 - samples/sec: 4809.32 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:17:15,772 epoch 10 - iter 84/146 - loss 0.01022687 - time (sec): 5.30 - samples/sec: 4821.53 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:17:16,892 epoch 10 - iter 98/146 - loss 0.01001899 - time (sec): 6.42 - samples/sec: 4800.88 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:17:17,707 epoch 10 - iter 112/146 - loss 0.01060283 - time (sec): 7.23 - samples/sec: 4758.86 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:17:18,601 epoch 10 - iter 126/146 - loss 0.01066012 - time (sec): 8.13 - samples/sec: 4714.79 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:17:19,602 epoch 10 - iter 140/146 - loss 0.01073663 - time (sec): 9.13 - samples/sec: 4665.36 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:17:19,923 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:19,923 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-25 21:17:20,844 DEV : loss 0.17254038155078888 - f1-score (micro avg) 0.7242
2023-10-25 21:17:21,379 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:21,381 Loading model from best epoch ...
2023-10-25 21:17:23,108 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:17:24,656
Results:
- F-score (micro) 0.7652
- F-score (macro) 0.6761
- Accuracy 0.6435

By class:
              precision    recall  f1-score   support

         PER     0.7813    0.8420    0.8105       348
         LOC     0.7536    0.7969    0.7747       261
         ORG     0.4222    0.3654    0.3918        52
   HumanProd     0.7273    0.7273    0.7273        22

   micro avg     0.7465    0.7848    0.7652       683
   macro avg     0.6711    0.6829    0.6761       683
weighted avg     0.7417    0.7848    0.7623       683

2023-10-25 21:17:24,656 ----------------------------------------------------------------------------------------------------
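For completeness, a short usage sketch for the saved checkpoint: the best model (epoch 3 by dev micro F1 in this run) is stored as best-model.pt under the training base path logged above, and the final test scores above come from evaluating it. The example sentence below is made up and only illustrates the prediction API:

    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
    )

    # Illustrative Finnish sentence, not taken from the corpus.
    sentence = Sentence("Helsingin Sanomat kertoi eilen uutisen Turusta.")
    tagger.predict(sentence)

    for entity in sentence.get_spans("ner"):
        print(entity.text, entity.get_label("ner").value, round(entity.get_label("ner").score, 3))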