|
2023-10-16 18:54:34,035 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,035 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 Train: 1166 sentences |
|
2023-10-16 18:54:34,036 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 Training Params: |
|
2023-10-16 18:54:34,036 - learning_rate: "3e-05" |
|
2023-10-16 18:54:34,036 - mini_batch_size: "8" |
|
2023-10-16 18:54:34,036 - max_epochs: "10" |
|
2023-10-16 18:54:34,036 - shuffle: "True" |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 Plugins: |
|
2023-10-16 18:54:34,036 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 18:54:34,036 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 Computation: |
|
2023-10-16 18:54:34,036 - compute on device: cuda:0 |
|
2023-10-16 18:54:34,036 - embedding storage: none |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:34,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:35,681 epoch 1 - iter 14/146 - loss 2.89280552 - time (sec): 1.64 - samples/sec: 2610.50 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:54:37,076 epoch 1 - iter 28/146 - loss 2.73874156 - time (sec): 3.04 - samples/sec: 2888.75 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:54:38,584 epoch 1 - iter 42/146 - loss 2.29234143 - time (sec): 4.55 - samples/sec: 2965.93 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:54:39,920 epoch 1 - iter 56/146 - loss 1.88317374 - time (sec): 5.88 - samples/sec: 3007.81 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:54:41,047 epoch 1 - iter 70/146 - loss 1.66977161 - time (sec): 7.01 - samples/sec: 3025.32 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:54:42,406 epoch 1 - iter 84/146 - loss 1.48284175 - time (sec): 8.37 - samples/sec: 3068.98 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:54:43,755 epoch 1 - iter 98/146 - loss 1.33440979 - time (sec): 9.72 - samples/sec: 3101.76 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:54:45,240 epoch 1 - iter 112/146 - loss 1.22069852 - time (sec): 11.20 - samples/sec: 3069.37 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:54:46,565 epoch 1 - iter 126/146 - loss 1.12851814 - time (sec): 12.53 - samples/sec: 3056.37 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:54:47,875 epoch 1 - iter 140/146 - loss 1.04787068 - time (sec): 13.84 - samples/sec: 3059.53 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:54:48,571 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:48,571 EPOCH 1 done: loss 1.0198 - lr: 0.000029 |
|
2023-10-16 18:54:49,351 DEV : loss 0.236769899725914 - f1-score (micro avg) 0.4108 |
|
2023-10-16 18:54:49,355 saving best model |
|
2023-10-16 18:54:49,712 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:54:51,269 epoch 2 - iter 14/146 - loss 0.27287495 - time (sec): 1.56 - samples/sec: 3346.29 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:54:52,612 epoch 2 - iter 28/146 - loss 0.23811369 - time (sec): 2.90 - samples/sec: 3279.52 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:54:54,155 epoch 2 - iter 42/146 - loss 0.23885534 - time (sec): 4.44 - samples/sec: 3039.18 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:54:55,553 epoch 2 - iter 56/146 - loss 0.24920038 - time (sec): 5.84 - samples/sec: 3078.63 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:54:57,214 epoch 2 - iter 70/146 - loss 0.25812295 - time (sec): 7.50 - samples/sec: 3058.61 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:54:58,622 epoch 2 - iter 84/146 - loss 0.26043063 - time (sec): 8.91 - samples/sec: 3077.57 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:54:59,716 epoch 2 - iter 98/146 - loss 0.25029883 - time (sec): 10.00 - samples/sec: 3104.15 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:55:00,911 epoch 2 - iter 112/146 - loss 0.24933642 - time (sec): 11.20 - samples/sec: 3086.48 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:55:02,263 epoch 2 - iter 126/146 - loss 0.23808665 - time (sec): 12.55 - samples/sec: 3113.09 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:55:03,515 epoch 2 - iter 140/146 - loss 0.23137554 - time (sec): 13.80 - samples/sec: 3103.82 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:55:04,018 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:04,018 EPOCH 2 done: loss 0.2292 - lr: 0.000027 |
|
2023-10-16 18:55:05,412 DEV : loss 0.15563829243183136 - f1-score (micro avg) 0.5482 |
|
2023-10-16 18:55:05,416 saving best model |
|
2023-10-16 18:55:05,890 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:07,247 epoch 3 - iter 14/146 - loss 0.17712828 - time (sec): 1.36 - samples/sec: 2920.16 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:55:08,550 epoch 3 - iter 28/146 - loss 0.18540699 - time (sec): 2.66 - samples/sec: 3131.81 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:55:10,053 epoch 3 - iter 42/146 - loss 0.15430173 - time (sec): 4.16 - samples/sec: 3064.67 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:55:11,470 epoch 3 - iter 56/146 - loss 0.15241278 - time (sec): 5.58 - samples/sec: 3051.84 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:55:13,005 epoch 3 - iter 70/146 - loss 0.14300126 - time (sec): 7.11 - samples/sec: 3061.86 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:55:14,469 epoch 3 - iter 84/146 - loss 0.13630069 - time (sec): 8.58 - samples/sec: 3035.39 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:55:15,955 epoch 3 - iter 98/146 - loss 0.14274446 - time (sec): 10.06 - samples/sec: 2994.63 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:55:17,298 epoch 3 - iter 112/146 - loss 0.14023351 - time (sec): 11.41 - samples/sec: 2996.09 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:55:18,689 epoch 3 - iter 126/146 - loss 0.13527969 - time (sec): 12.80 - samples/sec: 3000.38 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:55:19,901 epoch 3 - iter 140/146 - loss 0.13010389 - time (sec): 14.01 - samples/sec: 3004.41 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:55:20,702 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:20,702 EPOCH 3 done: loss 0.1321 - lr: 0.000024 |
|
2023-10-16 18:55:21,960 DEV : loss 0.12500031292438507 - f1-score (micro avg) 0.6379 |
|
2023-10-16 18:55:21,965 saving best model |
|
2023-10-16 18:55:22,411 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:23,772 epoch 4 - iter 14/146 - loss 0.10475455 - time (sec): 1.36 - samples/sec: 3129.71 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:55:25,231 epoch 4 - iter 28/146 - loss 0.08547014 - time (sec): 2.82 - samples/sec: 2900.87 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:55:26,586 epoch 4 - iter 42/146 - loss 0.08510852 - time (sec): 4.17 - samples/sec: 2925.21 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:55:27,936 epoch 4 - iter 56/146 - loss 0.08462186 - time (sec): 5.52 - samples/sec: 2879.49 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:55:29,630 epoch 4 - iter 70/146 - loss 0.08581286 - time (sec): 7.22 - samples/sec: 2980.98 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:55:30,941 epoch 4 - iter 84/146 - loss 0.08601370 - time (sec): 8.53 - samples/sec: 2968.75 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:55:32,216 epoch 4 - iter 98/146 - loss 0.08409253 - time (sec): 9.80 - samples/sec: 2992.89 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:55:33,639 epoch 4 - iter 112/146 - loss 0.08729243 - time (sec): 11.23 - samples/sec: 3004.44 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:55:35,173 epoch 4 - iter 126/146 - loss 0.08748336 - time (sec): 12.76 - samples/sec: 2998.96 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:55:36,841 epoch 4 - iter 140/146 - loss 0.08579317 - time (sec): 14.43 - samples/sec: 2973.70 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:55:37,315 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:37,315 EPOCH 4 done: loss 0.0855 - lr: 0.000020 |
|
2023-10-16 18:55:38,531 DEV : loss 0.1333526223897934 - f1-score (micro avg) 0.6763 |
|
2023-10-16 18:55:38,535 saving best model |
|
2023-10-16 18:55:38,990 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:40,503 epoch 5 - iter 14/146 - loss 0.06206087 - time (sec): 1.51 - samples/sec: 2656.04 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:55:41,840 epoch 5 - iter 28/146 - loss 0.05012670 - time (sec): 2.85 - samples/sec: 2861.11 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 18:55:43,584 epoch 5 - iter 42/146 - loss 0.06215053 - time (sec): 4.59 - samples/sec: 2766.92 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 18:55:44,940 epoch 5 - iter 56/146 - loss 0.05842795 - time (sec): 5.95 - samples/sec: 2912.88 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 18:55:46,219 epoch 5 - iter 70/146 - loss 0.06284010 - time (sec): 7.23 - samples/sec: 2977.43 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:55:47,319 epoch 5 - iter 84/146 - loss 0.06350029 - time (sec): 8.33 - samples/sec: 3032.44 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:55:48,804 epoch 5 - iter 98/146 - loss 0.06178057 - time (sec): 9.81 - samples/sec: 3048.15 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:55:50,202 epoch 5 - iter 112/146 - loss 0.05904463 - time (sec): 11.21 - samples/sec: 3045.78 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:55:51,581 epoch 5 - iter 126/146 - loss 0.05893695 - time (sec): 12.59 - samples/sec: 3040.24 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:55:53,253 epoch 5 - iter 140/146 - loss 0.05809481 - time (sec): 14.26 - samples/sec: 3008.75 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:55:53,721 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:53,721 EPOCH 5 done: loss 0.0587 - lr: 0.000017 |
|
2023-10-16 18:55:55,378 DEV : loss 0.12143861502408981 - f1-score (micro avg) 0.7161 |
|
2023-10-16 18:55:55,385 saving best model |
|
2023-10-16 18:55:55,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:55:57,461 epoch 6 - iter 14/146 - loss 0.03607615 - time (sec): 1.53 - samples/sec: 2876.46 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:55:58,944 epoch 6 - iter 28/146 - loss 0.04013210 - time (sec): 3.01 - samples/sec: 2893.84 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:56:00,407 epoch 6 - iter 42/146 - loss 0.03731182 - time (sec): 4.47 - samples/sec: 2934.51 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:56:01,782 epoch 6 - iter 56/146 - loss 0.03654630 - time (sec): 5.85 - samples/sec: 2938.20 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:56:03,614 epoch 6 - iter 70/146 - loss 0.03587700 - time (sec): 7.68 - samples/sec: 2902.37 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:56:05,056 epoch 6 - iter 84/146 - loss 0.03669627 - time (sec): 9.12 - samples/sec: 2903.38 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:56:06,226 epoch 6 - iter 98/146 - loss 0.03589081 - time (sec): 10.29 - samples/sec: 2922.68 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:56:07,783 epoch 6 - iter 112/146 - loss 0.03981007 - time (sec): 11.85 - samples/sec: 2936.13 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:56:08,944 epoch 6 - iter 126/146 - loss 0.03963558 - time (sec): 13.01 - samples/sec: 2928.91 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:56:10,418 epoch 6 - iter 140/146 - loss 0.04046337 - time (sec): 14.48 - samples/sec: 2941.68 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:56:11,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:56:11,045 EPOCH 6 done: loss 0.0402 - lr: 0.000014 |
|
2023-10-16 18:56:12,269 DEV : loss 0.1350891888141632 - f1-score (micro avg) 0.7328 |
|
2023-10-16 18:56:12,274 saving best model |
|
2023-10-16 18:56:12,725 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:56:14,584 epoch 7 - iter 14/146 - loss 0.04059843 - time (sec): 1.85 - samples/sec: 3017.29 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:56:15,941 epoch 7 - iter 28/146 - loss 0.03439852 - time (sec): 3.21 - samples/sec: 3017.46 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:56:17,252 epoch 7 - iter 42/146 - loss 0.03464459 - time (sec): 4.52 - samples/sec: 3035.40 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:56:18,668 epoch 7 - iter 56/146 - loss 0.03616284 - time (sec): 5.94 - samples/sec: 3026.48 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:56:20,057 epoch 7 - iter 70/146 - loss 0.03702150 - time (sec): 7.33 - samples/sec: 2918.68 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:56:21,626 epoch 7 - iter 84/146 - loss 0.03595560 - time (sec): 8.90 - samples/sec: 2929.39 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:56:22,843 epoch 7 - iter 98/146 - loss 0.03386155 - time (sec): 10.11 - samples/sec: 2956.83 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:56:24,376 epoch 7 - iter 112/146 - loss 0.03576602 - time (sec): 11.65 - samples/sec: 2931.94 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:56:25,581 epoch 7 - iter 126/146 - loss 0.03449554 - time (sec): 12.85 - samples/sec: 2986.90 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:56:27,198 epoch 7 - iter 140/146 - loss 0.03287051 - time (sec): 14.47 - samples/sec: 2963.53 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:56:27,799 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:56:27,799 EPOCH 7 done: loss 0.0323 - lr: 0.000010 |
|
2023-10-16 18:56:29,039 DEV : loss 0.1452094465494156 - f1-score (micro avg) 0.7069 |
|
2023-10-16 18:56:29,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:56:30,420 epoch 8 - iter 14/146 - loss 0.03251634 - time (sec): 1.37 - samples/sec: 3243.02 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:56:31,782 epoch 8 - iter 28/146 - loss 0.02633798 - time (sec): 2.74 - samples/sec: 3119.42 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:56:33,117 epoch 8 - iter 42/146 - loss 0.02260259 - time (sec): 4.07 - samples/sec: 2996.68 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:56:34,622 epoch 8 - iter 56/146 - loss 0.02145170 - time (sec): 5.58 - samples/sec: 3035.64 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:56:36,008 epoch 8 - iter 70/146 - loss 0.02157167 - time (sec): 6.96 - samples/sec: 2995.86 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:56:37,531 epoch 8 - iter 84/146 - loss 0.02074857 - time (sec): 8.49 - samples/sec: 3027.40 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:56:39,133 epoch 8 - iter 98/146 - loss 0.02173471 - time (sec): 10.09 - samples/sec: 2933.21 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:56:40,348 epoch 8 - iter 112/146 - loss 0.02670667 - time (sec): 11.30 - samples/sec: 2958.94 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:56:42,102 epoch 8 - iter 126/146 - loss 0.02625262 - time (sec): 13.06 - samples/sec: 2921.14 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:56:43,519 epoch 8 - iter 140/146 - loss 0.02732306 - time (sec): 14.47 - samples/sec: 2943.76 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:56:44,116 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:56:44,116 EPOCH 8 done: loss 0.0267 - lr: 0.000007 |
|
2023-10-16 18:56:45,516 DEV : loss 0.15337024629116058 - f1-score (micro avg) 0.7076 |
|
2023-10-16 18:56:45,521 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:56:47,100 epoch 9 - iter 14/146 - loss 0.00916726 - time (sec): 1.58 - samples/sec: 3041.55 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:56:48,428 epoch 9 - iter 28/146 - loss 0.00863155 - time (sec): 2.91 - samples/sec: 3022.22 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:56:49,833 epoch 9 - iter 42/146 - loss 0.01507339 - time (sec): 4.31 - samples/sec: 3022.94 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:56:51,260 epoch 9 - iter 56/146 - loss 0.01738903 - time (sec): 5.74 - samples/sec: 3053.70 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:56:52,983 epoch 9 - iter 70/146 - loss 0.01812648 - time (sec): 7.46 - samples/sec: 3057.65 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:56:54,276 epoch 9 - iter 84/146 - loss 0.01855582 - time (sec): 8.75 - samples/sec: 3027.00 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:56:55,586 epoch 9 - iter 98/146 - loss 0.01935023 - time (sec): 10.06 - samples/sec: 2994.77 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:56:57,016 epoch 9 - iter 112/146 - loss 0.01757770 - time (sec): 11.49 - samples/sec: 3000.50 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 18:56:58,360 epoch 9 - iter 126/146 - loss 0.01929319 - time (sec): 12.84 - samples/sec: 3006.26 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 18:56:59,890 epoch 9 - iter 140/146 - loss 0.02150382 - time (sec): 14.37 - samples/sec: 2987.86 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 18:57:00,424 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:57:00,424 EPOCH 9 done: loss 0.0221 - lr: 0.000004 |
|
2023-10-16 18:57:01,640 DEV : loss 0.1633269190788269 - f1-score (micro avg) 0.7034 |
|
2023-10-16 18:57:01,644 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:57:03,014 epoch 10 - iter 14/146 - loss 0.00942726 - time (sec): 1.37 - samples/sec: 2886.55 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:57:04,499 epoch 10 - iter 28/146 - loss 0.01186661 - time (sec): 2.85 - samples/sec: 2953.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:57:06,043 epoch 10 - iter 42/146 - loss 0.01639072 - time (sec): 4.40 - samples/sec: 2945.84 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:57:07,550 epoch 10 - iter 56/146 - loss 0.01613294 - time (sec): 5.91 - samples/sec: 3068.01 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:57:09,143 epoch 10 - iter 70/146 - loss 0.01704983 - time (sec): 7.50 - samples/sec: 3022.87 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:57:10,531 epoch 10 - iter 84/146 - loss 0.01632056 - time (sec): 8.89 - samples/sec: 3029.73 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:57:11,894 epoch 10 - iter 98/146 - loss 0.01654833 - time (sec): 10.25 - samples/sec: 2976.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:57:13,276 epoch 10 - iter 112/146 - loss 0.01649142 - time (sec): 11.63 - samples/sec: 2988.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:57:14,483 epoch 10 - iter 126/146 - loss 0.01757757 - time (sec): 12.84 - samples/sec: 3008.33 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:57:15,888 epoch 10 - iter 140/146 - loss 0.01985176 - time (sec): 14.24 - samples/sec: 3007.60 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 18:57:16,400 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:57:16,400 EPOCH 10 done: loss 0.0195 - lr: 0.000000 |
|
2023-10-16 18:57:17,645 DEV : loss 0.16719266772270203 - f1-score (micro avg) 0.7106 |
|
2023-10-16 18:57:18,010 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:57:18,011 Loading model from best epoch ... |
|
2023-10-16 18:57:19,453 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-16 18:57:21,781 |
|
Results: |
|
- F-score (micro) 0.751 |
|
- F-score (macro) 0.685 |
|
- Accuracy 0.6206 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.8106    0.8362    0.8232       348 |
|
         LOC     0.6495    0.8238    0.7264       261 |
|
         ORG     0.3774    0.3846    0.3810        52 |
|
   HumanProd     0.8500    0.7727    0.8095        22 |
|
|
|
   micro avg     0.7117    0.7950    0.7510       683 |
|
   macro avg     0.6719    0.7043    0.6850       683 |
|
weighted avg     0.7173    0.7950    0.7521       683 |
|
|
|
2023-10-16 18:57:21,781 ---------------------------------------------------------------------------------------------------- |
|
|