|
2023-10-16 19:41:39,866 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,867 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 19:41:39,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,867 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-16 19:41:39,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,867 Train: 1085 sentences |
|
2023-10-16 19:41:39,867 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 19:41:39,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,867 Training Params: |
|
2023-10-16 19:41:39,867 - learning_rate: "3e-05" |
|
2023-10-16 19:41:39,867 - mini_batch_size: "4" |
|
2023-10-16 19:41:39,867 - max_epochs: "10" |
|
2023-10-16 19:41:39,867 - shuffle: "True" |
|
2023-10-16 19:41:39,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,867 Plugins: |
|
2023-10-16 19:41:39,867 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 19:41:39,867 ---------------------------------------------------------------------------------------------------- |
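The `lr` column in the iteration lines below is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch of that shape, assuming 272 iterations per epoch (from `iter 27/272`), 10 epochs, and a peak of 3e-05. The function name `linear_schedule_lr` is hypothetical, and this is not confirmed to be the scheduler's actual implementation.

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate after `step` optimizer steps (assumed shape:
    linear warmup to peak_lr, then linear decay to zero)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values read from the log; the step arithmetic itself is an assumption.
ITERS_PER_EPOCH = 272
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS

for epoch, iteration in [(1, 27), (1, 270), (2, 54), (5, 27), (10, 270)]:
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    lr = linear_schedule_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)
    print(f"epoch {epoch} - iter {iteration}/{ITERS_PER_EPOCH} - lr: {lr:.6f}")
```

Comparing the printed values with the `lr:` field of the matching log lines is a quick way to check whether a run used the expected warmup fraction and epoch count.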
|
2023-10-16 19:41:39,867 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 19:41:39,867 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 19:41:39,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,867 Computation: |
|
2023-10-16 19:41:39,867 - compute on device: cuda:0 |
|
2023-10-16 19:41:39,867 - embedding storage: none |
|
2023-10-16 19:41:39,868 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,868 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-16 19:41:39,868 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:39,868 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:41,668 epoch 1 - iter 27/272 - loss 3.04057792 - time (sec): 1.80 - samples/sec: 3488.66 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 19:41:43,128 epoch 1 - iter 54/272 - loss 2.72238985 - time (sec): 3.26 - samples/sec: 3362.51 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 19:41:44,623 epoch 1 - iter 81/272 - loss 2.12940151 - time (sec): 4.75 - samples/sec: 3384.70 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:41:46,156 epoch 1 - iter 108/272 - loss 1.75711148 - time (sec): 6.29 - samples/sec: 3316.36 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 19:41:47,862 epoch 1 - iter 135/272 - loss 1.46487207 - time (sec): 7.99 - samples/sec: 3304.13 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 19:41:49,429 epoch 1 - iter 162/272 - loss 1.28960596 - time (sec): 9.56 - samples/sec: 3305.75 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 19:41:51,075 epoch 1 - iter 189/272 - loss 1.13468045 - time (sec): 11.21 - samples/sec: 3345.50 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 19:41:52,592 epoch 1 - iter 216/272 - loss 1.04586991 - time (sec): 12.72 - samples/sec: 3294.89 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:41:54,223 epoch 1 - iter 243/272 - loss 0.96986172 - time (sec): 14.35 - samples/sec: 3259.73 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 19:41:55,991 epoch 1 - iter 270/272 - loss 0.90362345 - time (sec): 16.12 - samples/sec: 3202.49 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 19:41:56,121 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:56,121 EPOCH 1 done: loss 0.9000 - lr: 0.000030 |
|
2023-10-16 19:41:57,268 DEV : loss 0.17124275863170624 - f1-score (micro avg) 0.6407 |
|
2023-10-16 19:41:57,273 saving best model |
|
2023-10-16 19:41:57,710 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:41:59,416 epoch 2 - iter 27/272 - loss 0.16607614 - time (sec): 1.70 - samples/sec: 3155.34 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 19:42:01,169 epoch 2 - iter 54/272 - loss 0.16049108 - time (sec): 3.46 - samples/sec: 3150.41 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 19:42:03,073 epoch 2 - iter 81/272 - loss 0.16338692 - time (sec): 5.36 - samples/sec: 3221.47 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 19:42:04,801 epoch 2 - iter 108/272 - loss 0.17411210 - time (sec): 7.09 - samples/sec: 3138.75 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 19:42:06,360 epoch 2 - iter 135/272 - loss 0.17518551 - time (sec): 8.65 - samples/sec: 3103.12 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 19:42:08,005 epoch 2 - iter 162/272 - loss 0.16738460 - time (sec): 10.29 - samples/sec: 3111.22 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 19:42:09,542 epoch 2 - iter 189/272 - loss 0.16655385 - time (sec): 11.83 - samples/sec: 3067.92 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 19:42:11,189 epoch 2 - iter 216/272 - loss 0.15624382 - time (sec): 13.48 - samples/sec: 3131.93 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 19:42:12,694 epoch 2 - iter 243/272 - loss 0.15627730 - time (sec): 14.98 - samples/sec: 3110.53 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 19:42:14,304 epoch 2 - iter 270/272 - loss 0.15603499 - time (sec): 16.59 - samples/sec: 3109.70 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 19:42:14,439 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:42:14,440 EPOCH 2 done: loss 0.1554 - lr: 0.000027 |
|
2023-10-16 19:42:15,902 DEV : loss 0.10554348677396774 - f1-score (micro avg) 0.763 |
|
2023-10-16 19:42:15,907 saving best model |
|
2023-10-16 19:42:16,452 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:42:18,054 epoch 3 - iter 27/272 - loss 0.10392696 - time (sec): 1.60 - samples/sec: 3355.97 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 19:42:19,712 epoch 3 - iter 54/272 - loss 0.10890746 - time (sec): 3.26 - samples/sec: 3313.26 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 19:42:21,067 epoch 3 - iter 81/272 - loss 0.10086706 - time (sec): 4.61 - samples/sec: 3274.70 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 19:42:22,742 epoch 3 - iter 108/272 - loss 0.09780563 - time (sec): 6.29 - samples/sec: 3247.84 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 19:42:24,396 epoch 3 - iter 135/272 - loss 0.10021849 - time (sec): 7.94 - samples/sec: 3219.47 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 19:42:25,911 epoch 3 - iter 162/272 - loss 0.09720403 - time (sec): 9.46 - samples/sec: 3237.74 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 19:42:27,360 epoch 3 - iter 189/272 - loss 0.09662984 - time (sec): 10.90 - samples/sec: 3204.15 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:42:28,951 epoch 3 - iter 216/272 - loss 0.09189039 - time (sec): 12.50 - samples/sec: 3262.50 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:42:30,509 epoch 3 - iter 243/272 - loss 0.08816254 - time (sec): 14.05 - samples/sec: 3272.51 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 19:42:32,346 epoch 3 - iter 270/272 - loss 0.08535618 - time (sec): 15.89 - samples/sec: 3254.24 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 19:42:32,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:42:32,452 EPOCH 3 done: loss 0.0850 - lr: 0.000023 |
|
2023-10-16 19:42:33,920 DEV : loss 0.12464497238397598 - f1-score (micro avg) 0.7804 |
|
2023-10-16 19:42:33,925 saving best model |
|
2023-10-16 19:42:34,466 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:42:36,154 epoch 4 - iter 27/272 - loss 0.05096666 - time (sec): 1.69 - samples/sec: 2912.70 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 19:42:37,750 epoch 4 - iter 54/272 - loss 0.05596561 - time (sec): 3.28 - samples/sec: 2826.09 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 19:42:39,690 epoch 4 - iter 81/272 - loss 0.06377073 - time (sec): 5.22 - samples/sec: 2841.32 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 19:42:41,408 epoch 4 - iter 108/272 - loss 0.05733990 - time (sec): 6.94 - samples/sec: 2887.32 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 19:42:43,042 epoch 4 - iter 135/272 - loss 0.05884014 - time (sec): 8.57 - samples/sec: 2922.45 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 19:42:44,622 epoch 4 - iter 162/272 - loss 0.05374785 - time (sec): 10.15 - samples/sec: 2978.84 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 19:42:46,425 epoch 4 - iter 189/272 - loss 0.05327700 - time (sec): 11.96 - samples/sec: 2977.07 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 19:42:48,339 epoch 4 - iter 216/272 - loss 0.05113520 - time (sec): 13.87 - samples/sec: 2960.29 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 19:42:50,077 epoch 4 - iter 243/272 - loss 0.04944726 - time (sec): 15.61 - samples/sec: 2958.44 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 19:42:51,797 epoch 4 - iter 270/272 - loss 0.04935030 - time (sec): 17.33 - samples/sec: 2994.60 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 19:42:51,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:42:51,895 EPOCH 4 done: loss 0.0502 - lr: 0.000020 |
|
2023-10-16 19:42:53,362 DEV : loss 0.13258929550647736 - f1-score (micro avg) 0.7653 |
|
2023-10-16 19:42:53,367 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:42:54,971 epoch 5 - iter 27/272 - loss 0.02364095 - time (sec): 1.60 - samples/sec: 2831.61 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 19:42:56,597 epoch 5 - iter 54/272 - loss 0.02260480 - time (sec): 3.23 - samples/sec: 2967.66 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 19:42:58,320 epoch 5 - iter 81/272 - loss 0.02465063 - time (sec): 4.95 - samples/sec: 3063.40 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 19:43:00,000 epoch 5 - iter 108/272 - loss 0.02741689 - time (sec): 6.63 - samples/sec: 3103.35 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 19:43:01,706 epoch 5 - iter 135/272 - loss 0.02512919 - time (sec): 8.34 - samples/sec: 3071.33 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 19:43:03,345 epoch 5 - iter 162/272 - loss 0.03049677 - time (sec): 9.98 - samples/sec: 3105.01 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 19:43:05,128 epoch 5 - iter 189/272 - loss 0.03343267 - time (sec): 11.76 - samples/sec: 3086.84 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 19:43:06,775 epoch 5 - iter 216/272 - loss 0.03294967 - time (sec): 13.41 - samples/sec: 3096.49 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 19:43:08,424 epoch 5 - iter 243/272 - loss 0.03334278 - time (sec): 15.06 - samples/sec: 3078.80 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 19:43:10,071 epoch 5 - iter 270/272 - loss 0.03395116 - time (sec): 16.70 - samples/sec: 3089.94 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 19:43:10,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:43:10,180 EPOCH 5 done: loss 0.0341 - lr: 0.000017 |
|
2023-10-16 19:43:11,658 DEV : loss 0.13392110168933868 - f1-score (micro avg) 0.7877 |
|
2023-10-16 19:43:11,665 saving best model |
|
2023-10-16 19:43:12,154 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:43:13,763 epoch 6 - iter 27/272 - loss 0.03778512 - time (sec): 1.61 - samples/sec: 3168.40 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 19:43:15,374 epoch 6 - iter 54/272 - loss 0.02722681 - time (sec): 3.22 - samples/sec: 3206.30 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 19:43:16,848 epoch 6 - iter 81/272 - loss 0.02809537 - time (sec): 4.69 - samples/sec: 3251.07 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 19:43:18,290 epoch 6 - iter 108/272 - loss 0.02507468 - time (sec): 6.13 - samples/sec: 3230.08 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 19:43:19,863 epoch 6 - iter 135/272 - loss 0.02724857 - time (sec): 7.71 - samples/sec: 3296.89 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 19:43:21,484 epoch 6 - iter 162/272 - loss 0.02825793 - time (sec): 9.33 - samples/sec: 3335.91 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 19:43:23,031 epoch 6 - iter 189/272 - loss 0.02674067 - time (sec): 10.87 - samples/sec: 3349.13 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 19:43:24,758 epoch 6 - iter 216/272 - loss 0.02551610 - time (sec): 12.60 - samples/sec: 3350.01 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 19:43:26,259 epoch 6 - iter 243/272 - loss 0.02385448 - time (sec): 14.10 - samples/sec: 3348.26 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 19:43:27,780 epoch 6 - iter 270/272 - loss 0.02470491 - time (sec): 15.62 - samples/sec: 3322.18 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 19:43:27,864 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:43:27,864 EPOCH 6 done: loss 0.0248 - lr: 0.000013 |
|
2023-10-16 19:43:29,321 DEV : loss 0.14046281576156616 - f1-score (micro avg) 0.8 |
|
2023-10-16 19:43:29,326 saving best model |
|
2023-10-16 19:43:29,947 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:43:31,708 epoch 7 - iter 27/272 - loss 0.00860986 - time (sec): 1.76 - samples/sec: 3047.31 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 19:43:33,205 epoch 7 - iter 54/272 - loss 0.01502937 - time (sec): 3.25 - samples/sec: 3087.39 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 19:43:34,844 epoch 7 - iter 81/272 - loss 0.01762492 - time (sec): 4.89 - samples/sec: 3264.70 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 19:43:36,281 epoch 7 - iter 108/272 - loss 0.02130141 - time (sec): 6.33 - samples/sec: 3192.36 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 19:43:37,828 epoch 7 - iter 135/272 - loss 0.01919742 - time (sec): 7.88 - samples/sec: 3163.69 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 19:43:39,439 epoch 7 - iter 162/272 - loss 0.01892265 - time (sec): 9.49 - samples/sec: 3243.27 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 19:43:41,107 epoch 7 - iter 189/272 - loss 0.01706367 - time (sec): 11.16 - samples/sec: 3282.93 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 19:43:42,766 epoch 7 - iter 216/272 - loss 0.01961053 - time (sec): 12.81 - samples/sec: 3272.44 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 19:43:44,301 epoch 7 - iter 243/272 - loss 0.01988748 - time (sec): 14.35 - samples/sec: 3269.06 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 19:43:45,857 epoch 7 - iter 270/272 - loss 0.01967655 - time (sec): 15.91 - samples/sec: 3261.43 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 19:43:45,938 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:43:45,938 EPOCH 7 done: loss 0.0196 - lr: 0.000010 |
|
2023-10-16 19:43:47,701 DEV : loss 0.15851223468780518 - f1-score (micro avg) 0.8296 |
|
2023-10-16 19:43:47,706 saving best model |
|
2023-10-16 19:43:48,144 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:43:49,737 epoch 8 - iter 27/272 - loss 0.00730779 - time (sec): 1.59 - samples/sec: 3356.08 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 19:43:51,346 epoch 8 - iter 54/272 - loss 0.01329991 - time (sec): 3.20 - samples/sec: 3283.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:43:52,910 epoch 8 - iter 81/272 - loss 0.01513609 - time (sec): 4.76 - samples/sec: 3232.24 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:43:54,423 epoch 8 - iter 108/272 - loss 0.01642677 - time (sec): 6.28 - samples/sec: 3279.86 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 19:43:55,951 epoch 8 - iter 135/272 - loss 0.01633534 - time (sec): 7.80 - samples/sec: 3248.26 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 19:43:57,780 epoch 8 - iter 162/272 - loss 0.01601286 - time (sec): 9.63 - samples/sec: 3295.91 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 19:43:59,225 epoch 8 - iter 189/272 - loss 0.01457084 - time (sec): 11.08 - samples/sec: 3283.86 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 19:44:00,755 epoch 8 - iter 216/272 - loss 0.01424786 - time (sec): 12.61 - samples/sec: 3295.96 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 19:44:02,216 epoch 8 - iter 243/272 - loss 0.01473901 - time (sec): 14.07 - samples/sec: 3267.45 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 19:44:03,955 epoch 8 - iter 270/272 - loss 0.01629383 - time (sec): 15.81 - samples/sec: 3280.70 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 19:44:04,041 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:44:04,042 EPOCH 8 done: loss 0.0163 - lr: 0.000007 |
|
2023-10-16 19:44:05,481 DEV : loss 0.1702307164669037 - f1-score (micro avg) 0.8185 |
|
2023-10-16 19:44:05,485 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:44:07,344 epoch 9 - iter 27/272 - loss 0.01425725 - time (sec): 1.86 - samples/sec: 3673.07 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 19:44:08,891 epoch 9 - iter 54/272 - loss 0.01007186 - time (sec): 3.40 - samples/sec: 3473.96 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 19:44:10,403 epoch 9 - iter 81/272 - loss 0.01074541 - time (sec): 4.92 - samples/sec: 3349.36 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 19:44:11,931 epoch 9 - iter 108/272 - loss 0.01073820 - time (sec): 6.44 - samples/sec: 3369.06 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 19:44:13,587 epoch 9 - iter 135/272 - loss 0.01165407 - time (sec): 8.10 - samples/sec: 3351.69 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 19:44:15,139 epoch 9 - iter 162/272 - loss 0.01157834 - time (sec): 9.65 - samples/sec: 3307.49 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 19:44:16,667 epoch 9 - iter 189/272 - loss 0.01155416 - time (sec): 11.18 - samples/sec: 3328.55 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 19:44:18,214 epoch 9 - iter 216/272 - loss 0.01153189 - time (sec): 12.73 - samples/sec: 3300.75 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 19:44:19,737 epoch 9 - iter 243/272 - loss 0.01160906 - time (sec): 14.25 - samples/sec: 3308.12 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 19:44:21,191 epoch 9 - iter 270/272 - loss 0.01116786 - time (sec): 15.70 - samples/sec: 3300.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 19:44:21,271 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:44:21,271 EPOCH 9 done: loss 0.0111 - lr: 0.000003 |
|
2023-10-16 19:44:22,704 DEV : loss 0.1694680005311966 - f1-score (micro avg) 0.8231 |
|
2023-10-16 19:44:22,708 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:44:24,190 epoch 10 - iter 27/272 - loss 0.01260807 - time (sec): 1.48 - samples/sec: 3696.73 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 19:44:25,500 epoch 10 - iter 54/272 - loss 0.00936227 - time (sec): 2.79 - samples/sec: 3440.58 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 19:44:27,074 epoch 10 - iter 81/272 - loss 0.00640857 - time (sec): 4.36 - samples/sec: 3416.96 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 19:44:28,554 epoch 10 - iter 108/272 - loss 0.00757260 - time (sec): 5.84 - samples/sec: 3438.35 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 19:44:30,020 epoch 10 - iter 135/272 - loss 0.00855156 - time (sec): 7.31 - samples/sec: 3436.34 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 19:44:31,751 epoch 10 - iter 162/272 - loss 0.00838144 - time (sec): 9.04 - samples/sec: 3406.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 19:44:33,309 epoch 10 - iter 189/272 - loss 0.00867403 - time (sec): 10.60 - samples/sec: 3403.41 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 19:44:34,793 epoch 10 - iter 216/272 - loss 0.00894913 - time (sec): 12.08 - samples/sec: 3383.57 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 19:44:36,325 epoch 10 - iter 243/272 - loss 0.00909194 - time (sec): 13.62 - samples/sec: 3365.60 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 19:44:37,949 epoch 10 - iter 270/272 - loss 0.00844017 - time (sec): 15.24 - samples/sec: 3399.90 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 19:44:38,030 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:44:38,030 EPOCH 10 done: loss 0.0084 - lr: 0.000000 |
|
2023-10-16 19:44:39,488 DEV : loss 0.17159999907016754 - f1-score (micro avg) 0.8185 |
|
2023-10-16 19:44:39,915 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 19:44:39,917 Loading model from best epoch ... |
|
2023-10-16 19:44:41,460 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-16 19:44:43,901 |
|
Results: |
|
- F-score (micro) 0.7683 |
|
- F-score (macro) 0.7159 |
|
- Accuracy 0.6392 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.7867    0.8750    0.8285       312 |
|
         PER     0.6579    0.8413    0.7384       208 |
|
         ORG     0.5556    0.3636    0.4396        55 |
|
   HumanProd     0.7778    0.9545    0.8571        22 |
|
|
|
   micro avg     0.7234    0.8191    0.7683       597 |
|
   macro avg     0.6945    0.7586    0.7159       597 |
|
weighted avg     0.7202    0.8191    0.7623       597 |
|
|
|
2023-10-16 19:44:43,901 ---------------------------------------------------------------------------------------------------- |
|
|
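The averaged rows of the by-class report can be rechecked from the per-class rows. The sketch below copies the numbers from the test results above and recomputes the micro, macro, and weighted F1. The variable and function names are illustrative, and because the inputs are already rounded to four decimals the recomputed values may differ from the logged ones in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the report above.
by_class = {
    "LOC": (0.7867, 0.8750, 0.8285, 312),
    "PER": (0.6579, 0.8413, 0.7384, 208),
    "ORG": (0.5556, 0.3636, 0.4396, 55),
    "HumanProd": (0.7778, 0.9545, 0.8571, 22),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(support for _, _, _, support in by_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1_score(0.7234, 0.8191)

print(f"support={total_support}")
print(f"micro={micro_f1:.4f} macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The gap between the micro and macro scores comes mostly from ORG, which has low recall and only 55 of the 597 test entities.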