|
2023-10-16 20:21:36,765 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,766 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 20:21:36,766 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 Train: 1085 sentences |
|
2023-10-16 20:21:36,767 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 Training Params: |
|
2023-10-16 20:21:36,767 - learning_rate: "3e-05" |
|
2023-10-16 20:21:36,767 - mini_batch_size: "4" |
|
2023-10-16 20:21:36,767 - max_epochs: "10" |
|
2023-10-16 20:21:36,767 - shuffle: "True" |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 Plugins: |
|
2023-10-16 20:21:36,767 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
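Note on the `lr` column: the parameters logged above (learning_rate 3e-05, mini_batch_size 4, max_epochs 10, warmup_fraction 0.1, 1085 training sentences) are enough to sketch a schedule that appears consistent with the `lr` values printed in the iteration lines. The sketch below assumes a per-batch linear warmup to the peak rate followed by a linear decay to zero. It is not taken from Flair's source, and the function names `linear_warmup_decay_lr` and `lr_at` are hypothetical.

```python
import math


def linear_warmup_decay_lr(step, total_steps, peak_lr, warmup_fraction):
    """Assumed schedule: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values read from the log: 1085 train sentences, batch size 4, 10 epochs.
ITERS_PER_EPOCH = math.ceil(1085 / 4)   # 272, matching "iter N/272"
TOTAL_STEPS = ITERS_PER_EPOCH * 10      # 2720 optimizer steps in total


def lr_at(epoch, iteration):
    """Learning rate after `iteration` batches of 1-indexed `epoch`."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_warmup_decay_lr(step, TOTAL_STEPS, 3e-5, 0.1)


if __name__ == "__main__":
    for epoch, iteration in [(1, 27), (1, 270), (2, 54), (5, 135), (10, 243)]:
        print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the warmup covers 272 steps, which is exactly the first epoch, so the rate peaks at the end of epoch 1 and falls from there.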
|
2023-10-16 20:21:36,767 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 20:21:36,767 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 Computation: |
|
2023-10-16 20:21:36,767 - compute on device: cuda:0 |
|
2023-10-16 20:21:36,767 - embedding storage: none |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:36,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:38,339 epoch 1 - iter 27/272 - loss 2.78403560 - time (sec): 1.57 - samples/sec: 3229.32 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:21:40,094 epoch 1 - iter 54/272 - loss 2.40057045 - time (sec): 3.33 - samples/sec: 3248.68 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:21:41,731 epoch 1 - iter 81/272 - loss 1.82624760 - time (sec): 4.96 - samples/sec: 3345.86 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:21:43,315 epoch 1 - iter 108/272 - loss 1.52858521 - time (sec): 6.55 - samples/sec: 3282.52 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:21:44,848 epoch 1 - iter 135/272 - loss 1.32380649 - time (sec): 8.08 - samples/sec: 3297.07 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:21:46,418 epoch 1 - iter 162/272 - loss 1.18197180 - time (sec): 9.65 - samples/sec: 3300.84 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:21:47,884 epoch 1 - iter 189/272 - loss 1.05227075 - time (sec): 11.12 - samples/sec: 3314.13 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:21:49,438 epoch 1 - iter 216/272 - loss 0.95821963 - time (sec): 12.67 - samples/sec: 3311.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:21:50,892 epoch 1 - iter 243/272 - loss 0.89248220 - time (sec): 14.12 - samples/sec: 3299.70 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:21:52,460 epoch 1 - iter 270/272 - loss 0.82752649 - time (sec): 15.69 - samples/sec: 3304.71 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 20:21:52,549 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:52,550 EPOCH 1 done: loss 0.8257 - lr: 0.000030 |
|
2023-10-16 20:21:53,648 DEV : loss 0.20883718132972717 - f1-score (micro avg) 0.5635 |
|
2023-10-16 20:21:53,653 saving best model |
|
2023-10-16 20:21:54,009 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:21:55,510 epoch 2 - iter 27/272 - loss 0.23865211 - time (sec): 1.50 - samples/sec: 3012.03 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 20:21:57,145 epoch 2 - iter 54/272 - loss 0.21111499 - time (sec): 3.13 - samples/sec: 2990.38 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:21:58,855 epoch 2 - iter 81/272 - loss 0.18236592 - time (sec): 4.84 - samples/sec: 3134.51 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:22:00,503 epoch 2 - iter 108/272 - loss 0.16340264 - time (sec): 6.49 - samples/sec: 3219.69 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:22:01,918 epoch 2 - iter 135/272 - loss 0.17367697 - time (sec): 7.91 - samples/sec: 3226.50 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:22:03,448 epoch 2 - iter 162/272 - loss 0.16437164 - time (sec): 9.44 - samples/sec: 3253.57 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:22:05,013 epoch 2 - iter 189/272 - loss 0.16007430 - time (sec): 11.00 - samples/sec: 3234.67 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:22:06,521 epoch 2 - iter 216/272 - loss 0.15664843 - time (sec): 12.51 - samples/sec: 3289.42 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:22:08,169 epoch 2 - iter 243/272 - loss 0.16161312 - time (sec): 14.16 - samples/sec: 3267.43 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:22:09,760 epoch 2 - iter 270/272 - loss 0.15717506 - time (sec): 15.75 - samples/sec: 3288.18 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:22:09,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:22:09,865 EPOCH 2 done: loss 0.1566 - lr: 0.000027 |
|
2023-10-16 20:22:11,292 DEV : loss 0.11095353215932846 - f1-score (micro avg) 0.7437 |
|
2023-10-16 20:22:11,299 saving best model |
|
2023-10-16 20:22:11,759 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:22:13,277 epoch 3 - iter 27/272 - loss 0.11749999 - time (sec): 1.52 - samples/sec: 2858.87 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:22:14,910 epoch 3 - iter 54/272 - loss 0.11281766 - time (sec): 3.15 - samples/sec: 3016.40 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:22:16,349 epoch 3 - iter 81/272 - loss 0.10816546 - time (sec): 4.59 - samples/sec: 3086.19 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:22:17,971 epoch 3 - iter 108/272 - loss 0.09957259 - time (sec): 6.21 - samples/sec: 3190.98 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:22:19,509 epoch 3 - iter 135/272 - loss 0.09598081 - time (sec): 7.75 - samples/sec: 3248.08 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:22:21,106 epoch 3 - iter 162/272 - loss 0.09855270 - time (sec): 9.35 - samples/sec: 3319.72 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:22:22,721 epoch 3 - iter 189/272 - loss 0.09542228 - time (sec): 10.96 - samples/sec: 3281.50 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:22:24,159 epoch 3 - iter 216/272 - loss 0.09125236 - time (sec): 12.40 - samples/sec: 3244.45 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:22:25,925 epoch 3 - iter 243/272 - loss 0.09163871 - time (sec): 14.16 - samples/sec: 3249.55 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:22:27,581 epoch 3 - iter 270/272 - loss 0.09515133 - time (sec): 15.82 - samples/sec: 3269.62 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:22:27,698 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:22:27,698 EPOCH 3 done: loss 0.0950 - lr: 0.000023 |
|
2023-10-16 20:22:29,168 DEV : loss 0.1084960475564003 - f1-score (micro avg) 0.7687 |
|
2023-10-16 20:22:29,173 saving best model |
|
2023-10-16 20:22:29,660 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:22:31,232 epoch 4 - iter 27/272 - loss 0.08673095 - time (sec): 1.57 - samples/sec: 3367.90 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:22:32,904 epoch 4 - iter 54/272 - loss 0.06096268 - time (sec): 3.24 - samples/sec: 3366.04 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:22:34,929 epoch 4 - iter 81/272 - loss 0.05277234 - time (sec): 5.27 - samples/sec: 3191.79 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 20:22:36,440 epoch 4 - iter 108/272 - loss 0.05214582 - time (sec): 6.78 - samples/sec: 3170.65 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 20:22:37,846 epoch 4 - iter 135/272 - loss 0.05275227 - time (sec): 8.18 - samples/sec: 3140.46 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 20:22:39,337 epoch 4 - iter 162/272 - loss 0.05294019 - time (sec): 9.68 - samples/sec: 3194.65 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:22:40,847 epoch 4 - iter 189/272 - loss 0.05256511 - time (sec): 11.19 - samples/sec: 3191.38 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:22:42,403 epoch 4 - iter 216/272 - loss 0.05388890 - time (sec): 12.74 - samples/sec: 3187.35 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:22:43,889 epoch 4 - iter 243/272 - loss 0.05397102 - time (sec): 14.23 - samples/sec: 3216.73 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:22:45,507 epoch 4 - iter 270/272 - loss 0.05116201 - time (sec): 15.85 - samples/sec: 3274.71 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:22:45,590 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:22:45,590 EPOCH 4 done: loss 0.0511 - lr: 0.000020 |
|
2023-10-16 20:22:47,042 DEV : loss 0.11970058083534241 - f1-score (micro avg) 0.7956 |
|
2023-10-16 20:22:47,047 saving best model |
|
2023-10-16 20:22:47,506 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:22:49,124 epoch 5 - iter 27/272 - loss 0.02705900 - time (sec): 1.62 - samples/sec: 3193.66 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:22:50,547 epoch 5 - iter 54/272 - loss 0.02506314 - time (sec): 3.04 - samples/sec: 3127.97 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:22:52,155 epoch 5 - iter 81/272 - loss 0.03031274 - time (sec): 4.65 - samples/sec: 3124.88 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:22:53,730 epoch 5 - iter 108/272 - loss 0.03097207 - time (sec): 6.22 - samples/sec: 3159.63 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:22:55,225 epoch 5 - iter 135/272 - loss 0.02985994 - time (sec): 7.72 - samples/sec: 3242.28 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:22:56,755 epoch 5 - iter 162/272 - loss 0.03208358 - time (sec): 9.25 - samples/sec: 3243.43 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:22:58,587 epoch 5 - iter 189/272 - loss 0.03465943 - time (sec): 11.08 - samples/sec: 3296.53 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:23:00,310 epoch 5 - iter 216/272 - loss 0.03639915 - time (sec): 12.80 - samples/sec: 3304.91 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 20:23:01,828 epoch 5 - iter 243/272 - loss 0.03542833 - time (sec): 14.32 - samples/sec: 3266.68 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 20:23:03,373 epoch 5 - iter 270/272 - loss 0.03755505 - time (sec): 15.87 - samples/sec: 3259.66 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 20:23:03,466 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:03,466 EPOCH 5 done: loss 0.0374 - lr: 0.000017 |
|
2023-10-16 20:23:04,927 DEV : loss 0.1268547922372818 - f1-score (micro avg) 0.7971 |
|
2023-10-16 20:23:04,931 saving best model |
|
2023-10-16 20:23:05,417 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:07,117 epoch 6 - iter 27/272 - loss 0.00811967 - time (sec): 1.70 - samples/sec: 3380.93 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:23:08,639 epoch 6 - iter 54/272 - loss 0.01425739 - time (sec): 3.22 - samples/sec: 3222.56 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:23:10,127 epoch 6 - iter 81/272 - loss 0.01768857 - time (sec): 4.71 - samples/sec: 3212.83 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:23:11,760 epoch 6 - iter 108/272 - loss 0.02554340 - time (sec): 6.34 - samples/sec: 3228.81 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:23:13,459 epoch 6 - iter 135/272 - loss 0.02481300 - time (sec): 8.04 - samples/sec: 3216.75 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:23:14,910 epoch 6 - iter 162/272 - loss 0.02810161 - time (sec): 9.49 - samples/sec: 3189.64 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:23:16,351 epoch 6 - iter 189/272 - loss 0.02581547 - time (sec): 10.93 - samples/sec: 3166.18 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:23:17,969 epoch 6 - iter 216/272 - loss 0.02439654 - time (sec): 12.55 - samples/sec: 3191.87 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:23:19,576 epoch 6 - iter 243/272 - loss 0.02443384 - time (sec): 14.16 - samples/sec: 3219.27 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:23:21,284 epoch 6 - iter 270/272 - loss 0.02556831 - time (sec): 15.87 - samples/sec: 3253.43 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:23:21,383 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:21,383 EPOCH 6 done: loss 0.0254 - lr: 0.000013 |
|
2023-10-16 20:23:22,856 DEV : loss 0.15687650442123413 - f1-score (micro avg) 0.8 |
|
2023-10-16 20:23:22,860 saving best model |
|
2023-10-16 20:23:23,331 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:25,063 epoch 7 - iter 27/272 - loss 0.01894596 - time (sec): 1.73 - samples/sec: 3546.75 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:23:26,630 epoch 7 - iter 54/272 - loss 0.01748641 - time (sec): 3.30 - samples/sec: 3316.07 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:23:28,014 epoch 7 - iter 81/272 - loss 0.02121757 - time (sec): 4.68 - samples/sec: 3177.52 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:23:29,587 epoch 7 - iter 108/272 - loss 0.02299563 - time (sec): 6.25 - samples/sec: 3206.07 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:23:31,102 epoch 7 - iter 135/272 - loss 0.02088311 - time (sec): 7.77 - samples/sec: 3207.48 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:23:32,693 epoch 7 - iter 162/272 - loss 0.01951963 - time (sec): 9.36 - samples/sec: 3182.35 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 20:23:34,322 epoch 7 - iter 189/272 - loss 0.01987080 - time (sec): 10.99 - samples/sec: 3220.45 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 20:23:36,323 epoch 7 - iter 216/272 - loss 0.02118412 - time (sec): 12.99 - samples/sec: 3197.06 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 20:23:37,912 epoch 7 - iter 243/272 - loss 0.02066351 - time (sec): 14.58 - samples/sec: 3201.54 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:23:39,484 epoch 7 - iter 270/272 - loss 0.02156439 - time (sec): 16.15 - samples/sec: 3210.50 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:23:39,569 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:39,569 EPOCH 7 done: loss 0.0215 - lr: 0.000010 |
|
2023-10-16 20:23:41,023 DEV : loss 0.14880990982055664 - f1-score (micro avg) 0.8177 |
|
2023-10-16 20:23:41,028 saving best model |
|
2023-10-16 20:23:41,490 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:43,041 epoch 8 - iter 27/272 - loss 0.02885847 - time (sec): 1.55 - samples/sec: 3406.76 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:23:44,729 epoch 8 - iter 54/272 - loss 0.02025245 - time (sec): 3.24 - samples/sec: 3261.21 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:23:46,167 epoch 8 - iter 81/272 - loss 0.01914734 - time (sec): 4.68 - samples/sec: 3213.83 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:23:47,822 epoch 8 - iter 108/272 - loss 0.01626159 - time (sec): 6.33 - samples/sec: 3259.52 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:23:49,237 epoch 8 - iter 135/272 - loss 0.01407233 - time (sec): 7.75 - samples/sec: 3303.45 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:23:50,883 epoch 8 - iter 162/272 - loss 0.01494253 - time (sec): 9.39 - samples/sec: 3278.17 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:23:52,476 epoch 8 - iter 189/272 - loss 0.01366276 - time (sec): 10.99 - samples/sec: 3228.69 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:23:54,141 epoch 8 - iter 216/272 - loss 0.01370213 - time (sec): 12.65 - samples/sec: 3285.37 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:23:55,686 epoch 8 - iter 243/272 - loss 0.01409608 - time (sec): 14.19 - samples/sec: 3286.25 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:23:57,435 epoch 8 - iter 270/272 - loss 0.01362443 - time (sec): 15.94 - samples/sec: 3254.98 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:23:57,523 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:23:57,524 EPOCH 8 done: loss 0.0136 - lr: 0.000007 |
|
2023-10-16 20:23:58,981 DEV : loss 0.16685882210731506 - f1-score (micro avg) 0.8103 |
|
2023-10-16 20:23:58,988 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:24:00,624 epoch 9 - iter 27/272 - loss 0.00599380 - time (sec): 1.63 - samples/sec: 3348.82 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:24:02,276 epoch 9 - iter 54/272 - loss 0.01311148 - time (sec): 3.29 - samples/sec: 3429.61 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:24:03,891 epoch 9 - iter 81/272 - loss 0.01123314 - time (sec): 4.90 - samples/sec: 3443.51 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:24:05,413 epoch 9 - iter 108/272 - loss 0.01353945 - time (sec): 6.42 - samples/sec: 3323.39 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:24:07,043 epoch 9 - iter 135/272 - loss 0.01225708 - time (sec): 8.05 - samples/sec: 3282.12 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:24:08,759 epoch 9 - iter 162/272 - loss 0.01190655 - time (sec): 9.77 - samples/sec: 3230.90 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:24:10,412 epoch 9 - iter 189/272 - loss 0.01157508 - time (sec): 11.42 - samples/sec: 3213.24 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:24:11,943 epoch 9 - iter 216/272 - loss 0.01177737 - time (sec): 12.95 - samples/sec: 3230.00 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:24:13,506 epoch 9 - iter 243/272 - loss 0.01213604 - time (sec): 14.52 - samples/sec: 3218.45 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:24:14,983 epoch 9 - iter 270/272 - loss 0.01161932 - time (sec): 15.99 - samples/sec: 3226.42 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:24:15,106 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:24:15,106 EPOCH 9 done: loss 0.0115 - lr: 0.000003 |
|
2023-10-16 20:24:16,622 DEV : loss 0.16301465034484863 - f1-score (micro avg) 0.8327 |
|
2023-10-16 20:24:16,627 saving best model |
|
2023-10-16 20:24:17,102 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:24:18,760 epoch 10 - iter 27/272 - loss 0.01320646 - time (sec): 1.66 - samples/sec: 2935.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:24:20,380 epoch 10 - iter 54/272 - loss 0.01195513 - time (sec): 3.28 - samples/sec: 3000.55 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:24:22,050 epoch 10 - iter 81/272 - loss 0.01072011 - time (sec): 4.95 - samples/sec: 3165.10 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:24:23,640 epoch 10 - iter 108/272 - loss 0.01026115 - time (sec): 6.54 - samples/sec: 3232.53 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:24:25,083 epoch 10 - iter 135/272 - loss 0.01064150 - time (sec): 7.98 - samples/sec: 3247.32 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:24:26,612 epoch 10 - iter 162/272 - loss 0.00997044 - time (sec): 9.51 - samples/sec: 3235.87 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 20:24:28,274 epoch 10 - iter 189/272 - loss 0.01040692 - time (sec): 11.17 - samples/sec: 3250.02 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 20:24:30,075 epoch 10 - iter 216/272 - loss 0.00966299 - time (sec): 12.97 - samples/sec: 3285.20 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 20:24:31,522 epoch 10 - iter 243/272 - loss 0.00976536 - time (sec): 14.42 - samples/sec: 3264.05 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 20:24:32,976 epoch 10 - iter 270/272 - loss 0.00978654 - time (sec): 15.87 - samples/sec: 3252.31 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 20:24:33,097 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:24:33,097 EPOCH 10 done: loss 0.0097 - lr: 0.000000 |
|
2023-10-16 20:24:34,795 DEV : loss 0.1661917269229889 - f1-score (micro avg) 0.8155 |
|
2023-10-16 20:24:35,171 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:24:35,172 Loading model from best epoch ... |
|
2023-10-16 20:24:36,716 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-16 20:24:38,779 |
|
Results: |
|
- F-score (micro) 0.7838 |
|
- F-score (macro) 0.7384 |
|
- Accuracy 0.6582 |
|
|
|
By class: |
|
precision recall f1-score support |
|
|
|
LOC 0.8138 0.8686 0.8403 312 |
|
PER 0.6914 0.8510 0.7629 208 |
|
ORG 0.5217 0.4364 0.4752 55 |
|
HumanProd 0.8077 0.9545 0.8750 22 |
|
|
|
micro avg 0.7458 0.8258 0.7838 597 |
|
macro avg 0.7087 0.7776 0.7384 597 |
|
weighted avg 0.7440 0.8258 0.7810 597 |
|
|
|
2023-10-16 20:24:38,779 ---------------------------------------------------------------------------------------------------- |
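Note on the summary rows: the micro, macro and weighted averages in the report above can be cross-checked from the per-class rows. The sketch below copies the rounded figures from the table and recomputes the averages with the standard definitions. Because the inputs are already rounded to four decimals, agreement is expected only to about three decimals. The variable names are illustrative, not part of the logged run.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "LOC":       (0.8138, 0.8686, 0.8403, 312),
    "PER":       (0.6914, 0.8510, 0.7629, 208),
    "ORG":       (0.5217, 0.4364, 0.4752, 55),
    "HumanProd": (0.8077, 0.9545, 0.8750, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(s for _, _, _, s in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall rows.
micro_f1 = f1(0.7458, 0.8258)

if __name__ == "__main__":
    print(f"support  {total_support}")
    print(f"micro    {micro_f1:.4f}")
    print(f"macro    {macro_f1:.4f}")
    print(f"weighted {weighted_f1:.4f}")
```

The low macro average relative to the micro average reflects the ORG class, whose F1 of 0.4752 counts equally with the larger classes in the unweighted mean.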
|
|