2023-10-17 15:06:19,081 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
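
A hedged sketch of the corpus setup above: the cache path /root/.flair/datasets/ner_icdar_europeana/fr points at Flair's built-in ICDAR-Europeana NER loader, but the exact constructor argument for the language code is an assumption.

# Hedged sketch (Python / Flair): loading the French ICDAR-Europeana NER corpus.
from flair.data import MultiCorpus
from flair.datasets import NER_ICDAR_EUROPEANA

icdar_fr = NER_ICDAR_EUROPEANA("fr")   # assumed signature; 7936 train + 992 dev + 992 test sentences
corpus = MultiCorpus([icdar_fr])       # wrapped in a MultiCorpus, as in the log
# Span-level label dictionary (PER/LOC/ORG); recent Flair versions expand this to
# the 13 BIOES tags reported at the end of this log.
label_dict = corpus.make_label_dictionary(label_type="ner")
print(corpus)
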
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 Train: 7936 sentences
2023-10-17 15:06:19,082 (train_with_dev=False, train_with_test=False)
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,082 Training Params:
2023-10-17 15:06:19,082 - learning_rate: "5e-05"
2023-10-17 15:06:19,082 - mini_batch_size: "8"
2023-10-17 15:06:19,082 - max_epochs: "10"
2023-10-17 15:06:19,082 - shuffle: "True"
2023-10-17 15:06:19,082 ----------------------------------------------------------------------------------------------------
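
A hedged sketch of how these training parameters typically map onto Flair's ModelTrainer, reusing `corpus` and `tagger` from the sketches above; keyword names may differ slightly between Flair versions, and the base path is copied from the "Model training base path" line below.

# Hedged sketch (Python / Flair): fine-tuning with the logged hyperparameters.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    embedding_storage_mode="none",                     # "embedding storage: none" below
    main_evaluation_metric=("micro avg", "f1-score"),  # metric used to select best-model.pt
)
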
2023-10-17 15:06:19,083 Plugins:
2023-10-17 15:06:19,083 - TensorboardLogger
2023-10-17 15:06:19,083 - LinearScheduler | warmup_fraction: '0.1'
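
The LinearScheduler entry means the learning rate is warmed up linearly over the first 10% of all optimizer steps and then decayed linearly to zero. With 992 iterations per epoch and 10 epochs, the warmup covers exactly epoch 1, which matches the lr column in the epoch logs below. A standalone re-implementation of that schedule (not Flair's class) as a check:

# Standalone check of a linear warmup/decay schedule with warmup_fraction 0.1.
peak_lr = 5e-05
steps_per_epoch = 992                    # iterations per epoch in this run
total_steps = steps_per_epoch * 10
warmup_steps = int(0.1 * total_steps)    # 992 -> warmup spans exactly epoch 1

def lr_at(step: int) -> float:
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# epoch 1 iter 99 -> ~0.000005, epoch 1 iter 990 -> ~0.000050, epoch 2 iter 99 -> ~0.000049,
# in line with the values logged below.
for step in (99, 990, steps_per_epoch + 99):
    print(f"step {step}: lr = {lr_at(step):.6f}")
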
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:06:19,083 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Computation:
2023-10-17 15:06:19,083 - compute on device: cuda:0
2023-10-17 15:06:19,083 - embedding storage: none
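
A minimal sketch of the computation settings: Flair exposes the target device as a module-level attribute, while "embedding storage: none" corresponds to the embedding_storage_mode argument in the training sketch above (embeddings are recomputed every epoch rather than cached).

# Hedged sketch: pin Flair to the logged device before training.
import torch
import flair

flair.device = torch.device("cuda:0")
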
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 ----------------------------------------------------------------------------------------------------
2023-10-17 15:06:19,083 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:06:24,942 epoch 1 - iter 99/992 - loss 2.38046963 - time (sec): 5.86 - samples/sec: 2753.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:06:31,168 epoch 1 - iter 198/992 - loss 1.37254739 - time (sec): 12.08 - samples/sec: 2777.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:06:37,005 epoch 1 - iter 297/992 - loss 1.00559337 - time (sec): 17.92 - samples/sec: 2749.10 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:06:42,788 epoch 1 - iter 396/992 - loss 0.80698866 - time (sec): 23.70 - samples/sec: 2751.86 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:06:48,423 epoch 1 - iter 495/992 - loss 0.68842093 - time (sec): 29.34 - samples/sec: 2747.14 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:06:54,254 epoch 1 - iter 594/992 - loss 0.59727087 - time (sec): 35.17 - samples/sec: 2756.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:07:00,603 epoch 1 - iter 693/992 - loss 0.52832358 - time (sec): 41.52 - samples/sec: 2742.95 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:07:06,517 epoch 1 - iter 792/992 - loss 0.47754192 - time (sec): 47.43 - samples/sec: 2751.53 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:07:12,414 epoch 1 - iter 891/992 - loss 0.43922362 - time (sec): 53.33 - samples/sec: 2756.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:07:18,496 epoch 1 - iter 990/992 - loss 0.40646285 - time (sec): 59.41 - samples/sec: 2754.47 - lr: 0.000050 - momentum: 0.000000
2023-10-17 15:07:18,609 ----------------------------------------------------------------------------------------------------
2023-10-17 15:07:18,609 EPOCH 1 done: loss 0.4058 - lr: 0.000050
2023-10-17 15:07:21,928 DEV : loss 0.08580786734819412 - f1-score (micro avg) 0.7132
2023-10-17 15:07:21,960 saving best model
2023-10-17 15:07:23,223 ----------------------------------------------------------------------------------------------------
2023-10-17 15:07:29,185 epoch 2 - iter 99/992 - loss 0.10472742 - time (sec): 5.96 - samples/sec: 2858.77 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:07:35,198 epoch 2 - iter 198/992 - loss 0.10274139 - time (sec): 11.97 - samples/sec: 2791.29 - lr: 0.000049 - momentum: 0.000000
2023-10-17 15:07:40,921 epoch 2 - iter 297/992 - loss 0.10453930 - time (sec): 17.70 - samples/sec: 2811.26 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:07:46,491 epoch 2 - iter 396/992 - loss 0.10618343 - time (sec): 23.27 - samples/sec: 2805.37 - lr: 0.000048 - momentum: 0.000000
2023-10-17 15:07:52,431 epoch 2 - iter 495/992 - loss 0.10477397 - time (sec): 29.21 - samples/sec: 2810.58 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:07:58,372 epoch 2 - iter 594/992 - loss 0.10424898 - time (sec): 35.15 - samples/sec: 2787.08 - lr: 0.000047 - momentum: 0.000000
2023-10-17 15:08:04,514 epoch 2 - iter 693/992 - loss 0.10574565 - time (sec): 41.29 - samples/sec: 2760.39 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:08:10,405 epoch 2 - iter 792/992 - loss 0.10502822 - time (sec): 47.18 - samples/sec: 2759.11 - lr: 0.000046 - momentum: 0.000000
2023-10-17 15:08:16,744 epoch 2 - iter 891/992 - loss 0.10440715 - time (sec): 53.52 - samples/sec: 2751.13 - lr: 0.000045 - momentum: 0.000000
2023-10-17 15:08:22,649 epoch 2 - iter 990/992 - loss 0.10635558 - time (sec): 59.42 - samples/sec: 2754.71 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:08:22,763 ----------------------------------------------------------------------------------------------------
2023-10-17 15:08:22,763 EPOCH 2 done: loss 0.1063 - lr: 0.000044
2023-10-17 15:08:26,289 DEV : loss 0.09309504926204681 - f1-score (micro avg) 0.7346
2023-10-17 15:08:26,310 saving best model
2023-10-17 15:08:26,957 ----------------------------------------------------------------------------------------------------
2023-10-17 15:08:33,050 epoch 3 - iter 99/992 - loss 0.07641069 - time (sec): 6.09 - samples/sec: 2495.54 - lr: 0.000044 - momentum: 0.000000
2023-10-17 15:08:39,473 epoch 3 - iter 198/992 - loss 0.07843575 - time (sec): 12.51 - samples/sec: 2516.95 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:08:45,808 epoch 3 - iter 297/992 - loss 0.07377962 - time (sec): 18.85 - samples/sec: 2566.19 - lr: 0.000043 - momentum: 0.000000
2023-10-17 15:08:52,077 epoch 3 - iter 396/992 - loss 0.07136994 - time (sec): 25.12 - samples/sec: 2594.72 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:08:58,402 epoch 3 - iter 495/992 - loss 0.07162863 - time (sec): 31.44 - samples/sec: 2593.96 - lr: 0.000042 - momentum: 0.000000
2023-10-17 15:09:04,568 epoch 3 - iter 594/992 - loss 0.07280519 - time (sec): 37.61 - samples/sec: 2622.46 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:09:10,352 epoch 3 - iter 693/992 - loss 0.07280608 - time (sec): 43.39 - samples/sec: 2636.16 - lr: 0.000041 - momentum: 0.000000
2023-10-17 15:09:16,277 epoch 3 - iter 792/992 - loss 0.07274453 - time (sec): 49.32 - samples/sec: 2646.29 - lr: 0.000040 - momentum: 0.000000
2023-10-17 15:09:22,902 epoch 3 - iter 891/992 - loss 0.07234744 - time (sec): 55.94 - samples/sec: 2637.53 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:09:28,682 epoch 3 - iter 990/992 - loss 0.07312508 - time (sec): 61.72 - samples/sec: 2651.40 - lr: 0.000039 - momentum: 0.000000
2023-10-17 15:09:28,790 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:28,790 EPOCH 3 done: loss 0.0730 - lr: 0.000039
2023-10-17 15:09:32,577 DEV : loss 0.10305587947368622 - f1-score (micro avg) 0.7591
2023-10-17 15:09:32,610 saving best model
2023-10-17 15:09:33,090 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:38,942 epoch 4 - iter 99/992 - loss 0.05222278 - time (sec): 5.85 - samples/sec: 2696.28 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:09:45,319 epoch 4 - iter 198/992 - loss 0.05464912 - time (sec): 12.23 - samples/sec: 2664.59 - lr: 0.000038 - momentum: 0.000000
2023-10-17 15:09:51,435 epoch 4 - iter 297/992 - loss 0.05795324 - time (sec): 18.34 - samples/sec: 2641.93 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:09:57,594 epoch 4 - iter 396/992 - loss 0.05516303 - time (sec): 24.50 - samples/sec: 2658.01 - lr: 0.000037 - momentum: 0.000000
2023-10-17 15:10:03,797 epoch 4 - iter 495/992 - loss 0.05508194 - time (sec): 30.70 - samples/sec: 2659.79 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:10:09,783 epoch 4 - iter 594/992 - loss 0.05615006 - time (sec): 36.69 - samples/sec: 2665.02 - lr: 0.000036 - momentum: 0.000000
2023-10-17 15:10:15,945 epoch 4 - iter 693/992 - loss 0.05660499 - time (sec): 42.85 - samples/sec: 2666.84 - lr: 0.000035 - momentum: 0.000000
2023-10-17 15:10:21,870 epoch 4 - iter 792/992 - loss 0.05723636 - time (sec): 48.78 - samples/sec: 2675.19 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:10:27,762 epoch 4 - iter 891/992 - loss 0.05771745 - time (sec): 54.67 - samples/sec: 2694.56 - lr: 0.000034 - momentum: 0.000000
2023-10-17 15:10:33,702 epoch 4 - iter 990/992 - loss 0.05696544 - time (sec): 60.61 - samples/sec: 2700.46 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:10:33,832 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:33,832 EPOCH 4 done: loss 0.0572 - lr: 0.000033
2023-10-17 15:10:37,423 DEV : loss 0.1290876865386963 - f1-score (micro avg) 0.7365
2023-10-17 15:10:37,453 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:43,108 epoch 5 - iter 99/992 - loss 0.04105148 - time (sec): 5.65 - samples/sec: 2830.85 - lr: 0.000033 - momentum: 0.000000
2023-10-17 15:10:49,178 epoch 5 - iter 198/992 - loss 0.04148932 - time (sec): 11.72 - samples/sec: 2798.97 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:10:55,666 epoch 5 - iter 297/992 - loss 0.03980325 - time (sec): 18.21 - samples/sec: 2752.45 - lr: 0.000032 - momentum: 0.000000
2023-10-17 15:11:01,437 epoch 5 - iter 396/992 - loss 0.04154139 - time (sec): 23.98 - samples/sec: 2764.28 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:11:07,548 epoch 5 - iter 495/992 - loss 0.04253159 - time (sec): 30.09 - samples/sec: 2780.57 - lr: 0.000031 - momentum: 0.000000
2023-10-17 15:11:13,348 epoch 5 - iter 594/992 - loss 0.04313180 - time (sec): 35.89 - samples/sec: 2771.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:11:19,562 epoch 5 - iter 693/992 - loss 0.04278345 - time (sec): 42.11 - samples/sec: 2762.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:25,360 epoch 5 - iter 792/992 - loss 0.04253978 - time (sec): 47.91 - samples/sec: 2753.94 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:31,032 epoch 5 - iter 891/992 - loss 0.04250696 - time (sec): 53.58 - samples/sec: 2755.88 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:37,126 epoch 5 - iter 990/992 - loss 0.04181300 - time (sec): 59.67 - samples/sec: 2742.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:37,239 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:37,239 EPOCH 5 done: loss 0.0417 - lr: 0.000028
2023-10-17 15:11:41,340 DEV : loss 0.1687757819890976 - f1-score (micro avg) 0.7644
2023-10-17 15:11:41,374 saving best model
2023-10-17 15:11:41,959 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:48,364 epoch 6 - iter 99/992 - loss 0.02946611 - time (sec): 6.40 - samples/sec: 2586.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:54,568 epoch 6 - iter 198/992 - loss 0.03061488 - time (sec): 12.61 - samples/sec: 2654.29 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:12:00,552 epoch 6 - iter 297/992 - loss 0.02985251 - time (sec): 18.59 - samples/sec: 2698.65 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:06,580 epoch 6 - iter 396/992 - loss 0.02840164 - time (sec): 24.62 - samples/sec: 2714.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:12,475 epoch 6 - iter 495/992 - loss 0.02922618 - time (sec): 30.51 - samples/sec: 2725.65 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:18,608 epoch 6 - iter 594/992 - loss 0.02995890 - time (sec): 36.65 - samples/sec: 2740.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:24,344 epoch 6 - iter 693/992 - loss 0.03016712 - time (sec): 42.38 - samples/sec: 2735.06 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:30,413 epoch 6 - iter 792/992 - loss 0.03158247 - time (sec): 48.45 - samples/sec: 2714.37 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:12:36,424 epoch 6 - iter 891/992 - loss 0.03092480 - time (sec): 54.46 - samples/sec: 2713.22 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:12:42,336 epoch 6 - iter 990/992 - loss 0.03100699 - time (sec): 60.37 - samples/sec: 2710.72 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:12:42,446 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:42,447 EPOCH 6 done: loss 0.0311 - lr: 0.000022
2023-10-17 15:12:46,001 DEV : loss 0.18693169951438904 - f1-score (micro avg) 0.7604
2023-10-17 15:12:46,022 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:51,972 epoch 7 - iter 99/992 - loss 0.02441708 - time (sec): 5.95 - samples/sec: 2670.95 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:12:58,124 epoch 7 - iter 198/992 - loss 0.02229387 - time (sec): 12.10 - samples/sec: 2687.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:13:04,348 epoch 7 - iter 297/992 - loss 0.02315922 - time (sec): 18.32 - samples/sec: 2667.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:13:10,631 epoch 7 - iter 396/992 - loss 0.02270701 - time (sec): 24.61 - samples/sec: 2698.30 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:13:17,126 epoch 7 - iter 495/992 - loss 0.02221474 - time (sec): 31.10 - samples/sec: 2711.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:13:22,985 epoch 7 - iter 594/992 - loss 0.02215952 - time (sec): 36.96 - samples/sec: 2718.05 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:13:28,764 epoch 7 - iter 693/992 - loss 0.02222955 - time (sec): 42.74 - samples/sec: 2718.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:13:34,713 epoch 7 - iter 792/992 - loss 0.02198392 - time (sec): 48.69 - samples/sec: 2715.09 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:13:40,679 epoch 7 - iter 891/992 - loss 0.02263738 - time (sec): 54.66 - samples/sec: 2711.54 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:13:46,476 epoch 7 - iter 990/992 - loss 0.02209691 - time (sec): 60.45 - samples/sec: 2706.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:13:46,588 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:46,588 EPOCH 7 done: loss 0.0223 - lr: 0.000017
2023-10-17 15:13:50,192 DEV : loss 0.20075371861457825 - f1-score (micro avg) 0.7563
2023-10-17 15:13:50,226 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:56,194 epoch 8 - iter 99/992 - loss 0.01434494 - time (sec): 5.97 - samples/sec: 2726.76 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:14:01,983 epoch 8 - iter 198/992 - loss 0.01221880 - time (sec): 11.75 - samples/sec: 2747.78 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:14:08,169 epoch 8 - iter 297/992 - loss 0.01358537 - time (sec): 17.94 - samples/sec: 2718.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:14:14,284 epoch 8 - iter 396/992 - loss 0.01240926 - time (sec): 24.06 - samples/sec: 2714.16 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:14:20,536 epoch 8 - iter 495/992 - loss 0.01395736 - time (sec): 30.31 - samples/sec: 2686.43 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:14:26,357 epoch 8 - iter 594/992 - loss 0.01412399 - time (sec): 36.13 - samples/sec: 2688.93 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:14:32,373 epoch 8 - iter 693/992 - loss 0.01400427 - time (sec): 42.15 - samples/sec: 2696.74 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:14:38,421 epoch 8 - iter 792/992 - loss 0.01472146 - time (sec): 48.19 - samples/sec: 2706.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:14:44,134 epoch 8 - iter 891/992 - loss 0.01436421 - time (sec): 53.91 - samples/sec: 2713.39 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:14:50,402 epoch 8 - iter 990/992 - loss 0.01442694 - time (sec): 60.17 - samples/sec: 2719.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:14:50,524 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:50,525 EPOCH 8 done: loss 0.0146 - lr: 0.000011
2023-10-17 15:14:54,214 DEV : loss 0.22761167585849762 - f1-score (micro avg) 0.7606
2023-10-17 15:14:54,240 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:00,161 epoch 9 - iter 99/992 - loss 0.00799092 - time (sec): 5.92 - samples/sec: 2843.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:15:06,079 epoch 9 - iter 198/992 - loss 0.00828030 - time (sec): 11.84 - samples/sec: 2852.74 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:15:12,439 epoch 9 - iter 297/992 - loss 0.00916347 - time (sec): 18.20 - samples/sec: 2757.11 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:15:18,232 epoch 9 - iter 396/992 - loss 0.01105875 - time (sec): 23.99 - samples/sec: 2764.75 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:15:24,475 epoch 9 - iter 495/992 - loss 0.01018621 - time (sec): 30.23 - samples/sec: 2754.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:15:30,268 epoch 9 - iter 594/992 - loss 0.01068382 - time (sec): 36.03 - samples/sec: 2751.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:15:36,445 epoch 9 - iter 693/992 - loss 0.01008544 - time (sec): 42.20 - samples/sec: 2739.54 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:15:42,355 epoch 9 - iter 792/992 - loss 0.00998382 - time (sec): 48.11 - samples/sec: 2739.36 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:15:48,413 epoch 9 - iter 891/992 - loss 0.00988621 - time (sec): 54.17 - samples/sec: 2728.23 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:15:54,351 epoch 9 - iter 990/992 - loss 0.00985186 - time (sec): 60.11 - samples/sec: 2723.30 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:15:54,459 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:54,459 EPOCH 9 done: loss 0.0098 - lr: 0.000006
2023-10-17 15:15:58,980 DEV : loss 0.24459530413150787 - f1-score (micro avg) 0.7596
2023-10-17 15:15:59,006 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:05,391 epoch 10 - iter 99/992 - loss 0.00479346 - time (sec): 6.38 - samples/sec: 2649.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:16:11,650 epoch 10 - iter 198/992 - loss 0.00564539 - time (sec): 12.64 - samples/sec: 2639.92 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:16:17,940 epoch 10 - iter 297/992 - loss 0.00670906 - time (sec): 18.93 - samples/sec: 2650.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:16:24,031 epoch 10 - iter 396/992 - loss 0.00703610 - time (sec): 25.02 - samples/sec: 2619.19 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:16:30,132 epoch 10 - iter 495/992 - loss 0.00702067 - time (sec): 31.12 - samples/sec: 2652.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:16:36,083 epoch 10 - iter 594/992 - loss 0.00679611 - time (sec): 37.07 - samples/sec: 2651.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:16:41,761 epoch 10 - iter 693/992 - loss 0.00690832 - time (sec): 42.75 - samples/sec: 2680.33 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:16:47,817 epoch 10 - iter 792/992 - loss 0.00687466 - time (sec): 48.81 - samples/sec: 2673.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:16:53,876 epoch 10 - iter 891/992 - loss 0.00684382 - time (sec): 54.87 - samples/sec: 2661.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:17:00,571 epoch 10 - iter 990/992 - loss 0.00637386 - time (sec): 61.56 - samples/sec: 2659.92 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:17:00,687 ----------------------------------------------------------------------------------------------------
2023-10-17 15:17:00,687 EPOCH 10 done: loss 0.0064 - lr: 0.000000
2023-10-17 15:17:04,344 DEV : loss 0.24962207674980164 - f1-score (micro avg) 0.7634
2023-10-17 15:17:04,822 ----------------------------------------------------------------------------------------------------
2023-10-17 15:17:04,823 Loading model from best epoch ...
2023-10-17 15:17:06,444 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
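
A hedged sketch of how the saved checkpoint with this BIOES tag set is typically used for inference; the French example sentence is invented for illustration.

# Hedged sketch (Python / Flair): loading best-model.pt and tagging a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("M. Dupont est arrivé hier à Paris .")  # invented example
tagger.predict(sentence)
for span in sentence.get_spans("ner"):   # PER / LOC / ORG spans decoded from the BIOES tags
    print(span)
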
2023-10-17 15:17:10,062
Results:
- F-score (micro) 0.7653
- F-score (macro) 0.665
- Accuracy 0.6489
By class:
              precision    recall  f1-score   support

         LOC     0.7975    0.8779    0.8358       655
         PER     0.6768    0.7982    0.7325       223
         ORG     0.4554    0.4016    0.4268       127

   micro avg     0.7336    0.8000    0.7653      1005
   macro avg     0.6432    0.6925    0.6650      1005
weighted avg     0.7275    0.8000    0.7612      1005
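
The three averages in the table follow directly from the per-class rows: micro-averaged F1 is computed from the pooled precision/recall, macro-averaged F1 is the unweighted mean of the per-class F1 scores, and the weighted average weights each class by its support. A small arithmetic check using only the tabulated numbers:

# Arithmetic check of the averages above (values re-derived from the table only).
per_class = {  # class: (precision, recall, f1, support)
    "LOC": (0.7975, 0.8779, 0.8358, 655),
    "PER": (0.6768, 0.7982, 0.7325, 223),
    "ORG": (0.4554, 0.4016, 0.4268, 127),
}

# micro avg F1 from the pooled precision/recall row (prints 0.7654 here vs. 0.7653 in
# the table, because the tabulated p/r are already rounded)
p, r = 0.7336, 0.8000
print(f"micro f1    ~ {2 * p * r / (p + r):.4f}")

# macro avg F1: unweighted mean of per-class F1 -> ~0.6650
f1s = [f1 for _, _, f1, _ in per_class.values()]
print(f"macro f1    ~ {sum(f1s) / len(f1s):.4f}")

# weighted avg F1: per-class F1 weighted by support -> ~0.7612
total = sum(s for *_, s in per_class.values())
print(f"weighted f1 ~ {sum(f1 * s for _, _, f1, s in per_class.values()) / total:.4f}")
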
2023-10-17 15:17:10,062 ----------------------------------------------------------------------------------------------------