2023-10-08 18:16:32,446 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,447 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-08 18:16:32,447 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,447 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 18:16:32,448 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,448 Train: 966 sentences
2023-10-08 18:16:32,448 (train_with_dev=False, train_with_test=False)
2023-10-08 18:16:32,448 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,448 Training Params:
2023-10-08 18:16:32,448 - learning_rate: "0.00015"
2023-10-08 18:16:32,448 - mini_batch_size: "4"
2023-10-08 18:16:32,448 - max_epochs: "10"
2023-10-08 18:16:32,448 - shuffle: "True"
2023-10-08 18:16:32,448 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,448 Plugins:
2023-10-08 18:16:32,448 - TensorboardLogger
2023-10-08 18:16:32,448 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 18:16:32,448 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,448 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 18:16:32,448 - metric: "('micro avg', 'f1-score')"
2023-10-08 18:16:32,448 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,448 Computation:
2023-10-08 18:16:32,449 - compute on device: cuda:0
2023-10-08 18:16:32,449 - embedding storage: none
2023-10-08 18:16:32,449 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,449 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
2023-10-08 18:16:32,449 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,449 ----------------------------------------------------------------------------------------------------
2023-10-08 18:16:32,449 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 18:16:42,184 epoch 1 - iter 24/242 - loss 3.23269142 - time (sec): 9.73 - samples/sec: 224.89 - lr: 0.000014 - momentum: 0.000000
2023-10-08 18:16:51,473 epoch 1 - iter 48/242 - loss 3.22249485 - time (sec): 19.02 - samples/sec: 244.18 - lr: 0.000029 - momentum: 0.000000
2023-10-08 18:17:00,441 epoch 1 - iter 72/242 - loss 3.20551448 - time (sec): 27.99 - samples/sec: 247.62 - lr: 0.000044 - momentum: 0.000000
2023-10-08 18:17:09,710 epoch 1 - iter 96/242 - loss 3.15823056 - time (sec): 37.26 - samples/sec: 253.14 - lr: 0.000059 - momentum: 0.000000
2023-10-08 18:17:18,886 epoch 1 - iter 120/242 - loss 3.07644769 - time (sec): 46.44 - samples/sec: 253.57 - lr: 0.000074 - momentum: 0.000000
2023-10-08 18:17:28,777 epoch 1 - iter 144/242 - loss 2.97338064 - time (sec): 56.33 - samples/sec: 255.90 - lr: 0.000089 - momentum: 0.000000
2023-10-08 18:17:37,776 epoch 1 - iter 168/242 - loss 2.87431993 - time (sec): 65.33 - samples/sec: 256.24 - lr: 0.000104 - momentum: 0.000000
2023-10-08 18:17:46,696 epoch 1 - iter 192/242 - loss 2.76814205 - time (sec): 74.25 - samples/sec: 256.36 - lr: 0.000118 - momentum: 0.000000
2023-10-08 18:17:56,559 epoch 1 - iter 216/242 - loss 2.63563561 - time (sec): 84.11 - samples/sec: 258.43 - lr: 0.000133 - momentum: 0.000000
2023-10-08 18:18:06,722 epoch 1 - iter 240/242 - loss 2.49337003 - time (sec): 94.27 - samples/sec: 260.02 - lr: 0.000148 - momentum: 0.000000
2023-10-08 18:18:07,447 ----------------------------------------------------------------------------------------------------
2023-10-08 18:18:07,447 EPOCH 1 done: loss 2.4815 - lr: 0.000148
2023-10-08 18:18:13,344 DEV : loss 1.187881588935852 - f1-score (micro avg) 0.0
2023-10-08 18:18:13,350 ----------------------------------------------------------------------------------------------------
2023-10-08 18:18:22,466 epoch 2 - iter 24/242 - loss 1.16961126 - time (sec): 9.11 - samples/sec: 249.16 - lr: 0.000148 - momentum: 0.000000
2023-10-08 18:18:31,815 epoch 2 - iter 48/242 - loss 1.03643335 - time (sec): 18.46 - samples/sec: 252.66 - lr: 0.000147 - momentum: 0.000000
2023-10-08 18:18:41,103 epoch 2 - iter 72/242 - loss 0.94733732 - time (sec): 27.75 - samples/sec: 256.46 - lr: 0.000145 - momentum: 0.000000
2023-10-08 18:18:50,518 epoch 2 - iter 96/242 - loss 0.87122298 - time (sec): 37.17 - samples/sec: 259.14 - lr: 0.000143 - momentum: 0.000000
2023-10-08 18:18:59,786 epoch 2 - iter 120/242 - loss 0.81457811 - time (sec): 46.43 - samples/sec: 258.47 - lr: 0.000142 - momentum: 0.000000
2023-10-08 18:19:09,536 epoch 2 - iter 144/242 - loss 0.77131873 - time (sec): 56.18 - samples/sec: 261.82 - lr: 0.000140 - momentum: 0.000000
2023-10-08 18:19:19,642 epoch 2 - iter 168/242 - loss 0.72892369 - time (sec): 66.29 - samples/sec: 261.41 - lr: 0.000139 - momentum: 0.000000
2023-10-08 18:19:28,785 epoch 2 - iter 192/242 - loss 0.68941919 - time (sec): 75.43 - samples/sec: 261.16 - lr: 0.000137 - momentum: 0.000000
2023-10-08 18:19:37,777 epoch 2 - iter 216/242 - loss 0.65715679 - time (sec): 84.43 - samples/sec: 260.48 - lr: 0.000135 - momentum: 0.000000
2023-10-08 18:19:47,470 epoch 2 - iter 240/242 - loss 0.62460856 - time (sec): 94.12 - samples/sec: 261.61 - lr: 0.000134 - momentum: 0.000000
2023-10-08 18:19:48,012 ----------------------------------------------------------------------------------------------------
2023-10-08 18:19:48,012 EPOCH 2 done: loss 0.6235 - lr: 0.000134
2023-10-08 18:19:53,806 DEV : loss 0.3897227346897125 - f1-score (micro avg) 0.3731
2023-10-08 18:19:53,813 saving best model
2023-10-08 18:19:54,725 ----------------------------------------------------------------------------------------------------
2023-10-08 18:20:04,366 epoch 3 - iter 24/242 - loss 0.35697873 - time (sec): 9.64 - samples/sec: 268.39 - lr: 0.000132 - momentum: 0.000000
2023-10-08 18:20:13,604 epoch 3 - iter 48/242 - loss 0.37113707 - time (sec): 18.88 - samples/sec: 267.62 - lr: 0.000130 - momentum: 0.000000
2023-10-08 18:20:23,337 epoch 3 - iter 72/242 - loss 0.34367840 - time (sec): 28.61 - samples/sec: 270.64 - lr: 0.000128 - momentum: 0.000000
2023-10-08 18:20:32,526 epoch 3 - iter 96/242 - loss 0.32984894 - time (sec): 37.80 - samples/sec: 266.51 - lr: 0.000127 - momentum: 0.000000
2023-10-08 18:20:41,455 epoch 3 - iter 120/242 - loss 0.31221782 - time (sec): 46.73 - samples/sec: 264.59 - lr: 0.000125 - momentum: 0.000000
2023-10-08 18:20:50,329 epoch 3 - iter 144/242 - loss 0.30435470 - time (sec): 55.60 - samples/sec: 263.57 - lr: 0.000124 - momentum: 0.000000
2023-10-08 18:21:00,276 epoch 3 - iter 168/242 - loss 0.29962563 - time (sec): 65.55 - samples/sec: 263.91 - lr: 0.000122 - momentum: 0.000000
2023-10-08 18:21:09,953 epoch 3 - iter 192/242 - loss 0.29351352 - time (sec): 75.23 - samples/sec: 264.92 - lr: 0.000120 - momentum: 0.000000
2023-10-08 18:21:18,884 epoch 3 - iter 216/242 - loss 0.28484928 - time (sec): 84.16 - samples/sec: 263.64 - lr: 0.000119 - momentum: 0.000000
2023-10-08 18:21:28,210 epoch 3 - iter 240/242 - loss 0.27921957 - time (sec): 93.48 - samples/sec: 263.07 - lr: 0.000117 - momentum: 0.000000
2023-10-08 18:21:28,826 ----------------------------------------------------------------------------------------------------
2023-10-08 18:21:28,826 EPOCH 3 done: loss 0.2786 - lr: 0.000117
2023-10-08 18:21:34,635 DEV : loss 0.22860029339790344 - f1-score (micro avg) 0.5446
2023-10-08 18:21:34,641 saving best model
2023-10-08 18:21:39,854 ----------------------------------------------------------------------------------------------------
2023-10-08 18:21:48,411 epoch 4 - iter 24/242 - loss 0.23513209 - time (sec): 8.56 - samples/sec: 256.67 - lr: 0.000115 - momentum: 0.000000
2023-10-08 18:21:58,114 epoch 4 - iter 48/242 - loss 0.22118636 - time (sec): 18.26 - samples/sec: 265.13 - lr: 0.000113 - momentum: 0.000000
2023-10-08 18:22:07,595 epoch 4 - iter 72/242 - loss 0.20467054 - time (sec): 27.74 - samples/sec: 262.11 - lr: 0.000112 - momentum: 0.000000
2023-10-08 18:22:16,768 epoch 4 - iter 96/242 - loss 0.19305601 - time (sec): 36.91 - samples/sec: 262.29 - lr: 0.000110 - momentum: 0.000000
2023-10-08 18:22:25,785 epoch 4 - iter 120/242 - loss 0.19060223 - time (sec): 45.93 - samples/sec: 261.90 - lr: 0.000109 - momentum: 0.000000
2023-10-08 18:22:36,021 epoch 4 - iter 144/242 - loss 0.18708839 - time (sec): 56.17 - samples/sec: 263.75 - lr: 0.000107 - momentum: 0.000000
2023-10-08 18:22:45,696 epoch 4 - iter 168/242 - loss 0.17861271 - time (sec): 65.84 - samples/sec: 264.17 - lr: 0.000105 - momentum: 0.000000
2023-10-08 18:22:54,966 epoch 4 - iter 192/242 - loss 0.17748223 - time (sec): 75.11 - samples/sec: 262.29 - lr: 0.000104 - momentum: 0.000000
2023-10-08 18:23:04,486 epoch 4 - iter 216/242 - loss 0.17548225 - time (sec): 84.63 - samples/sec: 261.77 - lr: 0.000102 - momentum: 0.000000
2023-10-08 18:23:13,862 epoch 4 - iter 240/242 - loss 0.17101350 - time (sec): 94.01 - samples/sec: 261.05 - lr: 0.000100 - momentum: 0.000000
2023-10-08 18:23:14,560 ----------------------------------------------------------------------------------------------------
2023-10-08 18:23:14,561 EPOCH 4 done: loss 0.1710 - lr: 0.000100
2023-10-08 18:23:20,444 DEV : loss 0.15941345691680908 - f1-score (micro avg) 0.8212
2023-10-08 18:23:20,451 saving best model
2023-10-08 18:23:24,842 ----------------------------------------------------------------------------------------------------
2023-10-08 18:23:34,448 epoch 5 - iter 24/242 - loss 0.16486530 - time (sec): 9.61 - samples/sec: 263.61 - lr: 0.000098 - momentum: 0.000000
2023-10-08 18:23:43,910 epoch 5 - iter 48/242 - loss 0.13645891 - time (sec): 19.07 - samples/sec: 258.40 - lr: 0.000097 - momentum: 0.000000
2023-10-08 18:23:53,993 epoch 5 - iter 72/242 - loss 0.13013903 - time (sec): 29.15 - samples/sec: 264.42 - lr: 0.000095 - momentum: 0.000000
2023-10-08 18:24:04,037 epoch 5 - iter 96/242 - loss 0.12377710 - time (sec): 39.19 - samples/sec: 265.32 - lr: 0.000094 - momentum: 0.000000
2023-10-08 18:24:13,765 epoch 5 - iter 120/242 - loss 0.12205066 - time (sec): 48.92 - samples/sec: 262.72 - lr: 0.000092 - momentum: 0.000000
2023-10-08 18:24:22,979 epoch 5 - iter 144/242 - loss 0.12395128 - time (sec): 58.14 - samples/sec: 260.01 - lr: 0.000090 - momentum: 0.000000
2023-10-08 18:24:32,801 epoch 5 - iter 168/242 - loss 0.11820058 - time (sec): 67.96 - samples/sec: 259.84 - lr: 0.000089 - momentum: 0.000000
2023-10-08 18:24:41,768 epoch 5 - iter 192/242 - loss 0.11799339 - time (sec): 76.92 - samples/sec: 257.75 - lr: 0.000087 - momentum: 0.000000
2023-10-08 18:24:51,316 epoch 5 - iter 216/242 - loss 0.11722304 - time (sec): 86.47 - samples/sec: 256.40 - lr: 0.000085 - momentum: 0.000000
2023-10-08 18:25:00,897 epoch 5 - iter 240/242 - loss 0.11333769 - time (sec): 96.05 - samples/sec: 255.45 - lr: 0.000084 - momentum: 0.000000
2023-10-08 18:25:01,673 ----------------------------------------------------------------------------------------------------
2023-10-08 18:25:01,673 EPOCH 5 done: loss 0.1129 - lr: 0.000084
2023-10-08 18:25:07,915 DEV : loss 0.1479427069425583 - f1-score (micro avg) 0.8097
2023-10-08 18:25:07,921 ----------------------------------------------------------------------------------------------------
2023-10-08 18:25:17,380 epoch 6 - iter 24/242 - loss 0.09476938 - time (sec): 9.46 - samples/sec: 237.69 - lr: 0.000082 - momentum: 0.000000
2023-10-08 18:25:27,454 epoch 6 - iter 48/242 - loss 0.08965422 - time (sec): 19.53 - samples/sec: 245.46 - lr: 0.000080 - momentum: 0.000000
2023-10-08 18:25:37,034 epoch 6 - iter 72/242 - loss 0.08409169 - time (sec): 29.11 - samples/sec: 243.79 - lr: 0.000079 - momentum: 0.000000
2023-10-08 18:25:47,244 epoch 6 - iter 96/242 - loss 0.08653234 - time (sec): 39.32 - samples/sec: 244.98 - lr: 0.000077 - momentum: 0.000000
2023-10-08 18:25:57,462 epoch 6 - iter 120/242 - loss 0.08433397 - time (sec): 49.54 - samples/sec: 245.74 - lr: 0.000075 - momentum: 0.000000
2023-10-08 18:26:08,047 epoch 6 - iter 144/242 - loss 0.08364166 - time (sec): 60.12 - samples/sec: 247.54 - lr: 0.000074 - momentum: 0.000000
2023-10-08 18:26:18,163 epoch 6 - iter 168/242 - loss 0.08837454 - time (sec): 70.24 - samples/sec: 248.70 - lr: 0.000072 - momentum: 0.000000
2023-10-08 18:26:27,960 epoch 6 - iter 192/242 - loss 0.08797324 - time (sec): 80.04 - samples/sec: 246.32 - lr: 0.000070 - momentum: 0.000000
2023-10-08 18:26:37,514 epoch 6 - iter 216/242 - loss 0.08549075 - time (sec): 89.59 - samples/sec: 245.10 - lr: 0.000069 - momentum: 0.000000
2023-10-08 18:26:48,162 epoch 6 - iter 240/242 - loss 0.08456891 - time (sec): 100.24 - samples/sec: 244.47 - lr: 0.000067 - momentum: 0.000000
2023-10-08 18:26:49,071 ----------------------------------------------------------------------------------------------------
2023-10-08 18:26:49,072 EPOCH 6 done: loss 0.0846 - lr: 0.000067
2023-10-08 18:26:55,670 DEV : loss 0.13671354949474335 - f1-score (micro avg) 0.824
2023-10-08 18:26:55,676 saving best model
2023-10-08 18:27:00,080 ----------------------------------------------------------------------------------------------------
2023-10-08 18:27:08,919 epoch 7 - iter 24/242 - loss 0.04626953 - time (sec): 8.84 - samples/sec: 221.55 - lr: 0.000065 - momentum: 0.000000
2023-10-08 18:27:19,128 epoch 7 - iter 48/242 - loss 0.06148305 - time (sec): 19.05 - samples/sec: 239.21 - lr: 0.000064 - momentum: 0.000000
2023-10-08 18:27:29,148 epoch 7 - iter 72/242 - loss 0.06595940 - time (sec): 29.07 - samples/sec: 240.18 - lr: 0.000062 - momentum: 0.000000
2023-10-08 18:27:39,577 epoch 7 - iter 96/242 - loss 0.06887665 - time (sec): 39.50 - samples/sec: 242.89 - lr: 0.000060 - momentum: 0.000000
2023-10-08 18:27:49,708 epoch 7 - iter 120/242 - loss 0.06674868 - time (sec): 49.63 - samples/sec: 243.18 - lr: 0.000059 - momentum: 0.000000
2023-10-08 18:27:58,945 epoch 7 - iter 144/242 - loss 0.06165700 - time (sec): 58.86 - samples/sec: 240.61 - lr: 0.000057 - momentum: 0.000000
2023-10-08 18:28:09,127 epoch 7 - iter 168/242 - loss 0.06476297 - time (sec): 69.05 - samples/sec: 240.87 - lr: 0.000055 - momentum: 0.000000
2023-10-08 18:28:19,497 epoch 7 - iter 192/242 - loss 0.06659121 - time (sec): 79.41 - samples/sec: 241.57 - lr: 0.000054 - momentum: 0.000000
2023-10-08 18:28:30,481 epoch 7 - iter 216/242 - loss 0.06620009 - time (sec): 90.40 - samples/sec: 242.38 - lr: 0.000052 - momentum: 0.000000
2023-10-08 18:28:41,271 epoch 7 - iter 240/242 - loss 0.06433525 - time (sec): 101.19 - samples/sec: 242.96 - lr: 0.000050 - momentum: 0.000000
2023-10-08 18:28:41,908 ----------------------------------------------------------------------------------------------------
2023-10-08 18:28:41,908 EPOCH 7 done: loss 0.0641 - lr: 0.000050
2023-10-08 18:28:48,483 DEV : loss 0.1378345787525177 - f1-score (micro avg) 0.8149
2023-10-08 18:28:48,489 ----------------------------------------------------------------------------------------------------
2023-10-08 18:28:57,955 epoch 8 - iter 24/242 - loss 0.05161041 - time (sec): 9.46 - samples/sec: 222.09 - lr: 0.000049 - momentum: 0.000000
2023-10-08 18:29:08,141 epoch 8 - iter 48/242 - loss 0.05952830 - time (sec): 19.65 - samples/sec: 236.58 - lr: 0.000047 - momentum: 0.000000
2023-10-08 18:29:18,619 epoch 8 - iter 72/242 - loss 0.05676119 - time (sec): 30.13 - samples/sec: 243.02 - lr: 0.000045 - momentum: 0.000000
2023-10-08 18:29:28,729 epoch 8 - iter 96/242 - loss 0.04846381 - time (sec): 40.24 - samples/sec: 243.03 - lr: 0.000044 - momentum: 0.000000
2023-10-08 18:29:39,172 epoch 8 - iter 120/242 - loss 0.04918867 - time (sec): 50.68 - samples/sec: 243.12 - lr: 0.000042 - momentum: 0.000000
2023-10-08 18:29:49,395 epoch 8 - iter 144/242 - loss 0.04964492 - time (sec): 60.91 - samples/sec: 244.08 - lr: 0.000040 - momentum: 0.000000
2023-10-08 18:29:59,859 epoch 8 - iter 168/242 - loss 0.05311315 - time (sec): 71.37 - samples/sec: 245.18 - lr: 0.000039 - momentum: 0.000000
2023-10-08 18:30:10,010 epoch 8 - iter 192/242 - loss 0.05009814 - time (sec): 81.52 - samples/sec: 244.73 - lr: 0.000037 - momentum: 0.000000
2023-10-08 18:30:19,787 epoch 8 - iter 216/242 - loss 0.04810545 - time (sec): 91.30 - samples/sec: 243.95 - lr: 0.000035 - momentum: 0.000000
2023-10-08 18:30:29,559 epoch 8 - iter 240/242 - loss 0.04878436 - time (sec): 101.07 - samples/sec: 243.65 - lr: 0.000034 - momentum: 0.000000
2023-10-08 18:30:30,089 ----------------------------------------------------------------------------------------------------
2023-10-08 18:30:30,089 EPOCH 8 done: loss 0.0487 - lr: 0.000034
2023-10-08 18:30:36,560 DEV : loss 0.14124426245689392 - f1-score (micro avg) 0.8231
2023-10-08 18:30:36,566 ----------------------------------------------------------------------------------------------------
2023-10-08 18:30:46,085 epoch 9 - iter 24/242 - loss 0.05078836 - time (sec): 9.52 - samples/sec: 234.52 - lr: 0.000032 - momentum: 0.000000
2023-10-08 18:30:55,435 epoch 9 - iter 48/242 - loss 0.04517817 - time (sec): 18.87 - samples/sec: 233.10 - lr: 0.000030 - momentum: 0.000000
2023-10-08 18:31:05,694 epoch 9 - iter 72/242 - loss 0.03891168 - time (sec): 29.13 - samples/sec: 237.11 - lr: 0.000029 - momentum: 0.000000
2023-10-08 18:31:15,788 epoch 9 - iter 96/242 - loss 0.04475568 - time (sec): 39.22 - samples/sec: 240.97 - lr: 0.000027 - momentum: 0.000000
2023-10-08 18:31:26,418 epoch 9 - iter 120/242 - loss 0.04568425 - time (sec): 49.85 - samples/sec: 243.13 - lr: 0.000025 - momentum: 0.000000
2023-10-08 18:31:37,134 epoch 9 - iter 144/242 - loss 0.04606648 - time (sec): 60.57 - samples/sec: 242.53 - lr: 0.000024 - momentum: 0.000000
2023-10-08 18:31:47,375 epoch 9 - iter 168/242 - loss 0.04397738 - time (sec): 70.81 - samples/sec: 242.77 - lr: 0.000022 - momentum: 0.000000
2023-10-08 18:31:57,299 epoch 9 - iter 192/242 - loss 0.04245585 - time (sec): 80.73 - samples/sec: 242.51 - lr: 0.000020 - momentum: 0.000000
2023-10-08 18:32:07,349 epoch 9 - iter 216/242 - loss 0.04348688 - time (sec): 90.78 - samples/sec: 242.62 - lr: 0.000019 - momentum: 0.000000
2023-10-08 18:32:17,558 epoch 9 - iter 240/242 - loss 0.04244140 - time (sec): 100.99 - samples/sec: 243.14 - lr: 0.000017 - momentum: 0.000000
2023-10-08 18:32:18,254 ----------------------------------------------------------------------------------------------------
2023-10-08 18:32:18,255 EPOCH 9 done: loss 0.0426 - lr: 0.000017
2023-10-08 18:32:24,676 DEV : loss 0.14517952501773834 - f1-score (micro avg) 0.8109
2023-10-08 18:32:24,682 ----------------------------------------------------------------------------------------------------
2023-10-08 18:32:34,783 epoch 10 - iter 24/242 - loss 0.03127148 - time (sec): 10.10 - samples/sec: 242.77 - lr: 0.000015 - momentum: 0.000000
2023-10-08 18:32:45,274 epoch 10 - iter 48/242 - loss 0.02653376 - time (sec): 20.59 - samples/sec: 246.08 - lr: 0.000014 - momentum: 0.000000
2023-10-08 18:32:55,388 epoch 10 - iter 72/242 - loss 0.02893367 - time (sec): 30.70 - samples/sec: 245.21 - lr: 0.000012 - momentum: 0.000000
2023-10-08 18:33:05,766 epoch 10 - iter 96/242 - loss 0.03364083 - time (sec): 41.08 - samples/sec: 245.84 - lr: 0.000010 - momentum: 0.000000
2023-10-08 18:33:15,066 epoch 10 - iter 120/242 - loss 0.03636518 - time (sec): 50.38 - samples/sec: 242.54 - lr: 0.000009 - momentum: 0.000000
2023-10-08 18:33:24,798 epoch 10 - iter 144/242 - loss 0.03644534 - time (sec): 60.11 - samples/sec: 241.39 - lr: 0.000007 - momentum: 0.000000
2023-10-08 18:33:34,831 epoch 10 - iter 168/242 - loss 0.03646037 - time (sec): 70.15 - samples/sec: 241.73 - lr: 0.000005 - momentum: 0.000000
2023-10-08 18:33:45,003 epoch 10 - iter 192/242 - loss 0.03701845 - time (sec): 80.32 - samples/sec: 242.52 - lr: 0.000004 - momentum: 0.000000
2023-10-08 18:33:55,174 epoch 10 - iter 216/242 - loss 0.03595125 - time (sec): 90.49 - samples/sec: 242.81 - lr: 0.000002 - momentum: 0.000000
2023-10-08 18:34:05,446 epoch 10 - iter 240/242 - loss 0.03766310 - time (sec): 100.76 - samples/sec: 243.91 - lr: 0.000000 - momentum: 0.000000
2023-10-08 18:34:06,228 ----------------------------------------------------------------------------------------------------
2023-10-08 18:34:06,228 EPOCH 10 done: loss 0.0379 - lr: 0.000000
2023-10-08 18:34:12,689 DEV : loss 0.1463211625814438 - f1-score (micro avg) 0.8225
2023-10-08 18:34:13,565 ----------------------------------------------------------------------------------------------------
2023-10-08 18:34:13,566 Loading model from best epoch ...
2023-10-08 18:34:19,025 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 18:34:25,383
Results:
- F-score (micro) 0.779
- F-score (macro) 0.4674
- Accuracy 0.6737

By class:
             precision    recall  f1-score   support

        pers    0.8188    0.8129    0.8159       139
       scope    0.7931    0.8915    0.8394       129
        work    0.6162    0.7625    0.6816        80
         loc    0.0000    0.0000    0.0000         9
        date    0.0000    0.0000    0.0000         3

   micro avg    0.7565    0.8028    0.7790       360
   macro avg    0.4456    0.4934    0.4674       360
weighted avg    0.7373    0.8028    0.7673       360
2023-10-08 18:34:25,383 ----------------------------------------------------------------------------------------------------