|
2023-10-06 15:59:08,898 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,899 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): T5LayerNorm() |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): T5LayerNorm() |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): T5LayerNorm() |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): T5LayerNorm() |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): T5LayerNorm() |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=25, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-06 15:59:08,899 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,899 MultiCorpus: 1214 train + 266 dev + 251 test sentences |
|
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator |
|
2023-10-06 15:59:08,899 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,899 Train: 1214 sentences |
|
2023-10-06 15:59:08,899 (train_with_dev=False, train_with_test=False) |
|
2023-10-06 15:59:08,899 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,900 Training Params: |
|
2023-10-06 15:59:08,900 - learning_rate: "0.00015" |
|
2023-10-06 15:59:08,900 - mini_batch_size: "8" |
|
2023-10-06 15:59:08,900 - max_epochs: "10" |
|
2023-10-06 15:59:08,900 - shuffle: "True" |
|
2023-10-06 15:59:08,900 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,900 Plugins: |
|
2023-10-06 15:59:08,900 - TensorboardLogger |
|
2023-10-06 15:59:08,900 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-06 15:59:08,900 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,900 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-06 15:59:08,900 - metric: "('micro avg', 'f1-score')" |
|
2023-10-06 15:59:08,900 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,900 Computation: |
|
2023-10-06 15:59:08,900 - compute on device: cuda:0 |
|
2023-10-06 15:59:08,900 - embedding storage: none |
|
2023-10-06 15:59:08,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,901 Model training base path: "hmbench-ajmc/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-06 15:59:08,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 15:59:08,901 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-06 15:59:19,279 epoch 1 - iter 15/152 - loss 3.23553399 - time (sec): 10.38 - samples/sec: 308.10 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-06 15:59:29,286 epoch 1 - iter 30/152 - loss 3.22939813 - time (sec): 20.38 - samples/sec: 302.35 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-06 15:59:39,222 epoch 1 - iter 45/152 - loss 3.21751683 - time (sec): 30.32 - samples/sec: 295.52 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-06 15:59:49,336 epoch 1 - iter 60/152 - loss 3.19471768 - time (sec): 40.43 - samples/sec: 295.15 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-06 15:59:59,415 epoch 1 - iter 75/152 - loss 3.14939917 - time (sec): 50.51 - samples/sec: 294.80 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-06 16:00:10,582 epoch 1 - iter 90/152 - loss 3.07470578 - time (sec): 61.68 - samples/sec: 296.90 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-06 16:00:19,963 epoch 1 - iter 105/152 - loss 3.00882280 - time (sec): 71.06 - samples/sec: 293.71 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-06 16:00:30,508 epoch 1 - iter 120/152 - loss 2.91433039 - time (sec): 81.61 - samples/sec: 294.71 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-06 16:00:41,106 epoch 1 - iter 135/152 - loss 2.81522670 - time (sec): 92.20 - samples/sec: 295.26 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-06 16:00:52,076 epoch 1 - iter 150/152 - loss 2.71049372 - time (sec): 103.17 - samples/sec: 296.29 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-06 16:00:53,455 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:00:53,455 EPOCH 1 done: loss 2.6965 - lr: 0.000147 |
|
2023-10-06 16:01:00,771 DEV : loss 1.6262575387954712 - f1-score (micro avg) 0.0 |
|
2023-10-06 16:01:00,779 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:01:11,250 epoch 2 - iter 15/152 - loss 1.54970409 - time (sec): 10.47 - samples/sec: 292.66 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-06 16:01:21,614 epoch 2 - iter 30/152 - loss 1.43184541 - time (sec): 20.83 - samples/sec: 292.65 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-06 16:01:32,678 epoch 2 - iter 45/152 - loss 1.33590739 - time (sec): 31.90 - samples/sec: 292.03 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-06 16:01:43,716 epoch 2 - iter 60/152 - loss 1.22960476 - time (sec): 42.93 - samples/sec: 291.58 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-06 16:01:53,250 epoch 2 - iter 75/152 - loss 1.15504462 - time (sec): 52.47 - samples/sec: 287.33 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-06 16:02:03,759 epoch 2 - iter 90/152 - loss 1.07433988 - time (sec): 62.98 - samples/sec: 287.51 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-06 16:02:15,118 epoch 2 - iter 105/152 - loss 1.03695154 - time (sec): 74.34 - samples/sec: 284.18 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-06 16:02:26,195 epoch 2 - iter 120/152 - loss 0.99220094 - time (sec): 85.41 - samples/sec: 284.13 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-06 16:02:37,420 epoch 2 - iter 135/152 - loss 0.94715002 - time (sec): 96.64 - samples/sec: 284.38 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-06 16:02:48,400 epoch 2 - iter 150/152 - loss 0.90263114 - time (sec): 107.62 - samples/sec: 284.49 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-06 16:02:49,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:02:49,774 EPOCH 2 done: loss 0.9008 - lr: 0.000134 |
|
2023-10-06 16:02:57,798 DEV : loss 0.5528021454811096 - f1-score (micro avg) 0.0 |
|
2023-10-06 16:02:57,805 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:03:08,641 epoch 3 - iter 15/152 - loss 0.57289098 - time (sec): 10.83 - samples/sec: 266.65 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-06 16:03:19,591 epoch 3 - iter 30/152 - loss 0.49562157 - time (sec): 21.78 - samples/sec: 269.00 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-06 16:03:29,996 epoch 3 - iter 45/152 - loss 0.45703984 - time (sec): 32.19 - samples/sec: 267.32 - lr: 0.000129 - momentum: 0.000000 |
|
2023-10-06 16:03:41,830 epoch 3 - iter 60/152 - loss 0.44664831 - time (sec): 44.02 - samples/sec: 271.95 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-06 16:03:52,649 epoch 3 - iter 75/152 - loss 0.42891430 - time (sec): 54.84 - samples/sec: 271.54 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-06 16:04:03,823 epoch 3 - iter 90/152 - loss 0.42302343 - time (sec): 66.02 - samples/sec: 271.99 - lr: 0.000124 - momentum: 0.000000 |
|
2023-10-06 16:04:15,076 epoch 3 - iter 105/152 - loss 0.41046195 - time (sec): 77.27 - samples/sec: 273.02 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-06 16:04:26,291 epoch 3 - iter 120/152 - loss 0.39898527 - time (sec): 88.48 - samples/sec: 274.32 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-06 16:04:37,183 epoch 3 - iter 135/152 - loss 0.38534366 - time (sec): 99.38 - samples/sec: 274.77 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-06 16:04:48,644 epoch 3 - iter 150/152 - loss 0.37413664 - time (sec): 110.84 - samples/sec: 275.55 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-06 16:04:50,208 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:04:50,208 EPOCH 3 done: loss 0.3737 - lr: 0.000117 |
|
2023-10-06 16:04:58,144 DEV : loss 0.3319585919380188 - f1-score (micro avg) 0.507 |
|
2023-10-06 16:04:58,151 saving best model |
|
2023-10-06 16:04:59,194 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:05:10,813 epoch 4 - iter 15/152 - loss 0.29376549 - time (sec): 11.62 - samples/sec: 272.54 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-06 16:05:21,224 epoch 4 - iter 30/152 - loss 0.27340139 - time (sec): 22.03 - samples/sec: 268.74 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-06 16:05:32,266 epoch 4 - iter 45/152 - loss 0.26581077 - time (sec): 33.07 - samples/sec: 270.88 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-06 16:05:43,392 epoch 4 - iter 60/152 - loss 0.25503646 - time (sec): 44.20 - samples/sec: 271.95 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-06 16:05:54,574 epoch 4 - iter 75/152 - loss 0.25391004 - time (sec): 55.38 - samples/sec: 272.89 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-06 16:06:05,579 epoch 4 - iter 90/152 - loss 0.24599003 - time (sec): 66.38 - samples/sec: 273.65 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-06 16:06:16,309 epoch 4 - iter 105/152 - loss 0.24005813 - time (sec): 77.11 - samples/sec: 273.35 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-06 16:06:28,262 epoch 4 - iter 120/152 - loss 0.23682524 - time (sec): 89.07 - samples/sec: 276.48 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-06 16:06:39,989 epoch 4 - iter 135/152 - loss 0.22950166 - time (sec): 100.79 - samples/sec: 276.87 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-06 16:06:50,184 epoch 4 - iter 150/152 - loss 0.22413553 - time (sec): 110.99 - samples/sec: 276.17 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-06 16:06:51,440 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:06:51,440 EPOCH 4 done: loss 0.2235 - lr: 0.000101 |
|
2023-10-06 16:06:59,149 DEV : loss 0.22811780869960785 - f1-score (micro avg) 0.6674 |
|
2023-10-06 16:06:59,155 saving best model |
|
2023-10-06 16:07:04,230 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:07:15,407 epoch 5 - iter 15/152 - loss 0.13651398 - time (sec): 11.18 - samples/sec: 275.15 - lr: 0.000099 - momentum: 0.000000 |
|
2023-10-06 16:07:26,193 epoch 5 - iter 30/152 - loss 0.16481221 - time (sec): 21.96 - samples/sec: 276.25 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-06 16:07:37,585 epoch 5 - iter 45/152 - loss 0.17014447 - time (sec): 33.35 - samples/sec: 277.42 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-06 16:07:48,939 epoch 5 - iter 60/152 - loss 0.16586242 - time (sec): 44.71 - samples/sec: 277.72 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-06 16:08:00,314 epoch 5 - iter 75/152 - loss 0.16777109 - time (sec): 56.08 - samples/sec: 278.40 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-06 16:08:10,997 epoch 5 - iter 90/152 - loss 0.15877418 - time (sec): 66.77 - samples/sec: 277.78 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-06 16:08:21,997 epoch 5 - iter 105/152 - loss 0.15242023 - time (sec): 77.77 - samples/sec: 279.60 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-06 16:08:32,507 epoch 5 - iter 120/152 - loss 0.15341168 - time (sec): 88.28 - samples/sec: 282.14 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-06 16:08:42,899 epoch 5 - iter 135/152 - loss 0.15092835 - time (sec): 98.67 - samples/sec: 282.27 - lr: 0.000086 - momentum: 0.000000 |
|
2023-10-06 16:08:52,951 epoch 5 - iter 150/152 - loss 0.15021431 - time (sec): 108.72 - samples/sec: 282.95 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-06 16:08:53,974 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:08:53,975 EPOCH 5 done: loss 0.1494 - lr: 0.000084 |
|
2023-10-06 16:09:01,309 DEV : loss 0.17536617815494537 - f1-score (micro avg) 0.6989 |
|
2023-10-06 16:09:01,317 saving best model |
|
2023-10-06 16:09:05,624 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:09:15,972 epoch 6 - iter 15/152 - loss 0.14040234 - time (sec): 10.35 - samples/sec: 307.74 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-06 16:09:26,692 epoch 6 - iter 30/152 - loss 0.12875256 - time (sec): 21.07 - samples/sec: 305.60 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-06 16:09:37,492 epoch 6 - iter 45/152 - loss 0.12395352 - time (sec): 31.87 - samples/sec: 305.49 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-06 16:09:48,334 epoch 6 - iter 60/152 - loss 0.11736335 - time (sec): 42.71 - samples/sec: 300.74 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-06 16:09:58,601 epoch 6 - iter 75/152 - loss 0.11162072 - time (sec): 52.98 - samples/sec: 300.53 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-06 16:10:08,041 epoch 6 - iter 90/152 - loss 0.10797765 - time (sec): 62.42 - samples/sec: 297.76 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-06 16:10:18,903 epoch 6 - iter 105/152 - loss 0.10655617 - time (sec): 73.28 - samples/sec: 297.96 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-06 16:10:28,818 epoch 6 - iter 120/152 - loss 0.10611263 - time (sec): 83.19 - samples/sec: 296.81 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-06 16:10:39,027 epoch 6 - iter 135/152 - loss 0.10372822 - time (sec): 93.40 - samples/sec: 295.53 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-06 16:10:49,505 epoch 6 - iter 150/152 - loss 0.10463535 - time (sec): 103.88 - samples/sec: 295.31 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-06 16:10:50,622 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:10:50,623 EPOCH 6 done: loss 0.1058 - lr: 0.000067 |
|
2023-10-06 16:10:57,871 DEV : loss 0.15567055344581604 - f1-score (micro avg) 0.8195 |
|
2023-10-06 16:10:57,879 saving best model |
|
2023-10-06 16:11:02,242 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:11:12,218 epoch 7 - iter 15/152 - loss 0.07733067 - time (sec): 9.97 - samples/sec: 287.52 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-06 16:11:22,182 epoch 7 - iter 30/152 - loss 0.08154846 - time (sec): 19.94 - samples/sec: 288.58 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-06 16:11:32,564 epoch 7 - iter 45/152 - loss 0.07207774 - time (sec): 30.32 - samples/sec: 288.91 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-06 16:11:43,021 epoch 7 - iter 60/152 - loss 0.06814567 - time (sec): 40.78 - samples/sec: 290.09 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-06 16:11:53,448 epoch 7 - iter 75/152 - loss 0.06892973 - time (sec): 51.20 - samples/sec: 289.35 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-06 16:12:04,408 epoch 7 - iter 90/152 - loss 0.07718801 - time (sec): 62.17 - samples/sec: 291.18 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-06 16:12:15,055 epoch 7 - iter 105/152 - loss 0.07474215 - time (sec): 72.81 - samples/sec: 291.20 - lr: 0.000056 - momentum: 0.000000 |
|
2023-10-06 16:12:25,644 epoch 7 - iter 120/152 - loss 0.07507597 - time (sec): 83.40 - samples/sec: 289.76 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-06 16:12:36,817 epoch 7 - iter 135/152 - loss 0.07591231 - time (sec): 94.57 - samples/sec: 290.83 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-06 16:12:47,737 epoch 7 - iter 150/152 - loss 0.08084531 - time (sec): 105.49 - samples/sec: 290.18 - lr: 0.000051 - momentum: 0.000000 |
|
2023-10-06 16:12:49,027 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:12:49,028 EPOCH 7 done: loss 0.0802 - lr: 0.000051 |
|
2023-10-06 16:12:56,660 DEV : loss 0.14975833892822266 - f1-score (micro avg) 0.8255 |
|
2023-10-06 16:12:56,667 saving best model |
|
2023-10-06 16:13:01,000 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:13:11,751 epoch 8 - iter 15/152 - loss 0.08031097 - time (sec): 10.75 - samples/sec: 281.79 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-06 16:13:22,627 epoch 8 - iter 30/152 - loss 0.08197433 - time (sec): 21.62 - samples/sec: 276.26 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-06 16:13:32,872 epoch 8 - iter 45/152 - loss 0.07403169 - time (sec): 31.87 - samples/sec: 272.89 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-06 16:13:43,876 epoch 8 - iter 60/152 - loss 0.07606961 - time (sec): 42.87 - samples/sec: 274.59 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-06 16:13:55,223 epoch 8 - iter 75/152 - loss 0.07054570 - time (sec): 54.22 - samples/sec: 276.43 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-06 16:14:06,219 epoch 8 - iter 90/152 - loss 0.06846871 - time (sec): 65.22 - samples/sec: 275.27 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-06 16:14:17,729 epoch 8 - iter 105/152 - loss 0.06593960 - time (sec): 76.73 - samples/sec: 276.19 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-06 16:14:29,435 epoch 8 - iter 120/152 - loss 0.06549689 - time (sec): 88.43 - samples/sec: 278.16 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-06 16:14:40,711 epoch 8 - iter 135/152 - loss 0.06601897 - time (sec): 99.71 - samples/sec: 278.56 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-06 16:14:51,103 epoch 8 - iter 150/152 - loss 0.06589859 - time (sec): 110.10 - samples/sec: 277.69 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-06 16:14:52,590 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:14:52,590 EPOCH 8 done: loss 0.0657 - lr: 0.000034 |
|
2023-10-06 16:15:00,397 DEV : loss 0.14192526042461395 - f1-score (micro avg) 0.8298 |
|
2023-10-06 16:15:00,404 saving best model |
|
2023-10-06 16:15:04,738 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:15:16,420 epoch 9 - iter 15/152 - loss 0.08055117 - time (sec): 11.68 - samples/sec: 289.30 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-06 16:15:27,670 epoch 9 - iter 30/152 - loss 0.07059158 - time (sec): 22.93 - samples/sec: 286.22 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-06 16:15:38,460 epoch 9 - iter 45/152 - loss 0.06701449 - time (sec): 33.72 - samples/sec: 282.38 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-06 16:15:49,085 epoch 9 - iter 60/152 - loss 0.06306260 - time (sec): 44.35 - samples/sec: 279.62 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-06 16:16:00,117 epoch 9 - iter 75/152 - loss 0.05895212 - time (sec): 55.38 - samples/sec: 276.85 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-06 16:16:11,413 epoch 9 - iter 90/152 - loss 0.05798036 - time (sec): 66.67 - samples/sec: 276.98 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-06 16:16:22,971 epoch 9 - iter 105/152 - loss 0.05724903 - time (sec): 78.23 - samples/sec: 277.15 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-06 16:16:33,774 epoch 9 - iter 120/152 - loss 0.05970107 - time (sec): 89.03 - samples/sec: 276.13 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-06 16:16:44,573 epoch 9 - iter 135/152 - loss 0.05612059 - time (sec): 99.83 - samples/sec: 276.36 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-06 16:16:55,558 epoch 9 - iter 150/152 - loss 0.05649651 - time (sec): 110.82 - samples/sec: 276.24 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-06 16:16:56,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:16:56,798 EPOCH 9 done: loss 0.0569 - lr: 0.000018 |
|
2023-10-06 16:17:04,762 DEV : loss 0.14784981310367584 - f1-score (micro avg) 0.826 |
|
2023-10-06 16:17:04,771 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:17:15,545 epoch 10 - iter 15/152 - loss 0.06202699 - time (sec): 10.77 - samples/sec: 266.79 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-06 16:17:27,709 epoch 10 - iter 30/152 - loss 0.05314995 - time (sec): 22.94 - samples/sec: 277.98 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-06 16:17:38,589 epoch 10 - iter 45/152 - loss 0.05329503 - time (sec): 33.82 - samples/sec: 276.99 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-06 16:17:49,528 epoch 10 - iter 60/152 - loss 0.05040656 - time (sec): 44.76 - samples/sec: 276.12 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-06 16:18:00,129 epoch 10 - iter 75/152 - loss 0.05349118 - time (sec): 55.36 - samples/sec: 273.26 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-06 16:18:11,286 epoch 10 - iter 90/152 - loss 0.05167885 - time (sec): 66.51 - samples/sec: 274.66 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-06 16:18:22,040 epoch 10 - iter 105/152 - loss 0.04974350 - time (sec): 77.27 - samples/sec: 272.88 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-06 16:18:33,569 epoch 10 - iter 120/152 - loss 0.05016859 - time (sec): 88.80 - samples/sec: 273.80 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-06 16:18:44,895 epoch 10 - iter 135/152 - loss 0.04954922 - time (sec): 100.12 - samples/sec: 274.62 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-06 16:18:56,220 epoch 10 - iter 150/152 - loss 0.05098316 - time (sec): 111.45 - samples/sec: 275.71 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-06 16:18:57,336 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:18:57,336 EPOCH 10 done: loss 0.0514 - lr: 0.000001 |
|
2023-10-06 16:19:05,303 DEV : loss 0.14607474207878113 - f1-score (micro avg) 0.8323 |
|
2023-10-06 16:19:05,311 saving best model |
|
2023-10-06 16:19:10,518 ---------------------------------------------------------------------------------------------------- |
|
2023-10-06 16:19:10,519 Loading model from best epoch ... |
|
2023-10-06 16:19:13,086 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object |
|
2023-10-06 16:19:20,434 |
|
Results: |
|
- F-score (micro) 0.794 |
|
- F-score (macro) 0.4833 |
|
- Accuracy 0.6659 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
       scope     0.7500    0.7947    0.7717       151 |
|
        work     0.6949    0.8632    0.7700        95 |
|
        pers     0.8125    0.9479    0.8750        96 |
|
         loc     0.0000    0.0000    0.0000         3 |
|
        date     0.0000    0.0000    0.0000         3 |
|
|
|
   micro avg     0.7513    0.8420    0.7940       348 |
|
   macro avg     0.4515    0.5212    0.4833       348 |
|
weighted avg     0.7393    0.8420    0.7864       348 |
|
|
|
2023-10-06 16:19:20,434 ---------------------------------------------------------------------------------------------------- |
|
|