2023-10-18 16:45:47,734 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,735 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-18 16:45:47,735 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,735 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-18 16:45:47,735 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,735 Train: 966 sentences
2023-10-18 16:45:47,735 (train_with_dev=False, train_with_test=False)
2023-10-18 16:45:47,735 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,735 Training Params:
2023-10-18 16:45:47,735 - learning_rate: "3e-05"
2023-10-18 16:45:47,735 - mini_batch_size: "8"
2023-10-18 16:45:47,735 - max_epochs: "10"
2023-10-18 16:45:47,735 - shuffle: "True"
2023-10-18 16:45:47,735 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,735 Plugins:
2023-10-18 16:45:47,735 - TensorboardLogger
2023-10-18 16:45:47,735 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:45:47,735 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,735 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:45:47,735 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:45:47,735 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,740 Computation:
2023-10-18 16:45:47,736 - compute on device: cuda:0
2023-10-18 16:45:47,736 - embedding storage: none
2023-10-18 16:45:47,736 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,736 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 16:45:47,736 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,736 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:47,736 Logging anything other than scalars to TensorBoard is currently not supported.
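The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the lr column in the per-iteration lines below: the rate ramps linearly from 0 up to the peak of 3e-05 over the first 10% of all steps (here, exactly epoch 1: 121 of 1210 batches), then decays linearly back to 0 by the end of epoch 10. A minimal sketch of that schedule, written from the logged values (the function name and constants are illustrative, not Flair's actual implementation):

```python
# Illustrative reconstruction of the linear warmup/decay schedule seen in
# this log. Assumptions: peak lr 3e-5, 121 batches/epoch, 10 epochs,
# warmup over the first 10% of steps (warmup_fraction: 0.1).
PEAK_LR = 3e-5
TOTAL_STEPS = 121 * 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # = 121, i.e. all of epoch 1

def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: ramp linearly from the peak down to 0 over the rest of training.
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

This matches the log: step 12 gives ~3e-06, step 121 (end of epoch 1) gives the full 3e-05, and step 1210 (end of epoch 10) gives 0.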
2023-10-18 16:45:47,979 epoch 1 - iter 12/121 - loss 3.74648411 - time (sec): 0.24 - samples/sec: 10419.20 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:45:48,227 epoch 1 - iter 24/121 - loss 3.68370248 - time (sec): 0.49 - samples/sec: 10578.93 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:45:48,496 epoch 1 - iter 36/121 - loss 3.65976179 - time (sec): 0.76 - samples/sec: 10007.70 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:45:48,738 epoch 1 - iter 48/121 - loss 3.60095226 - time (sec): 1.00 - samples/sec: 10140.29 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:45:49,021 epoch 1 - iter 60/121 - loss 3.51597596 - time (sec): 1.28 - samples/sec: 9910.17 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:45:49,303 epoch 1 - iter 72/121 - loss 3.43277478 - time (sec): 1.57 - samples/sec: 9715.59 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:45:49,589 epoch 1 - iter 84/121 - loss 3.30444397 - time (sec): 1.85 - samples/sec: 9586.32 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:45:49,859 epoch 1 - iter 96/121 - loss 3.18042863 - time (sec): 2.12 - samples/sec: 9525.74 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:45:50,127 epoch 1 - iter 108/121 - loss 3.04807606 - time (sec): 2.39 - samples/sec: 9357.55 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:45:50,391 epoch 1 - iter 120/121 - loss 2.92090205 - time (sec): 2.65 - samples/sec: 9277.71 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:45:50,410 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:50,410 EPOCH 1 done: loss 2.9143 - lr: 0.000030
2023-10-18 16:45:50,808 DEV : loss 0.863234281539917 - f1-score (micro avg) 0.0
2023-10-18 16:45:50,813 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:51,100 epoch 2 - iter 12/121 - loss 1.30082198 - time (sec): 0.29 - samples/sec: 8186.36 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:45:51,370 epoch 2 - iter 24/121 - loss 1.12777448 - time (sec): 0.56 - samples/sec: 8276.00 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:45:51,628 epoch 2 - iter 36/121 - loss 1.05004211 - time (sec): 0.81 - samples/sec: 8458.78 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:45:51,923 epoch 2 - iter 48/121 - loss 0.99630462 - time (sec): 1.11 - samples/sec: 8499.56 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:45:52,201 epoch 2 - iter 60/121 - loss 0.95026676 - time (sec): 1.39 - samples/sec: 8650.92 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:45:52,487 epoch 2 - iter 72/121 - loss 0.91025975 - time (sec): 1.67 - samples/sec: 8866.76 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:45:52,755 epoch 2 - iter 84/121 - loss 0.88549633 - time (sec): 1.94 - samples/sec: 8854.31 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:45:53,025 epoch 2 - iter 96/121 - loss 0.86102880 - time (sec): 2.21 - samples/sec: 8959.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:45:53,297 epoch 2 - iter 108/121 - loss 0.84547299 - time (sec): 2.48 - samples/sec: 9030.07 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:45:53,557 epoch 2 - iter 120/121 - loss 0.83142002 - time (sec): 2.74 - samples/sec: 8989.60 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:45:53,574 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:53,574 EPOCH 2 done: loss 0.8309 - lr: 0.000027
2023-10-18 16:45:53,997 DEV : loss 0.624695360660553 - f1-score (micro avg) 0.0
2023-10-18 16:45:54,001 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:54,261 epoch 3 - iter 12/121 - loss 0.51803915 - time (sec): 0.26 - samples/sec: 9189.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:45:54,522 epoch 3 - iter 24/121 - loss 0.68662225 - time (sec): 0.52 - samples/sec: 9477.05 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:45:54,794 epoch 3 - iter 36/121 - loss 0.69658370 - time (sec): 0.79 - samples/sec: 9537.10 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:45:55,066 epoch 3 - iter 48/121 - loss 0.66937676 - time (sec): 1.06 - samples/sec: 9386.16 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:45:55,327 epoch 3 - iter 60/121 - loss 0.65572614 - time (sec): 1.33 - samples/sec: 9146.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:45:55,595 epoch 3 - iter 72/121 - loss 0.65498713 - time (sec): 1.59 - samples/sec: 9146.01 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:45:55,873 epoch 3 - iter 84/121 - loss 0.65702703 - time (sec): 1.87 - samples/sec: 9040.65 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:45:56,157 epoch 3 - iter 96/121 - loss 0.66350106 - time (sec): 2.15 - samples/sec: 9166.92 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:45:56,427 epoch 3 - iter 108/121 - loss 0.66063703 - time (sec): 2.42 - samples/sec: 9129.72 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:45:56,701 epoch 3 - iter 120/121 - loss 0.64791186 - time (sec): 2.70 - samples/sec: 9128.68 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:45:56,721 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:56,721 EPOCH 3 done: loss 0.6467 - lr: 0.000023
2023-10-18 16:45:57,142 DEV : loss 0.5270166397094727 - f1-score (micro avg) 0.0
2023-10-18 16:45:57,146 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:57,424 epoch 4 - iter 12/121 - loss 0.56958966 - time (sec): 0.28 - samples/sec: 9007.46 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:45:57,698 epoch 4 - iter 24/121 - loss 0.61242243 - time (sec): 0.55 - samples/sec: 8826.66 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:45:57,967 epoch 4 - iter 36/121 - loss 0.58608041 - time (sec): 0.82 - samples/sec: 9167.49 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:45:58,231 epoch 4 - iter 48/121 - loss 0.57641784 - time (sec): 1.08 - samples/sec: 9220.79 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:45:58,511 epoch 4 - iter 60/121 - loss 0.58020126 - time (sec): 1.36 - samples/sec: 9084.86 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:45:58,785 epoch 4 - iter 72/121 - loss 0.58511099 - time (sec): 1.64 - samples/sec: 9061.23 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:45:59,062 epoch 4 - iter 84/121 - loss 0.59626739 - time (sec): 1.92 - samples/sec: 9079.67 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:45:59,344 epoch 4 - iter 96/121 - loss 0.57746753 - time (sec): 2.20 - samples/sec: 9029.03 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:45:59,622 epoch 4 - iter 108/121 - loss 0.57253977 - time (sec): 2.48 - samples/sec: 9009.55 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:45:59,903 epoch 4 - iter 120/121 - loss 0.56945905 - time (sec): 2.76 - samples/sec: 8917.37 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:45:59,924 ----------------------------------------------------------------------------------------------------
2023-10-18 16:45:59,924 EPOCH 4 done: loss 0.5669 - lr: 0.000020
2023-10-18 16:46:00,341 DEV : loss 0.44734323024749756 - f1-score (micro avg) 0.2308
2023-10-18 16:46:00,345 saving best model
2023-10-18 16:46:00,379 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:00,646 epoch 5 - iter 12/121 - loss 0.61451524 - time (sec): 0.27 - samples/sec: 9623.71 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:46:00,905 epoch 5 - iter 24/121 - loss 0.58469911 - time (sec): 0.53 - samples/sec: 9407.80 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:46:01,179 epoch 5 - iter 36/121 - loss 0.56048350 - time (sec): 0.80 - samples/sec: 9570.33 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:46:01,445 epoch 5 - iter 48/121 - loss 0.54159321 - time (sec): 1.07 - samples/sec: 9406.63 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:46:01,716 epoch 5 - iter 60/121 - loss 0.53418200 - time (sec): 1.34 - samples/sec: 9502.58 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:46:01,989 epoch 5 - iter 72/121 - loss 0.51851423 - time (sec): 1.61 - samples/sec: 9296.97 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:46:02,257 epoch 5 - iter 84/121 - loss 0.51233355 - time (sec): 1.88 - samples/sec: 9282.31 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:46:02,522 epoch 5 - iter 96/121 - loss 0.50988761 - time (sec): 2.14 - samples/sec: 9193.17 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:46:02,794 epoch 5 - iter 108/121 - loss 0.50842276 - time (sec): 2.41 - samples/sec: 9172.41 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:46:03,071 epoch 5 - iter 120/121 - loss 0.50411087 - time (sec): 2.69 - samples/sec: 9117.67 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:46:03,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:03,096 EPOCH 5 done: loss 0.5095 - lr: 0.000017
2023-10-18 16:46:03,527 DEV : loss 0.40638014674186707 - f1-score (micro avg) 0.4
2023-10-18 16:46:03,532 saving best model
2023-10-18 16:46:03,567 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:03,777 epoch 6 - iter 12/121 - loss 0.49439897 - time (sec): 0.21 - samples/sec: 11461.53 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:46:04,025 epoch 6 - iter 24/121 - loss 0.51316033 - time (sec): 0.46 - samples/sec: 10733.13 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:46:04,301 epoch 6 - iter 36/121 - loss 0.48864252 - time (sec): 0.73 - samples/sec: 10650.78 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:46:04,575 epoch 6 - iter 48/121 - loss 0.48777182 - time (sec): 1.01 - samples/sec: 10326.34 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:46:04,871 epoch 6 - iter 60/121 - loss 0.49044822 - time (sec): 1.30 - samples/sec: 9888.20 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:46:05,146 epoch 6 - iter 72/121 - loss 0.48735073 - time (sec): 1.58 - samples/sec: 9607.22 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:46:05,411 epoch 6 - iter 84/121 - loss 0.47966025 - time (sec): 1.84 - samples/sec: 9577.06 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:46:05,673 epoch 6 - iter 96/121 - loss 0.47400510 - time (sec): 2.11 - samples/sec: 9468.08 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:46:05,943 epoch 6 - iter 108/121 - loss 0.47385862 - time (sec): 2.38 - samples/sec: 9315.37 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:46:06,209 epoch 6 - iter 120/121 - loss 0.47174045 - time (sec): 2.64 - samples/sec: 9300.29 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:46:06,230 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:06,230 EPOCH 6 done: loss 0.4726 - lr: 0.000013
2023-10-18 16:46:06,655 DEV : loss 0.3771236538887024 - f1-score (micro avg) 0.4637
2023-10-18 16:46:06,660 saving best model
2023-10-18 16:46:06,696 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:06,959 epoch 7 - iter 12/121 - loss 0.50855341 - time (sec): 0.26 - samples/sec: 8664.05 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:46:07,227 epoch 7 - iter 24/121 - loss 0.50309165 - time (sec): 0.53 - samples/sec: 8465.37 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:46:07,495 epoch 7 - iter 36/121 - loss 0.48435900 - time (sec): 0.80 - samples/sec: 8884.18 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:46:07,765 epoch 7 - iter 48/121 - loss 0.49070949 - time (sec): 1.07 - samples/sec: 8975.73 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:46:08,030 epoch 7 - iter 60/121 - loss 0.47805378 - time (sec): 1.33 - samples/sec: 9007.50 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:46:08,292 epoch 7 - iter 72/121 - loss 0.45865677 - time (sec): 1.60 - samples/sec: 9056.59 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:46:08,557 epoch 7 - iter 84/121 - loss 0.46285654 - time (sec): 1.86 - samples/sec: 9095.10 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:46:08,824 epoch 7 - iter 96/121 - loss 0.45698473 - time (sec): 2.13 - samples/sec: 9137.84 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:46:09,097 epoch 7 - iter 108/121 - loss 0.45010642 - time (sec): 2.40 - samples/sec: 9154.53 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:46:09,380 epoch 7 - iter 120/121 - loss 0.44388889 - time (sec): 2.68 - samples/sec: 9163.62 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:46:09,402 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:09,402 EPOCH 7 done: loss 0.4439 - lr: 0.000010
2023-10-18 16:46:09,830 DEV : loss 0.36025094985961914 - f1-score (micro avg) 0.4861
2023-10-18 16:46:09,834 saving best model
2023-10-18 16:46:09,867 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:10,139 epoch 8 - iter 12/121 - loss 0.49178735 - time (sec): 0.27 - samples/sec: 9099.14 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:46:10,405 epoch 8 - iter 24/121 - loss 0.45343472 - time (sec): 0.54 - samples/sec: 9473.71 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:46:10,671 epoch 8 - iter 36/121 - loss 0.43985722 - time (sec): 0.80 - samples/sec: 9700.73 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:46:10,927 epoch 8 - iter 48/121 - loss 0.44642953 - time (sec): 1.06 - samples/sec: 9520.09 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:46:11,184 epoch 8 - iter 60/121 - loss 0.44974196 - time (sec): 1.32 - samples/sec: 9296.92 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:46:11,450 epoch 8 - iter 72/121 - loss 0.44562545 - time (sec): 1.58 - samples/sec: 9319.65 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:46:11,713 epoch 8 - iter 84/121 - loss 0.43244860 - time (sec): 1.85 - samples/sec: 9360.43 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:46:11,980 epoch 8 - iter 96/121 - loss 0.43229543 - time (sec): 2.11 - samples/sec: 9278.73 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:46:12,253 epoch 8 - iter 108/121 - loss 0.42846687 - time (sec): 2.39 - samples/sec: 9308.05 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:46:12,511 epoch 8 - iter 120/121 - loss 0.42835762 - time (sec): 2.64 - samples/sec: 9321.29 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:46:12,529 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:12,529 EPOCH 8 done: loss 0.4275 - lr: 0.000007
2023-10-18 16:46:12,963 DEV : loss 0.34982264041900635 - f1-score (micro avg) 0.4863
2023-10-18 16:46:12,967 saving best model
2023-10-18 16:46:13,001 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:13,269 epoch 9 - iter 12/121 - loss 0.51786957 - time (sec): 0.27 - samples/sec: 9415.14 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:46:13,544 epoch 9 - iter 24/121 - loss 0.42988999 - time (sec): 0.54 - samples/sec: 9067.47 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:46:13,824 epoch 9 - iter 36/121 - loss 0.43626895 - time (sec): 0.82 - samples/sec: 8928.87 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:46:14,085 epoch 9 - iter 48/121 - loss 0.42446742 - time (sec): 1.08 - samples/sec: 8791.01 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:46:14,368 epoch 9 - iter 60/121 - loss 0.41563092 - time (sec): 1.37 - samples/sec: 9065.85 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:46:14,631 epoch 9 - iter 72/121 - loss 0.43338714 - time (sec): 1.63 - samples/sec: 9007.42 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:46:14,919 epoch 9 - iter 84/121 - loss 0.41611575 - time (sec): 1.92 - samples/sec: 9056.31 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:46:15,183 epoch 9 - iter 96/121 - loss 0.42665487 - time (sec): 2.18 - samples/sec: 9021.17 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:46:15,452 epoch 9 - iter 108/121 - loss 0.42619722 - time (sec): 2.45 - samples/sec: 9038.76 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:46:15,726 epoch 9 - iter 120/121 - loss 0.41925859 - time (sec): 2.72 - samples/sec: 9040.39 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:46:15,749 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:15,749 EPOCH 9 done: loss 0.4194 - lr: 0.000004
2023-10-18 16:46:16,188 DEV : loss 0.3439044654369354 - f1-score (micro avg) 0.4879
2023-10-18 16:46:16,193 saving best model
2023-10-18 16:46:16,228 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:16,482 epoch 10 - iter 12/121 - loss 0.42384553 - time (sec): 0.25 - samples/sec: 9536.45 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:46:16,749 epoch 10 - iter 24/121 - loss 0.41767451 - time (sec): 0.52 - samples/sec: 9890.13 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:46:17,005 epoch 10 - iter 36/121 - loss 0.41253948 - time (sec): 0.78 - samples/sec: 9228.11 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:46:17,280 epoch 10 - iter 48/121 - loss 0.41003779 - time (sec): 1.05 - samples/sec: 9390.56 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:46:17,557 epoch 10 - iter 60/121 - loss 0.41735594 - time (sec): 1.33 - samples/sec: 9224.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:46:17,819 epoch 10 - iter 72/121 - loss 0.41559357 - time (sec): 1.59 - samples/sec: 9388.81 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:46:18,096 epoch 10 - iter 84/121 - loss 0.41533960 - time (sec): 1.87 - samples/sec: 9452.69 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:46:18,345 epoch 10 - iter 96/121 - loss 0.42148921 - time (sec): 2.12 - samples/sec: 9451.51 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:46:18,606 epoch 10 - iter 108/121 - loss 0.42097347 - time (sec): 2.38 - samples/sec: 9387.80 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:46:18,868 epoch 10 - iter 120/121 - loss 0.41708506 - time (sec): 2.64 - samples/sec: 9315.15 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:46:18,888 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:18,888 EPOCH 10 done: loss 0.4174 - lr: 0.000000
2023-10-18 16:46:19,325 DEV : loss 0.3418276309967041 - f1-score (micro avg) 0.4922
2023-10-18 16:46:19,330 saving best model
2023-10-18 16:46:19,395 ----------------------------------------------------------------------------------------------------
2023-10-18 16:46:19,395 Loading model from best epoch ...
2023-10-18 16:46:19,477 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
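The 25-entry tag dictionary in the line above is just the six entity types of this corpus (scope, pers, work, loc, object, date) expanded under the BIOES scheme (S-/B-/E-/I- per type, plus the single O tag). A minimal sketch reconstructing it (variable names are illustrative):

```python
# Rebuild the 25-tag BIOES dictionary from the six AJMC entity types:
# 1 ("O") + 4 prefixes x 6 types = 25 tags.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

tags = ["O"] + [
    f"{prefix}-{etype}"
    for etype in entity_types      # same type order as the logged dictionary
    for prefix in ("S", "B", "E", "I")
]
print(len(tags))  # 25
```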
2023-10-18 16:46:19,913
Results:
- F-score (micro) 0.4341
- F-score (macro) 0.204
- Accuracy 0.2881

By class:
              precision    recall  f1-score   support

        pers     0.6000    0.6043    0.6022       139
       scope     0.4029    0.4341    0.4179       129
        work     0.0000    0.0000    0.0000        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4912    0.3889    0.4341       360
   macro avg     0.2006    0.2077    0.2040       360
weighted avg     0.3760    0.3889    0.3822       360

2023-10-18 16:46:19,913 ----------------------------------------------------------------------------------------------------
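The gap between the headline scores (micro 0.4341 vs. macro 0.204) comes from the three classes with zero F1 dragging the unweighted macro mean down, while micro averaging pools all 360 spans. A quick check against the per-class numbers in the table (values copied from the report above):

```python
# Sanity-check the report's averages from its own per-class numbers.
per_class_f1 = {"pers": 0.6022, "scope": 0.4179, "work": 0.0, "loc": 0.0, "date": 0.0}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro F1: harmonic mean of the pooled (micro avg) precision and recall.
micro_p, micro_r = 0.4912, 0.3889
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 3))  # 0.204
print(round(micro_f1, 4))  # 0.4341
```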