Upload folder using huggingface_hub

Files changed:
- best-model.pt (+3 -0)
- dev.tsv (+0 -0)
- loss.tsv (+11 -0)
- runs/events.out.tfevents.1697660469.46dc0c540dd0.3341.0 (+3 -0)
- test.tsv (+0 -0)
- training.log (+246 -0)
best-model.pt
ADDED (Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:e061352ba49fb2ced2c39be8a227946afb6d3828a7abd60afbd6b8bf79a52edc
size 19045922
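best-model.pt is stored with Git LFS, so the repository tracks only the three-line pointer shown above (`version`, `oid`, `size`) rather than the ~19 MB weights themselves. A minimal sketch of reading such a pointer into its fields, using the pointer text from this commit:

```python
# Parse a Git LFS pointer file into its key/value fields.
# Pointer text copied verbatim from the best-model.pt entry above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e061352ba49fb2ced2c39be8a227946afb6d3828a7abd60afbd6b8bf79a52edc
size 19045922
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = parse_lfs_pointer(POINTER)
print(pointer["size"])   # size of the actual blob, in bytes
print(pointer["oid"])    # sha256 digest identifying the blob
```

The `oid` lets you verify a downloaded blob against the pointer by comparing its SHA-256 digest.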
dev.tsv
ADDED
The diff for this file is too large to render.
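Flair NER corpora like this one typically store dev/test splits as two-column, tab-separated CoNLL-style files (token, tag) with blank lines between sentences, which is presumably why the diff is too large to render. A sketch of streaming such a file one sentence at a time rather than loading it whole (the sample rows are hypothetical, not taken from dev.tsv):

```python
import csv
import io

# Hypothetical two-column (token, tag) sample in the CoNLL-style layout
# that Flair NER corpora usually use; blank lines separate sentences.
SAMPLE = "Paris\tS-LOC\nest\tO\nbelle\tO\n\nVictor\tB-PER\nHugo\tE-PER\n"

def read_sentences(handle):
    """Yield one sentence at a time as a list of (token, tag) pairs."""
    sentence = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:          # blank line: sentence boundary
            if sentence:
                yield sentence
            sentence = []
        else:
            sentence.append((row[0], row[1]))
    if sentence:             # flush a trailing sentence with no blank line
        yield sentence

sentences = list(read_sentences(io.StringIO(SAMPLE)))
```

The same generator works on a file handle opened over the real dev.tsv, keeping memory flat regardless of file size.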
loss.tsv
ADDED

EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:21:41 0.0000 0.9947 0.2363 0.3230 0.1765 0.2282 0.1421
2 20:22:14 0.0000 0.2791 0.1780 0.3797 0.3767 0.3782 0.2554
3 20:22:46 0.0000 0.2286 0.1582 0.4366 0.4559 0.4460 0.3166
4 20:23:18 0.0000 0.2025 0.1538 0.5335 0.5317 0.5326 0.3910
5 20:23:51 0.0000 0.1890 0.1442 0.5396 0.5939 0.5654 0.4258
6 20:24:22 0.0000 0.1771 0.1422 0.5572 0.5950 0.5755 0.4358
7 20:24:55 0.0000 0.1692 0.1411 0.5944 0.5803 0.5873 0.4453
8 20:25:27 0.0000 0.1630 0.1417 0.5923 0.6131 0.6026 0.4629
9 20:26:00 0.0000 0.1602 0.1408 0.5936 0.6063 0.5999 0.4601
10 20:26:32 0.0000 0.1550 0.1398 0.5960 0.6109 0.6034 0.4643
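loss.tsv records one row of dev metrics per epoch, and Flair's best-model checkpointing tracks the dev score, so the epoch that produced best-model.pt can be read straight off the DEV_F1 column. A small sketch over the rows above:

```python
# Per-epoch metrics copied from loss.tsv above (whitespace-separated).
ROWS = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:21:41 0.0000 0.9947 0.2363 0.3230 0.1765 0.2282 0.1421
2 20:22:14 0.0000 0.2791 0.1780 0.3797 0.3767 0.3782 0.2554
3 20:22:46 0.0000 0.2286 0.1582 0.4366 0.4559 0.4460 0.3166
4 20:23:18 0.0000 0.2025 0.1538 0.5335 0.5317 0.5326 0.3910
5 20:23:51 0.0000 0.1890 0.1442 0.5396 0.5939 0.5654 0.4258
6 20:24:22 0.0000 0.1771 0.1422 0.5572 0.5950 0.5755 0.4358
7 20:24:55 0.0000 0.1692 0.1411 0.5944 0.5803 0.5873 0.4453
8 20:25:27 0.0000 0.1630 0.1417 0.5923 0.6131 0.6026 0.4629
9 20:26:00 0.0000 0.1602 0.1408 0.5936 0.6063 0.5999 0.4601
10 20:26:32 0.0000 0.1550 0.1398 0.5960 0.6109 0.6034 0.4643
"""

header, *rows = [line.split() for line in ROWS.strip().splitlines()]
f1_col = header.index("DEV_F1")

# The epoch with the highest dev F1 is the one kept as best-model.pt.
best = max(rows, key=lambda r: float(r[f1_col]))
best_epoch, best_f1 = int(best[0]), float(best[f1_col])
print(best_epoch, best_f1)
```

Here the final epoch wins, matching the "saving best model" entry after epoch 10 in training.log.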
runs/events.out.tfevents.1697660469.46dc0c540dd0.3341.0
ADDED (Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:7aea2c33d79ec75513b067b63c74e9b008a9da47c9eec2ec044c05f4846fa8b5
size 1108164
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED

2023-10-18 20:21:09,554 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Train:  7936 sentences
2023-10-18 20:21:09,555         (train_with_dev=False, train_with_test=False)
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Training Params:
2023-10-18 20:21:09,555  - learning_rate: "3e-05"
2023-10-18 20:21:09,555  - mini_batch_size: "4"
2023-10-18 20:21:09,555  - max_epochs: "10"
2023-10-18 20:21:09,555  - shuffle: "True"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Plugins:
2023-10-18 20:21:09,555  - TensorboardLogger
2023-10-18 20:21:09,555  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:21:09,555  - metric: "('micro avg', 'f1-score')"
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,555 Computation:
2023-10-18 20:21:09,555  - compute on device: cuda:0
2023-10-18 20:21:09,555  - embedding storage: none
2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:09,556 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:21:13,160 epoch 1 - iter 198/1984 - loss 3.19239065 - time (sec): 3.60 - samples/sec: 4520.00 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:21:16,212 epoch 1 - iter 396/1984 - loss 2.81648826 - time (sec): 6.66 - samples/sec: 4879.36 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:21:19,275 epoch 1 - iter 594/1984 - loss 2.28737400 - time (sec): 9.72 - samples/sec: 5062.60 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:21:22,346 epoch 1 - iter 792/1984 - loss 1.86592392 - time (sec): 12.79 - samples/sec: 5140.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:21:25,305 epoch 1 - iter 990/1984 - loss 1.59866230 - time (sec): 15.75 - samples/sec: 5171.07 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:21:28,333 epoch 1 - iter 1188/1984 - loss 1.40641142 - time (sec): 18.78 - samples/sec: 5213.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:21:31,352 epoch 1 - iter 1386/1984 - loss 1.26714919 - time (sec): 21.80 - samples/sec: 5230.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:21:34,376 epoch 1 - iter 1584/1984 - loss 1.15707396 - time (sec): 24.82 - samples/sec: 5276.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:21:37,250 epoch 1 - iter 1782/1984 - loss 1.07044664 - time (sec): 27.69 - samples/sec: 5313.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:21:40,102 epoch 1 - iter 1980/1984 - loss 0.99619007 - time (sec): 30.55 - samples/sec: 5356.70 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:21:40,161 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:40,161 EPOCH 1 done: loss 0.9947 - lr: 0.000030
2023-10-18 20:21:41,606 DEV : loss 0.23625816404819489 - f1-score (micro avg)  0.2282
2023-10-18 20:21:41,623 saving best model
2023-10-18 20:21:41,659 ----------------------------------------------------------------------------------------------------
2023-10-18 20:21:44,713 epoch 2 - iter 198/1984 - loss 0.34878745 - time (sec): 3.05 - samples/sec: 5568.75 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:21:47,768 epoch 2 - iter 396/1984 - loss 0.32539556 - time (sec): 6.11 - samples/sec: 5539.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:51,158 epoch 2 - iter 594/1984 - loss 0.31675399 - time (sec): 9.50 - samples/sec: 5273.19 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:54,182 epoch 2 - iter 792/1984 - loss 0.30310641 - time (sec): 12.52 - samples/sec: 5269.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:21:57,250 epoch 2 - iter 990/1984 - loss 0.29671952 - time (sec): 15.59 - samples/sec: 5326.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:00,316 epoch 2 - iter 1188/1984 - loss 0.29371486 - time (sec): 18.66 - samples/sec: 5332.97 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:03,364 epoch 2 - iter 1386/1984 - loss 0.28559705 - time (sec): 21.70 - samples/sec: 5357.01 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:22:06,430 epoch 2 - iter 1584/1984 - loss 0.28450858 - time (sec): 24.77 - samples/sec: 5353.75 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:09,535 epoch 2 - iter 1782/1984 - loss 0.28203045 - time (sec): 27.87 - samples/sec: 5316.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:12,574 epoch 2 - iter 1980/1984 - loss 0.27933585 - time (sec): 30.91 - samples/sec: 5293.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:22:12,634 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:12,634 EPOCH 2 done: loss 0.2791 - lr: 0.000027
2023-10-18 20:22:14,454 DEV : loss 0.17796094715595245 - f1-score (micro avg)  0.3782
2023-10-18 20:22:14,472 saving best model
2023-10-18 20:22:14,505 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:17,635 epoch 3 - iter 198/1984 - loss 0.21732311 - time (sec): 3.13 - samples/sec: 5209.34 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:20,674 epoch 3 - iter 396/1984 - loss 0.21382431 - time (sec): 6.17 - samples/sec: 5318.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:23,712 epoch 3 - iter 594/1984 - loss 0.23635522 - time (sec): 9.21 - samples/sec: 5292.83 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:22:26,797 epoch 3 - iter 792/1984 - loss 0.23384899 - time (sec): 12.29 - samples/sec: 5342.49 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:29,630 epoch 3 - iter 990/1984 - loss 0.23319557 - time (sec): 15.12 - samples/sec: 5373.24 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:32,703 epoch 3 - iter 1188/1984 - loss 0.23261618 - time (sec): 18.20 - samples/sec: 5391.85 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:22:35,702 epoch 3 - iter 1386/1984 - loss 0.23393194 - time (sec): 21.20 - samples/sec: 5382.13 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:38,761 epoch 3 - iter 1584/1984 - loss 0.23335714 - time (sec): 24.26 - samples/sec: 5394.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:41,803 epoch 3 - iter 1782/1984 - loss 0.23015129 - time (sec): 27.30 - samples/sec: 5396.62 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:22:44,850 epoch 3 - iter 1980/1984 - loss 0.22849262 - time (sec): 30.34 - samples/sec: 5389.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:44,920 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:44,920 EPOCH 3 done: loss 0.2286 - lr: 0.000023
2023-10-18 20:22:46,719 DEV : loss 0.15816588699817657 - f1-score (micro avg)  0.446
2023-10-18 20:22:46,736 saving best model
2023-10-18 20:22:46,766 ----------------------------------------------------------------------------------------------------
2023-10-18 20:22:49,823 epoch 4 - iter 198/1984 - loss 0.21892155 - time (sec): 3.06 - samples/sec: 5353.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:52,844 epoch 4 - iter 396/1984 - loss 0.22023973 - time (sec): 6.08 - samples/sec: 5341.75 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:22:55,849 epoch 4 - iter 594/1984 - loss 0.21519154 - time (sec): 9.08 - samples/sec: 5280.36 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:22:58,885 epoch 4 - iter 792/1984 - loss 0.21938168 - time (sec): 12.12 - samples/sec: 5231.69 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:23:01,883 epoch 4 - iter 990/1984 - loss 0.21550684 - time (sec): 15.12 - samples/sec: 5252.24 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:23:04,896 epoch 4 - iter 1188/1984 - loss 0.20933788 - time (sec): 18.13 - samples/sec: 5284.31 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:07,904 epoch 4 - iter 1386/1984 - loss 0.20913435 - time (sec): 21.14 - samples/sec: 5366.26 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:10,953 epoch 4 - iter 1584/1984 - loss 0.20611639 - time (sec): 24.19 - samples/sec: 5381.99 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:23:13,935 epoch 4 - iter 1782/1984 - loss 0.20643644 - time (sec): 27.17 - samples/sec: 5373.80 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:16,968 epoch 4 - iter 1980/1984 - loss 0.20253333 - time (sec): 30.20 - samples/sec: 5418.56 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:17,030 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:17,030 EPOCH 4 done: loss 0.2025 - lr: 0.000020
2023-10-18 20:23:18,839 DEV : loss 0.15379515290260315 - f1-score (micro avg)  0.5326
2023-10-18 20:23:18,856 saving best model
2023-10-18 20:23:18,889 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:21,899 epoch 5 - iter 198/1984 - loss 0.23060341 - time (sec): 3.01 - samples/sec: 4955.51 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:23:25,011 epoch 5 - iter 396/1984 - loss 0.19610837 - time (sec): 6.12 - samples/sec: 5314.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:28,034 epoch 5 - iter 594/1984 - loss 0.19599719 - time (sec): 9.14 - samples/sec: 5362.28 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:31,070 epoch 5 - iter 792/1984 - loss 0.19167242 - time (sec): 12.18 - samples/sec: 5419.94 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:23:34,098 epoch 5 - iter 990/1984 - loss 0.18849708 - time (sec): 15.21 - samples/sec: 5363.90 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:37,129 epoch 5 - iter 1188/1984 - loss 0.18868575 - time (sec): 18.24 - samples/sec: 5392.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:40,183 epoch 5 - iter 1386/1984 - loss 0.18996251 - time (sec): 21.29 - samples/sec: 5399.95 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:23:43,230 epoch 5 - iter 1584/1984 - loss 0.18960056 - time (sec): 24.34 - samples/sec: 5407.03 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:46,294 epoch 5 - iter 1782/1984 - loss 0.18960480 - time (sec): 27.40 - samples/sec: 5392.96 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:49,245 epoch 5 - iter 1980/1984 - loss 0.18902609 - time (sec): 30.35 - samples/sec: 5390.97 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:23:49,308 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:49,308 EPOCH 5 done: loss 0.1890 - lr: 0.000017
2023-10-18 20:23:51,124 DEV : loss 0.14417240023612976 - f1-score (micro avg)  0.5654
2023-10-18 20:23:51,141 saving best model
2023-10-18 20:23:51,174 ----------------------------------------------------------------------------------------------------
2023-10-18 20:23:54,192 epoch 6 - iter 198/1984 - loss 0.20691944 - time (sec): 3.02 - samples/sec: 5108.94 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:23:57,214 epoch 6 - iter 396/1984 - loss 0.18917630 - time (sec): 6.04 - samples/sec: 5249.12 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:24:00,217 epoch 6 - iter 594/1984 - loss 0.18665542 - time (sec): 9.04 - samples/sec: 5247.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:24:03,320 epoch 6 - iter 792/1984 - loss 0.18917319 - time (sec): 12.15 - samples/sec: 5235.56 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:06,415 epoch 6 - iter 990/1984 - loss 0.18919754 - time (sec): 15.24 - samples/sec: 5275.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:09,474 epoch 6 - iter 1188/1984 - loss 0.18272315 - time (sec): 18.30 - samples/sec: 5320.83 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:24:12,446 epoch 6 - iter 1386/1984 - loss 0.17852437 - time (sec): 21.27 - samples/sec: 5378.84 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:15,144 epoch 6 - iter 1584/1984 - loss 0.17782946 - time (sec): 23.97 - samples/sec: 5458.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:18,117 epoch 6 - iter 1782/1984 - loss 0.17992881 - time (sec): 26.94 - samples/sec: 5465.89 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:24:21,064 epoch 6 - iter 1980/1984 - loss 0.17716357 - time (sec): 29.89 - samples/sec: 5475.25 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:21,123 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:21,123 EPOCH 6 done: loss 0.1771 - lr: 0.000013
2023-10-18 20:24:22,948 DEV : loss 0.14219804108142853 - f1-score (micro avg)  0.5755
2023-10-18 20:24:22,965 saving best model
2023-10-18 20:24:22,998 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:26,033 epoch 7 - iter 198/1984 - loss 0.21070221 - time (sec): 3.03 - samples/sec: 5279.32 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:28,996 epoch 7 - iter 396/1984 - loss 0.18898161 - time (sec): 6.00 - samples/sec: 5395.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:24:32,004 epoch 7 - iter 594/1984 - loss 0.18467547 - time (sec): 9.00 - samples/sec: 5381.83 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:35,059 epoch 7 - iter 792/1984 - loss 0.17499372 - time (sec): 12.06 - samples/sec: 5446.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:38,049 epoch 7 - iter 990/1984 - loss 0.17390168 - time (sec): 15.05 - samples/sec: 5481.36 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:24:41,030 epoch 7 - iter 1188/1984 - loss 0.17296841 - time (sec): 18.03 - samples/sec: 5453.01 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:44,215 epoch 7 - iter 1386/1984 - loss 0.16940126 - time (sec): 21.22 - samples/sec: 5422.00 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:47,216 epoch 7 - iter 1584/1984 - loss 0.16927956 - time (sec): 24.22 - samples/sec: 5404.10 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:24:50,282 epoch 7 - iter 1782/1984 - loss 0.16921141 - time (sec): 27.28 - samples/sec: 5385.26 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:24:53,363 epoch 7 - iter 1980/1984 - loss 0.16921911 - time (sec): 30.36 - samples/sec: 5393.77 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:24:53,427 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:53,427 EPOCH 7 done: loss 0.1692 - lr: 0.000010
2023-10-18 20:24:55,548 DEV : loss 0.1411110907793045 - f1-score (micro avg)  0.5873
2023-10-18 20:24:55,564 saving best model
2023-10-18 20:24:55,599 ----------------------------------------------------------------------------------------------------
2023-10-18 20:24:58,631 epoch 8 - iter 198/1984 - loss 0.16894551 - time (sec): 3.03 - samples/sec: 5308.14 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:25:01,643 epoch 8 - iter 396/1984 - loss 0.16387924 - time (sec): 6.04 - samples/sec: 5258.74 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:04,674 epoch 8 - iter 594/1984 - loss 0.16514882 - time (sec): 9.07 - samples/sec: 5202.23 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:07,774 epoch 8 - iter 792/1984 - loss 0.16780131 - time (sec): 12.18 - samples/sec: 5274.47 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:25:10,830 epoch 8 - iter 990/1984 - loss 0.16733526 - time (sec): 15.23 - samples/sec: 5256.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:13,904 epoch 8 - iter 1188/1984 - loss 0.16373840 - time (sec): 18.30 - samples/sec: 5340.86 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:16,887 epoch 8 - iter 1386/1984 - loss 0.16181785 - time (sec): 21.29 - samples/sec: 5329.93 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:25:19,959 epoch 8 - iter 1584/1984 - loss 0.16227133 - time (sec): 24.36 - samples/sec: 5355.25 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:23,008 epoch 8 - iter 1782/1984 - loss 0.16335469 - time (sec): 27.41 - samples/sec: 5361.81 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:26,049 epoch 8 - iter 1980/1984 - loss 0.16311903 - time (sec): 30.45 - samples/sec: 5376.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:25:26,105 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:26,105 EPOCH 8 done: loss 0.1630 - lr: 0.000007
2023-10-18 20:25:27,913 DEV : loss 0.14169389009475708 - f1-score (micro avg)  0.6026
2023-10-18 20:25:27,932 saving best model
2023-10-18 20:25:27,966 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:31,010 epoch 9 - iter 198/1984 - loss 0.14872512 - time (sec): 3.04 - samples/sec: 5336.13 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:34,053 epoch 9 - iter 396/1984 - loss 0.16471370 - time (sec): 6.09 - samples/sec: 5390.47 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:37,062 epoch 9 - iter 594/1984 - loss 0.16359773 - time (sec): 9.10 - samples/sec: 5307.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:25:40,148 epoch 9 - iter 792/1984 - loss 0.16213124 - time (sec): 12.18 - samples/sec: 5246.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:43,182 epoch 9 - iter 990/1984 - loss 0.16071329 - time (sec): 15.21 - samples/sec: 5310.46 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:46,211 epoch 9 - iter 1188/1984 - loss 0.16040075 - time (sec): 18.24 - samples/sec: 5307.42 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:25:49,259 epoch 9 - iter 1386/1984 - loss 0.15955613 - time (sec): 21.29 - samples/sec: 5367.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:52,294 epoch 9 - iter 1584/1984 - loss 0.16153404 - time (sec): 24.33 - samples/sec: 5350.28 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:55,337 epoch 9 - iter 1782/1984 - loss 0.16026350 - time (sec): 27.37 - samples/sec: 5356.94 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:25:58,420 epoch 9 - iter 1980/1984 - loss 0.16022030 - time (sec): 30.45 - samples/sec: 5375.18 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:25:58,483 ----------------------------------------------------------------------------------------------------
2023-10-18 20:25:58,483 EPOCH 9 done: loss 0.1602 - lr: 0.000003
2023-10-18 20:26:00,280 DEV : loss 0.14077164232730865 - f1-score (micro avg)  0.5999
2023-10-18 20:26:00,296 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:03,366 epoch 10 - iter 198/1984 - loss 0.15429908 - time (sec): 3.07 - samples/sec: 5117.20 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:26:06,590 epoch 10 - iter 396/1984 - loss 0.16153646 - time (sec): 6.29 - samples/sec: 5099.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:26:09,606 epoch 10 - iter 594/1984 - loss 0.16009665 - time (sec): 9.31 - samples/sec: 5150.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:12,646 epoch 10 - iter 792/1984 - loss 0.15349866 - time (sec): 12.35 - samples/sec: 5233.90 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:15,665 epoch 10 - iter 990/1984 - loss 0.15500561 - time (sec): 15.37 - samples/sec: 5272.80 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:26:18,672 epoch 10 - iter 1188/1984 - loss 0.15624539 - time (sec): 18.37 - samples/sec: 5268.47 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:21,752 epoch 10 - iter 1386/1984 - loss 0.15463573 - time (sec): 21.46 - samples/sec: 5328.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:24,848 epoch 10 - iter 1584/1984 - loss 0.15358641 - time (sec): 24.55 - samples/sec: 5300.75 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:26:27,969 epoch 10 - iter 1782/1984 - loss 0.15337495 - time (sec): 27.67 - samples/sec: 5306.87 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:26:30,832 epoch 10 - iter 1980/1984 - loss 0.15484121 - time (sec): 30.54 - samples/sec: 5357.87 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:26:30,891 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:30,891 EPOCH 10 done: loss 0.1550 - lr: 0.000000
2023-10-18 20:26:32,699 DEV : loss 0.13979580998420715 - f1-score (micro avg)  0.6034
2023-10-18 20:26:32,718 saving best model
2023-10-18 20:26:32,781 ----------------------------------------------------------------------------------------------------
2023-10-18 20:26:32,782 Loading model from best epoch ...
2023-10-18 20:26:32,869 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:26:34,366
Results:
- F-score (micro) 0.6068
- F-score (macro) 0.436
- Accuracy 0.4819

By class:
              precision    recall  f1-score   support

         LOC     0.7375    0.6992    0.7179       655
         PER     0.4011    0.6547    0.4974       223
         ORG     0.2917    0.0551    0.0927       127

   micro avg     0.6056    0.6080    0.6068      1005
   macro avg     0.4768    0.4697    0.4360      1005
weighted avg     0.6065    0.6080    0.5900      1005

2023-10-18 20:26:34,366 ----------------------------------------------------------------------------------------------------
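The aggregate rows in the final report follow the usual definitions: macro F1 is the unweighted mean of per-class F1, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and recall. A quick re-derivation from the by-class numbers above:

```python
# Per-class (precision, recall, f1, support) copied from the report above.
classes = {
    "LOC": (0.7375, 0.6992, 0.7179, 655),
    "PER": (0.4011, 0.6547, 0.4974, 223),
    "ORG": (0.2917, 0.0551, 0.0927, 127),
}
total = sum(s for *_, s in classes.values())  # total support

# Macro F1: unweighted mean of per-class F1.
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)

# Weighted F1: support-weighted mean of per-class F1.
weighted_f1 = sum(f1 * s for _, _, f1, s in classes.values()) / total

# Micro precision/recall are taken from the report's micro avg row;
# micro F1 is their harmonic mean.
micro_p, micro_r = 0.6056, 0.6080
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

All three recomputed values land on the report's 0.4360, 0.5900, and 0.6068 (up to rounding of the published per-class figures).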