stefan-it committed commit c28c9eb (1 parent: 9eec6e0)

Upload folder using huggingface_hub
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e061352ba49fb2ced2c39be8a227946afb6d3828a7abd60afbd6b8bf79a52edc
+ size 19045922
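best-model.pt is stored via Git LFS, so the repository itself holds only a small pointer file: the blob's SHA-256 (`oid`) and its size in bytes. A minimal sketch of how such a spec-v1 pointer can be produced for (and checked against) a local file; the throwaway `example.bin` file is illustrative, not part of this repo:

```python
import hashlib
import tempfile
from pathlib import Path

def lfs_pointer(path: Path) -> str:
    """Build a Git LFS pointer (spec v1) for a local file."""
    data = path.read_bytes()
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )

# Illustrative usage with a throwaway file (not best-model.pt itself).
p = Path(tempfile.mkdtemp()) / "example.bin"
p.write_bytes(b"hello")
print(lfs_pointer(p))
```

Comparing a freshly computed pointer against the one committed here is a quick way to verify that an LFS download was not truncated or corrupted.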
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 20:21:41 0.0000 0.9947 0.2363 0.3230 0.1765 0.2282 0.1421
+ 2 20:22:14 0.0000 0.2791 0.1780 0.3797 0.3767 0.3782 0.2554
+ 3 20:22:46 0.0000 0.2286 0.1582 0.4366 0.4559 0.4460 0.3166
+ 4 20:23:18 0.0000 0.2025 0.1538 0.5335 0.5317 0.5326 0.3910
+ 5 20:23:51 0.0000 0.1890 0.1442 0.5396 0.5939 0.5654 0.4258
+ 6 20:24:22 0.0000 0.1771 0.1422 0.5572 0.5950 0.5755 0.4358
+ 7 20:24:55 0.0000 0.1692 0.1411 0.5944 0.5803 0.5873 0.4453
+ 8 20:25:27 0.0000 0.1630 0.1417 0.5923 0.6131 0.6026 0.4629
+ 9 20:26:00 0.0000 0.1602 0.1408 0.5936 0.6063 0.5999 0.4601
+ 10 20:26:32 0.0000 0.1550 0.1398 0.5960 0.6109 0.6034 0.4643
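loss.tsv is a whitespace-separated table, one row per epoch, so it can be mined with a few lines of stdlib Python. A sketch (rows copied from the table above) that picks the epoch with the best dev F1:

```python
# Per-epoch metrics copied verbatim from loss.tsv above.
LOSS_TSV = """EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:21:41 0.0000 0.9947 0.2363 0.3230 0.1765 0.2282 0.1421
2 20:22:14 0.0000 0.2791 0.1780 0.3797 0.3767 0.3782 0.2554
3 20:22:46 0.0000 0.2286 0.1582 0.4366 0.4559 0.4460 0.3166
4 20:23:18 0.0000 0.2025 0.1538 0.5335 0.5317 0.5326 0.3910
5 20:23:51 0.0000 0.1890 0.1442 0.5396 0.5939 0.5654 0.4258
6 20:24:22 0.0000 0.1771 0.1422 0.5572 0.5950 0.5755 0.4358
7 20:24:55 0.0000 0.1692 0.1411 0.5944 0.5803 0.5873 0.4453
8 20:25:27 0.0000 0.1630 0.1417 0.5923 0.6131 0.6026 0.4629
9 20:26:00 0.0000 0.1602 0.1408 0.5936 0.6063 0.5999 0.4601
10 20:26:32 0.0000 0.1550 0.1398 0.5960 0.6109 0.6034 0.4643"""

header, *rows = [line.split() for line in LOSS_TSV.splitlines()]
f1_idx = header.index("DEV_F1")
best = max(rows, key=lambda r: float(r[f1_idx]))
print(f"best epoch by DEV_F1: {best[0]} (F1={best[f1_idx]})")
```

Here the last epoch (10, dev F1 0.6034) is the best, which is why the final "saving best model" in training.log happens after epoch 10.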
runs/events.out.tfevents.1697660469.46dc0c540dd0.3341.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7aea2c33d79ec75513b067b63c74e9b008a9da47c9eec2ec044c05f4846fa8b5
+ size 1108164
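TensorBoard event files are named `events.out.tfevents.<unix_timestamp>.<hostname>.<pid>.<index>`; point TensorBoard at the `runs/` directory to browse the logged scalars. As a small sketch, decoding the timestamp field of this file's name (assuming UTC) recovers the moment training started:

```python
from datetime import datetime, timezone

# File name copied from the diff above; field 3 is the Unix timestamp.
name = "events.out.tfevents.1697660469.46dc0c540dd0.3341.0"
ts = int(name.split(".")[3])
started = datetime.fromtimestamp(ts, tz=timezone.utc)
print(started.isoformat())  # → 2023-10-18T20:21:09+00:00
```

That matches the first line of training.log (2023-10-18 20:21:09), so the event file indeed belongs to this run.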
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,246 @@
+ 2023-10-18 20:21:09,554 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 128)
+ (position_embeddings): Embedding(512, 128)
+ (token_type_embeddings): Embedding(2, 128)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-1): 2 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=128, out_features=128, bias=True)
+ (key): Linear(in_features=128, out_features=128, bias=True)
+ (value): Linear(in_features=128, out_features=128, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=128, out_features=128, bias=True)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=128, out_features=512, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=512, out_features=128, bias=True)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=128, out_features=128, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=128, out_features=13, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
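A rough parameter count for this tiny encoder can be read straight off the module shapes printed above; a sketch (Linear biases and LayerNorm weight+bias included, per the printout):

```python
# Shapes copied from the SequenceTagger printout above.
def linear(i, o):
    return i * o + o  # weight + bias

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
emb = 32001 * 128 + 512 * 128 + 2 * 128 + 2 * 128

per_layer = (
    3 * linear(128, 128)          # query / key / value
    + linear(128, 128) + 2 * 128  # attention output dense + LayerNorm
    + linear(128, 512)            # intermediate dense
    + linear(512, 128) + 2 * 128  # output dense + LayerNorm
)
pooler = linear(128, 128)
tagger_head = linear(128, 13)     # the 13-tag classification layer

total = emb + 2 * per_layer + pooler + tagger_head  # 2 BertLayers
print(f"{total:,}")  # → 4,576,909
```

About 4.6 M parameters, which at 4 bytes per float32 weight (roughly 18.3 MB) is broadly in line with the ~19 MB best-model.pt checkpoint above, the remainder being checkpoint metadata.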
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 Train: 7936 sentences
+ 2023-10-18 20:21:09,555 (train_with_dev=False, train_with_test=False)
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 Training Params:
+ 2023-10-18 20:21:09,555 - learning_rate: "3e-05"
+ 2023-10-18 20:21:09,555 - mini_batch_size: "4"
+ 2023-10-18 20:21:09,555 - max_epochs: "10"
+ 2023-10-18 20:21:09,555 - shuffle: "True"
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 Plugins:
+ 2023-10-18 20:21:09,555 - TensorboardLogger
+ 2023-10-18 20:21:09,555 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-18 20:21:09,555 - metric: "('micro avg', 'f1-score')"
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,555 Computation:
+ 2023-10-18 20:21:09,555 - compute on device: cuda:0
+ 2023-10-18 20:21:09,555 - embedding storage: none
+ 2023-10-18 20:21:09,555 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,556 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,556 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:09,556 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-18 20:21:13,160 epoch 1 - iter 198/1984 - loss 3.19239065 - time (sec): 3.60 - samples/sec: 4520.00 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 20:21:16,212 epoch 1 - iter 396/1984 - loss 2.81648826 - time (sec): 6.66 - samples/sec: 4879.36 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 20:21:19,275 epoch 1 - iter 594/1984 - loss 2.28737400 - time (sec): 9.72 - samples/sec: 5062.60 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 20:21:22,346 epoch 1 - iter 792/1984 - loss 1.86592392 - time (sec): 12.79 - samples/sec: 5140.12 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 20:21:25,305 epoch 1 - iter 990/1984 - loss 1.59866230 - time (sec): 15.75 - samples/sec: 5171.07 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 20:21:28,333 epoch 1 - iter 1188/1984 - loss 1.40641142 - time (sec): 18.78 - samples/sec: 5213.98 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 20:21:31,352 epoch 1 - iter 1386/1984 - loss 1.26714919 - time (sec): 21.80 - samples/sec: 5230.21 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 20:21:34,376 epoch 1 - iter 1584/1984 - loss 1.15707396 - time (sec): 24.82 - samples/sec: 5276.51 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 20:21:37,250 epoch 1 - iter 1782/1984 - loss 1.07044664 - time (sec): 27.69 - samples/sec: 5313.14 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 20:21:40,102 epoch 1 - iter 1980/1984 - loss 0.99619007 - time (sec): 30.55 - samples/sec: 5356.70 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-18 20:21:40,161 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:40,161 EPOCH 1 done: loss 0.9947 - lr: 0.000030
+ 2023-10-18 20:21:41,606 DEV : loss 0.23625816404819489 - f1-score (micro avg) 0.2282
+ 2023-10-18 20:21:41,623 saving best model
+ 2023-10-18 20:21:41,659 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:21:44,713 epoch 2 - iter 198/1984 - loss 0.34878745 - time (sec): 3.05 - samples/sec: 5568.75 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-18 20:21:47,768 epoch 2 - iter 396/1984 - loss 0.32539556 - time (sec): 6.11 - samples/sec: 5539.51 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-18 20:21:51,158 epoch 2 - iter 594/1984 - loss 0.31675399 - time (sec): 9.50 - samples/sec: 5273.19 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-18 20:21:54,182 epoch 2 - iter 792/1984 - loss 0.30310641 - time (sec): 12.52 - samples/sec: 5269.51 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-18 20:21:57,250 epoch 2 - iter 990/1984 - loss 0.29671952 - time (sec): 15.59 - samples/sec: 5326.88 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-18 20:22:00,316 epoch 2 - iter 1188/1984 - loss 0.29371486 - time (sec): 18.66 - samples/sec: 5332.97 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-18 20:22:03,364 epoch 2 - iter 1386/1984 - loss 0.28559705 - time (sec): 21.70 - samples/sec: 5357.01 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-18 20:22:06,430 epoch 2 - iter 1584/1984 - loss 0.28450858 - time (sec): 24.77 - samples/sec: 5353.75 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 20:22:09,535 epoch 2 - iter 1782/1984 - loss 0.28203045 - time (sec): 27.87 - samples/sec: 5316.94 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 20:22:12,574 epoch 2 - iter 1980/1984 - loss 0.27933585 - time (sec): 30.91 - samples/sec: 5293.09 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 20:22:12,634 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:22:12,634 EPOCH 2 done: loss 0.2791 - lr: 0.000027
+ 2023-10-18 20:22:14,454 DEV : loss 0.17796094715595245 - f1-score (micro avg) 0.3782
+ 2023-10-18 20:22:14,472 saving best model
+ 2023-10-18 20:22:14,505 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:22:17,635 epoch 3 - iter 198/1984 - loss 0.21732311 - time (sec): 3.13 - samples/sec: 5209.34 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-18 20:22:20,674 epoch 3 - iter 396/1984 - loss 0.21382431 - time (sec): 6.17 - samples/sec: 5318.22 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-18 20:22:23,712 epoch 3 - iter 594/1984 - loss 0.23635522 - time (sec): 9.21 - samples/sec: 5292.83 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-18 20:22:26,797 epoch 3 - iter 792/1984 - loss 0.23384899 - time (sec): 12.29 - samples/sec: 5342.49 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-18 20:22:29,630 epoch 3 - iter 990/1984 - loss 0.23319557 - time (sec): 15.12 - samples/sec: 5373.24 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-18 20:22:32,703 epoch 3 - iter 1188/1984 - loss 0.23261618 - time (sec): 18.20 - samples/sec: 5391.85 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-18 20:22:35,702 epoch 3 - iter 1386/1984 - loss 0.23393194 - time (sec): 21.20 - samples/sec: 5382.13 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 20:22:38,761 epoch 3 - iter 1584/1984 - loss 0.23335714 - time (sec): 24.26 - samples/sec: 5394.11 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 20:22:41,803 epoch 3 - iter 1782/1984 - loss 0.23015129 - time (sec): 27.30 - samples/sec: 5396.62 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 20:22:44,850 epoch 3 - iter 1980/1984 - loss 0.22849262 - time (sec): 30.34 - samples/sec: 5389.37 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-18 20:22:44,920 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:22:44,920 EPOCH 3 done: loss 0.2286 - lr: 0.000023
+ 2023-10-18 20:22:46,719 DEV : loss 0.15816588699817657 - f1-score (micro avg) 0.446
+ 2023-10-18 20:22:46,736 saving best model
+ 2023-10-18 20:22:46,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:22:49,823 epoch 4 - iter 198/1984 - loss 0.21892155 - time (sec): 3.06 - samples/sec: 5353.62 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-18 20:22:52,844 epoch 4 - iter 396/1984 - loss 0.22023973 - time (sec): 6.08 - samples/sec: 5341.75 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-18 20:22:55,849 epoch 4 - iter 594/1984 - loss 0.21519154 - time (sec): 9.08 - samples/sec: 5280.36 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-18 20:22:58,885 epoch 4 - iter 792/1984 - loss 0.21938168 - time (sec): 12.12 - samples/sec: 5231.69 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-18 20:23:01,883 epoch 4 - iter 990/1984 - loss 0.21550684 - time (sec): 15.12 - samples/sec: 5252.24 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-18 20:23:04,896 epoch 4 - iter 1188/1984 - loss 0.20933788 - time (sec): 18.13 - samples/sec: 5284.31 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 20:23:07,904 epoch 4 - iter 1386/1984 - loss 0.20913435 - time (sec): 21.14 - samples/sec: 5366.26 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 20:23:10,953 epoch 4 - iter 1584/1984 - loss 0.20611639 - time (sec): 24.19 - samples/sec: 5381.99 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 20:23:13,935 epoch 4 - iter 1782/1984 - loss 0.20643644 - time (sec): 27.17 - samples/sec: 5373.80 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-18 20:23:16,968 epoch 4 - iter 1980/1984 - loss 0.20253333 - time (sec): 30.20 - samples/sec: 5418.56 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-18 20:23:17,030 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:23:17,030 EPOCH 4 done: loss 0.2025 - lr: 0.000020
+ 2023-10-18 20:23:18,839 DEV : loss 0.15379515290260315 - f1-score (micro avg) 0.5326
+ 2023-10-18 20:23:18,856 saving best model
+ 2023-10-18 20:23:18,889 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:23:21,899 epoch 5 - iter 198/1984 - loss 0.23060341 - time (sec): 3.01 - samples/sec: 4955.51 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-18 20:23:25,011 epoch 5 - iter 396/1984 - loss 0.19610837 - time (sec): 6.12 - samples/sec: 5314.25 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-18 20:23:28,034 epoch 5 - iter 594/1984 - loss 0.19599719 - time (sec): 9.14 - samples/sec: 5362.28 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-18 20:23:31,070 epoch 5 - iter 792/1984 - loss 0.19167242 - time (sec): 12.18 - samples/sec: 5419.94 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-18 20:23:34,098 epoch 5 - iter 990/1984 - loss 0.18849708 - time (sec): 15.21 - samples/sec: 5363.90 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 20:23:37,129 epoch 5 - iter 1188/1984 - loss 0.18868575 - time (sec): 18.24 - samples/sec: 5392.29 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 20:23:40,183 epoch 5 - iter 1386/1984 - loss 0.18996251 - time (sec): 21.29 - samples/sec: 5399.95 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 20:23:43,230 epoch 5 - iter 1584/1984 - loss 0.18960056 - time (sec): 24.34 - samples/sec: 5407.03 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-18 20:23:46,294 epoch 5 - iter 1782/1984 - loss 0.18960480 - time (sec): 27.40 - samples/sec: 5392.96 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-18 20:23:49,245 epoch 5 - iter 1980/1984 - loss 0.18902609 - time (sec): 30.35 - samples/sec: 5390.97 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-18 20:23:49,308 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:23:49,308 EPOCH 5 done: loss 0.1890 - lr: 0.000017
+ 2023-10-18 20:23:51,124 DEV : loss 0.14417240023612976 - f1-score (micro avg) 0.5654
+ 2023-10-18 20:23:51,141 saving best model
+ 2023-10-18 20:23:51,174 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:23:54,192 epoch 6 - iter 198/1984 - loss 0.20691944 - time (sec): 3.02 - samples/sec: 5108.94 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-18 20:23:57,214 epoch 6 - iter 396/1984 - loss 0.18917630 - time (sec): 6.04 - samples/sec: 5249.12 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-18 20:24:00,217 epoch 6 - iter 594/1984 - loss 0.18665542 - time (sec): 9.04 - samples/sec: 5247.48 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-18 20:24:03,320 epoch 6 - iter 792/1984 - loss 0.18917319 - time (sec): 12.15 - samples/sec: 5235.56 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 20:24:06,415 epoch 6 - iter 990/1984 - loss 0.18919754 - time (sec): 15.24 - samples/sec: 5275.00 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 20:24:09,474 epoch 6 - iter 1188/1984 - loss 0.18272315 - time (sec): 18.30 - samples/sec: 5320.83 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 20:24:12,446 epoch 6 - iter 1386/1984 - loss 0.17852437 - time (sec): 21.27 - samples/sec: 5378.84 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-18 20:24:15,144 epoch 6 - iter 1584/1984 - loss 0.17782946 - time (sec): 23.97 - samples/sec: 5458.78 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-18 20:24:18,117 epoch 6 - iter 1782/1984 - loss 0.17992881 - time (sec): 26.94 - samples/sec: 5465.89 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-18 20:24:21,064 epoch 6 - iter 1980/1984 - loss 0.17716357 - time (sec): 29.89 - samples/sec: 5475.25 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-18 20:24:21,123 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:24:21,123 EPOCH 6 done: loss 0.1771 - lr: 0.000013
+ 2023-10-18 20:24:22,948 DEV : loss 0.14219804108142853 - f1-score (micro avg) 0.5755
+ 2023-10-18 20:24:22,965 saving best model
+ 2023-10-18 20:24:22,998 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:24:26,033 epoch 7 - iter 198/1984 - loss 0.21070221 - time (sec): 3.03 - samples/sec: 5279.32 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-18 20:24:28,996 epoch 7 - iter 396/1984 - loss 0.18898161 - time (sec): 6.00 - samples/sec: 5395.21 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-18 20:24:32,004 epoch 7 - iter 594/1984 - loss 0.18467547 - time (sec): 9.00 - samples/sec: 5381.83 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 20:24:35,059 epoch 7 - iter 792/1984 - loss 0.17499372 - time (sec): 12.06 - samples/sec: 5446.69 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 20:24:38,049 epoch 7 - iter 990/1984 - loss 0.17390168 - time (sec): 15.05 - samples/sec: 5481.36 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 20:24:41,030 epoch 7 - iter 1188/1984 - loss 0.17296841 - time (sec): 18.03 - samples/sec: 5453.01 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-18 20:24:44,215 epoch 7 - iter 1386/1984 - loss 0.16940126 - time (sec): 21.22 - samples/sec: 5422.00 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-18 20:24:47,216 epoch 7 - iter 1584/1984 - loss 0.16927956 - time (sec): 24.22 - samples/sec: 5404.10 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-18 20:24:50,282 epoch 7 - iter 1782/1984 - loss 0.16921141 - time (sec): 27.28 - samples/sec: 5385.26 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-18 20:24:53,363 epoch 7 - iter 1980/1984 - loss 0.16921911 - time (sec): 30.36 - samples/sec: 5393.77 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-18 20:24:53,427 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:24:53,427 EPOCH 7 done: loss 0.1692 - lr: 0.000010
+ 2023-10-18 20:24:55,548 DEV : loss 0.1411110907793045 - f1-score (micro avg) 0.5873
+ 2023-10-18 20:24:55,564 saving best model
+ 2023-10-18 20:24:55,599 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:24:58,631 epoch 8 - iter 198/1984 - loss 0.16894551 - time (sec): 3.03 - samples/sec: 5308.14 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-18 20:25:01,643 epoch 8 - iter 396/1984 - loss 0.16387924 - time (sec): 6.04 - samples/sec: 5258.74 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 20:25:04,674 epoch 8 - iter 594/1984 - loss 0.16514882 - time (sec): 9.07 - samples/sec: 5202.23 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 20:25:07,774 epoch 8 - iter 792/1984 - loss 0.16780131 - time (sec): 12.18 - samples/sec: 5274.47 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 20:25:10,830 epoch 8 - iter 990/1984 - loss 0.16733526 - time (sec): 15.23 - samples/sec: 5256.42 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-18 20:25:13,904 epoch 8 - iter 1188/1984 - loss 0.16373840 - time (sec): 18.30 - samples/sec: 5340.86 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-18 20:25:16,887 epoch 8 - iter 1386/1984 - loss 0.16181785 - time (sec): 21.29 - samples/sec: 5329.93 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-18 20:25:19,959 epoch 8 - iter 1584/1984 - loss 0.16227133 - time (sec): 24.36 - samples/sec: 5355.25 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-18 20:25:23,008 epoch 8 - iter 1782/1984 - loss 0.16335469 - time (sec): 27.41 - samples/sec: 5361.81 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-18 20:25:26,049 epoch 8 - iter 1980/1984 - loss 0.16311903 - time (sec): 30.45 - samples/sec: 5376.89 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-18 20:25:26,105 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:25:26,105 EPOCH 8 done: loss 0.1630 - lr: 0.000007
+ 2023-10-18 20:25:27,913 DEV : loss 0.14169389009475708 - f1-score (micro avg) 0.6026
+ 2023-10-18 20:25:27,932 saving best model
+ 2023-10-18 20:25:27,966 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:25:31,010 epoch 9 - iter 198/1984 - loss 0.14872512 - time (sec): 3.04 - samples/sec: 5336.13 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 20:25:34,053 epoch 9 - iter 396/1984 - loss 0.16471370 - time (sec): 6.09 - samples/sec: 5390.47 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 20:25:37,062 epoch 9 - iter 594/1984 - loss 0.16359773 - time (sec): 9.10 - samples/sec: 5307.53 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 20:25:40,148 epoch 9 - iter 792/1984 - loss 0.16213124 - time (sec): 12.18 - samples/sec: 5246.00 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-18 20:25:43,182 epoch 9 - iter 990/1984 - loss 0.16071329 - time (sec): 15.21 - samples/sec: 5310.46 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-18 20:25:46,211 epoch 9 - iter 1188/1984 - loss 0.16040075 - time (sec): 18.24 - samples/sec: 5307.42 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-18 20:25:49,259 epoch 9 - iter 1386/1984 - loss 0.15955613 - time (sec): 21.29 - samples/sec: 5367.69 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-18 20:25:52,294 epoch 9 - iter 1584/1984 - loss 0.16153404 - time (sec): 24.33 - samples/sec: 5350.28 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-18 20:25:55,337 epoch 9 - iter 1782/1984 - loss 0.16026350 - time (sec): 27.37 - samples/sec: 5356.94 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-18 20:25:58,420 epoch 9 - iter 1980/1984 - loss 0.16022030 - time (sec): 30.45 - samples/sec: 5375.18 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 20:25:58,483 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:25:58,483 EPOCH 9 done: loss 0.1602 - lr: 0.000003
+ 2023-10-18 20:26:00,280 DEV : loss 0.14077164232730865 - f1-score (micro avg) 0.5999
+ 2023-10-18 20:26:00,296 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:26:03,366 epoch 10 - iter 198/1984 - loss 0.15429908 - time (sec): 3.07 - samples/sec: 5117.20 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 20:26:06,590 epoch 10 - iter 396/1984 - loss 0.16153646 - time (sec): 6.29 - samples/sec: 5099.32 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 20:26:09,606 epoch 10 - iter 594/1984 - loss 0.16009665 - time (sec): 9.31 - samples/sec: 5150.65 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-18 20:26:12,646 epoch 10 - iter 792/1984 - loss 0.15349866 - time (sec): 12.35 - samples/sec: 5233.90 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-18 20:26:15,665 epoch 10 - iter 990/1984 - loss 0.15500561 - time (sec): 15.37 - samples/sec: 5272.80 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-18 20:26:18,672 epoch 10 - iter 1188/1984 - loss 0.15624539 - time (sec): 18.37 - samples/sec: 5268.47 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-18 20:26:21,752 epoch 10 - iter 1386/1984 - loss 0.15463573 - time (sec): 21.46 - samples/sec: 5328.03 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-18 20:26:24,848 epoch 10 - iter 1584/1984 - loss 0.15358641 - time (sec): 24.55 - samples/sec: 5300.75 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-18 20:26:27,969 epoch 10 - iter 1782/1984 - loss 0.15337495 - time (sec): 27.67 - samples/sec: 5306.87 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-18 20:26:30,832 epoch 10 - iter 1980/1984 - loss 0.15484121 - time (sec): 30.54 - samples/sec: 5357.87 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-18 20:26:30,891 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:26:30,891 EPOCH 10 done: loss 0.1550 - lr: 0.000000
+ 2023-10-18 20:26:32,699 DEV : loss 0.13979580998420715 - f1-score (micro avg) 0.6034
+ 2023-10-18 20:26:32,718 saving best model
+ 2023-10-18 20:26:32,781 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 20:26:32,782 Loading model from best epoch ...
+ 2023-10-18 20:26:32,869 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-18 20:26:34,366
+ Results:
+ - F-score (micro) 0.6068
+ - F-score (macro) 0.436
+ - Accuracy 0.4819
+
+ By class:
+ precision recall f1-score support
+
+ LOC 0.7375 0.6992 0.7179 655
+ PER 0.4011 0.6547 0.4974 223
+ ORG 0.2917 0.0551 0.0927 127
+
+ micro avg 0.6056 0.6080 0.6068 1005
+ macro avg 0.4768 0.4697 0.4360 1005
+ weighted avg 0.6065 0.6080 0.5900 1005
+
+ 2023-10-18 20:26:34,366 ----------------------------------------------------------------------------------------------------
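As a sanity check, the averaged scores in the final table follow directly from the per-class rows. A sketch using the rounded figures printed above (so recomputed values agree only to about four decimals):

```python
# (precision, recall, f1, support) per class, copied from the table above.
per_class = {
    "LOC": (0.7375, 0.6992, 0.7179, 655),
    "PER": (0.4011, 0.6547, 0.4974, 223),
    "ORG": (0.2917, 0.0551, 0.0927, 127),
}
total_support = 1005

# Micro F1 is the harmonic mean of the micro precision/recall row (0.6056 / 0.6080).
p, r = 0.6056, 0.6080
micro_f1 = 2 * p * r / (p + r)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)

# Weighted F1 weights each class F1 by its support.
weighted_f1 = sum(v[2] * v[3] for v in per_class.values()) / total_support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
```

This reproduces the reported 0.6068 micro, 0.4360 macro, and 0.5900 weighted F1, confirming the table is internally consistent despite the weak ORG recall (0.0551) dragging the macro average down.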