2024-03-26 09:43:20,711 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,711 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
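Note: the module tree printed above corresponds to a Flair SequenceTagger with no RNN, no CRF and no embedding reprojection, i.e. LockedDropout(0.5) followed by a single Linear(768 -> 17) layer on top of a 12-layer German BERT, trained with CrossEntropyLoss. A minimal construction sketch follows; the checkpoint name deepset/gbert-base and the label type "ner" are assumptions inferred from the base path and the tag dictionary at the end of this log, not stated in it.

    from flair.data import Dictionary
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # Label dictionary with the 17 BIOES tags listed at the end of this log.
    label_dict = Dictionary(add_unk=False)
    label_dict.add_item("O")
    for entity in ("Unternehmen", "Auslagerung", "Ort", "Software"):
        for prefix in ("S", "B", "E", "I"):
            label_dict.add_item(f"{prefix}-{entity}")

    # 12-layer, 768-dim German BERT backbone; the exact checkpoint name is an assumption.
    embeddings = TransformerWordEmbeddings("deepset/gbert-base", fine_tune=True)

    # No RNN, no CRF, no reprojection: the head is LockedDropout(0.5) plus
    # Linear(768 -> 17), matching the printed model.
    tagger = SequenceTagger(
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",              # assumed label type
        use_rnn=False,
        use_crf=False,
        reproject_embeddings=False,
    )
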
2024-03-26 09:43:20,712 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 Train:  758 sentences
2024-03-26 09:43:20,712         (train_with_dev=False, train_with_test=False)
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 Training Params:
2024-03-26 09:43:20,712  - learning_rate: "3e-05" 
2024-03-26 09:43:20,712  - mini_batch_size: "16"
2024-03-26 09:43:20,712  - max_epochs: "10"
2024-03-26 09:43:20,712  - shuffle: "True"
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 Plugins:
2024-03-26 09:43:20,712  - TensorboardLogger
2024-03-26 09:43:20,712  - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 09:43:20,712  - metric: "('micro avg', 'f1-score')"
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 Computation:
2024-03-26 09:43:20,712  - compute on device: cuda:0
2024-03-26 09:43:20,712  - embedding storage: none
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 Model training base path: "flair-co-funer-gbert_base-bs16-e10-lr3e-05-2"
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:20,712 ----------------------------------------------------------------------------------------------------
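Note: the setup dump above (corpus splits, training parameters, plugins, device, base path) can be reproduced roughly with the sketch below. The data location and column layout for the CO-Fun corpus are hypothetical and the TensorBoard plugin wiring is omitted; only the hyperparameters and the base path are taken from the log. Flair's fine_tune() uses a linear schedule with warmup by default, which matches the LinearScheduler entry (warmup_fraction 0.1).

    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Hypothetical column-format data folder; only the split sizes
    # (758 train / 94 dev / 96 test) are documented in the log.
    corpus = ColumnCorpus(
        "data/co-funer",                      # hypothetical path
        column_format={0: "text", 1: "ner"},  # hypothetical column layout
    )
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings("deepset/gbert-base", fine_tune=True)  # assumed checkpoint
    tagger = SequenceTagger(
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_rnn=False,
        use_crf=False,
        reproject_embeddings=False,
    )

    # Hyperparameters as logged above; TensorBoard logging was also enabled
    # in the original run (plugin setup omitted in this sketch).
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "flair-co-funer-gbert_base-bs16-e10-lr3e-05-2",
        learning_rate=3e-5,
        mini_batch_size=16,
        max_epochs=10,
    )
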
2024-03-26 09:43:20,712 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 09:43:22,441 epoch 1 - iter 4/48 - loss 3.54097837 - time (sec): 1.73 - samples/sec: 1747.68 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:43:24,541 epoch 1 - iter 8/48 - loss 3.46732419 - time (sec): 3.83 - samples/sec: 1621.38 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:43:26,386 epoch 1 - iter 12/48 - loss 3.37699881 - time (sec): 5.67 - samples/sec: 1571.13 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:43:28,400 epoch 1 - iter 16/48 - loss 3.22918949 - time (sec): 7.69 - samples/sec: 1578.39 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:43:30,598 epoch 1 - iter 20/48 - loss 3.05714722 - time (sec): 9.89 - samples/sec: 1545.81 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:43:33,642 epoch 1 - iter 24/48 - loss 2.91610601 - time (sec): 12.93 - samples/sec: 1405.97 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:43:36,053 epoch 1 - iter 28/48 - loss 2.77155856 - time (sec): 15.34 - samples/sec: 1389.33 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:43:36,876 epoch 1 - iter 32/48 - loss 2.67641615 - time (sec): 16.16 - samples/sec: 1444.62 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:43:38,140 epoch 1 - iter 36/48 - loss 2.56751661 - time (sec): 17.43 - samples/sec: 1500.56 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:43:40,010 epoch 1 - iter 40/48 - loss 2.47153744 - time (sec): 19.30 - samples/sec: 1507.46 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:43:41,898 epoch 1 - iter 44/48 - loss 2.37591552 - time (sec): 21.19 - samples/sec: 1508.12 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:43:43,251 epoch 1 - iter 48/48 - loss 2.29457447 - time (sec): 22.54 - samples/sec: 1529.47 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:43:43,251 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:43,251 EPOCH 1 done: loss 2.2946 - lr: 0.000029
2024-03-26 09:43:44,060 DEV : loss 0.868198812007904 - f1-score (micro avg)  0.3671
2024-03-26 09:43:44,061 saving best model
2024-03-26 09:43:44,339 ----------------------------------------------------------------------------------------------------
2024-03-26 09:43:45,643 epoch 2 - iter 4/48 - loss 1.21391336 - time (sec): 1.30 - samples/sec: 2226.04 - lr: 0.000030 - momentum: 0.000000
2024-03-26 09:43:47,473 epoch 2 - iter 8/48 - loss 1.01735114 - time (sec): 3.13 - samples/sec: 1946.30 - lr: 0.000030 - momentum: 0.000000
2024-03-26 09:43:50,883 epoch 2 - iter 12/48 - loss 0.91073315 - time (sec): 6.54 - samples/sec: 1555.26 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:43:53,349 epoch 2 - iter 16/48 - loss 0.84066048 - time (sec): 9.01 - samples/sec: 1478.39 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:43:55,980 epoch 2 - iter 20/48 - loss 0.78893352 - time (sec): 11.64 - samples/sec: 1427.19 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:43:57,851 epoch 2 - iter 24/48 - loss 0.74386229 - time (sec): 13.51 - samples/sec: 1426.93 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:43:59,611 epoch 2 - iter 28/48 - loss 0.72707811 - time (sec): 15.27 - samples/sec: 1436.22 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:44:01,311 epoch 2 - iter 32/48 - loss 0.70317913 - time (sec): 16.97 - samples/sec: 1449.78 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:44:03,143 epoch 2 - iter 36/48 - loss 0.68201229 - time (sec): 18.80 - samples/sec: 1458.93 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:44:04,149 epoch 2 - iter 40/48 - loss 0.66472757 - time (sec): 19.81 - samples/sec: 1506.96 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:44:05,571 epoch 2 - iter 44/48 - loss 0.65567832 - time (sec): 21.23 - samples/sec: 1526.81 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:44:07,083 epoch 2 - iter 48/48 - loss 0.63834426 - time (sec): 22.74 - samples/sec: 1515.70 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:44:07,083 ----------------------------------------------------------------------------------------------------
2024-03-26 09:44:07,083 EPOCH 2 done: loss 0.6383 - lr: 0.000027
2024-03-26 09:44:07,970 DEV : loss 0.3501518666744232 - f1-score (micro avg)  0.7641
2024-03-26 09:44:07,971 saving best model
2024-03-26 09:44:08,428 ----------------------------------------------------------------------------------------------------
2024-03-26 09:44:11,061 epoch 3 - iter 4/48 - loss 0.38909483 - time (sec): 2.63 - samples/sec: 1143.78 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:44:13,193 epoch 3 - iter 8/48 - loss 0.38382097 - time (sec): 4.76 - samples/sec: 1333.23 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:44:14,762 epoch 3 - iter 12/48 - loss 0.39785448 - time (sec): 6.33 - samples/sec: 1401.18 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:44:16,497 epoch 3 - iter 16/48 - loss 0.36688780 - time (sec): 8.07 - samples/sec: 1408.90 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:44:17,643 epoch 3 - iter 20/48 - loss 0.36697264 - time (sec): 9.21 - samples/sec: 1485.03 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:44:19,479 epoch 3 - iter 24/48 - loss 0.37183904 - time (sec): 11.05 - samples/sec: 1489.25 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:44:21,917 epoch 3 - iter 28/48 - loss 0.36591246 - time (sec): 13.49 - samples/sec: 1434.46 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:44:23,760 epoch 3 - iter 32/48 - loss 0.35952677 - time (sec): 15.33 - samples/sec: 1444.19 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:44:25,187 epoch 3 - iter 36/48 - loss 0.35091038 - time (sec): 16.76 - samples/sec: 1478.74 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:44:27,447 epoch 3 - iter 40/48 - loss 0.33792769 - time (sec): 19.02 - samples/sec: 1451.82 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:44:30,704 epoch 3 - iter 44/48 - loss 0.31414364 - time (sec): 22.27 - samples/sec: 1446.67 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:44:31,961 epoch 3 - iter 48/48 - loss 0.30774544 - time (sec): 23.53 - samples/sec: 1464.97 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:44:31,961 ----------------------------------------------------------------------------------------------------
2024-03-26 09:44:31,961 EPOCH 3 done: loss 0.3077 - lr: 0.000023
2024-03-26 09:44:32,882 DEV : loss 0.26573842763900757 - f1-score (micro avg)  0.8386
2024-03-26 09:44:32,883 saving best model
2024-03-26 09:44:33,332 ----------------------------------------------------------------------------------------------------
2024-03-26 09:44:34,890 epoch 4 - iter 4/48 - loss 0.27499759 - time (sec): 1.56 - samples/sec: 1638.26 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:44:37,093 epoch 4 - iter 8/48 - loss 0.24221231 - time (sec): 3.76 - samples/sec: 1594.15 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:44:38,337 epoch 4 - iter 12/48 - loss 0.23930651 - time (sec): 5.00 - samples/sec: 1670.67 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:44:40,546 epoch 4 - iter 16/48 - loss 0.23924867 - time (sec): 7.21 - samples/sec: 1563.30 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:44:43,067 epoch 4 - iter 20/48 - loss 0.22689797 - time (sec): 9.73 - samples/sec: 1436.47 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:44:45,085 epoch 4 - iter 24/48 - loss 0.23176516 - time (sec): 11.75 - samples/sec: 1432.56 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:44:47,195 epoch 4 - iter 28/48 - loss 0.22426201 - time (sec): 13.86 - samples/sec: 1435.28 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:44:49,751 epoch 4 - iter 32/48 - loss 0.22093128 - time (sec): 16.42 - samples/sec: 1404.63 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:44:52,547 epoch 4 - iter 36/48 - loss 0.21058242 - time (sec): 19.21 - samples/sec: 1392.25 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:44:54,235 epoch 4 - iter 40/48 - loss 0.20472930 - time (sec): 20.90 - samples/sec: 1391.98 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:44:56,221 epoch 4 - iter 44/48 - loss 0.20327120 - time (sec): 22.89 - samples/sec: 1394.79 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:44:57,880 epoch 4 - iter 48/48 - loss 0.20204122 - time (sec): 24.55 - samples/sec: 1404.36 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:44:57,880 ----------------------------------------------------------------------------------------------------
2024-03-26 09:44:57,881 EPOCH 4 done: loss 0.2020 - lr: 0.000020
2024-03-26 09:44:58,784 DEV : loss 0.2160491645336151 - f1-score (micro avg)  0.8654
2024-03-26 09:44:58,785 saving best model
2024-03-26 09:44:59,236 ----------------------------------------------------------------------------------------------------
2024-03-26 09:45:00,059 epoch 5 - iter 4/48 - loss 0.11408889 - time (sec): 0.82 - samples/sec: 2231.23 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:45:01,428 epoch 5 - iter 8/48 - loss 0.15205366 - time (sec): 2.19 - samples/sec: 2030.58 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:45:04,164 epoch 5 - iter 12/48 - loss 0.14478600 - time (sec): 4.93 - samples/sec: 1619.74 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:45:07,113 epoch 5 - iter 16/48 - loss 0.14042386 - time (sec): 7.88 - samples/sec: 1432.90 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:45:08,490 epoch 5 - iter 20/48 - loss 0.14804854 - time (sec): 9.25 - samples/sec: 1483.74 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:45:10,941 epoch 5 - iter 24/48 - loss 0.14514482 - time (sec): 11.70 - samples/sec: 1431.68 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:45:13,001 epoch 5 - iter 28/48 - loss 0.14291018 - time (sec): 13.76 - samples/sec: 1419.68 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:45:15,244 epoch 5 - iter 32/48 - loss 0.14292917 - time (sec): 16.01 - samples/sec: 1447.11 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:45:16,701 epoch 5 - iter 36/48 - loss 0.14908240 - time (sec): 17.46 - samples/sec: 1470.91 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:45:19,231 epoch 5 - iter 40/48 - loss 0.14377708 - time (sec): 19.99 - samples/sec: 1420.95 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:45:21,304 epoch 5 - iter 44/48 - loss 0.14312629 - time (sec): 22.07 - samples/sec: 1433.66 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:45:23,247 epoch 5 - iter 48/48 - loss 0.14304449 - time (sec): 24.01 - samples/sec: 1435.77 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:45:23,247 ----------------------------------------------------------------------------------------------------
2024-03-26 09:45:23,247 EPOCH 5 done: loss 0.1430 - lr: 0.000017
2024-03-26 09:45:24,148 DEV : loss 0.19238987565040588 - f1-score (micro avg)  0.8786
2024-03-26 09:45:24,150 saving best model
2024-03-26 09:45:24,606 ----------------------------------------------------------------------------------------------------
2024-03-26 09:45:26,184 epoch 6 - iter 4/48 - loss 0.11206717 - time (sec): 1.58 - samples/sec: 1579.51 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:45:28,582 epoch 6 - iter 8/48 - loss 0.12248471 - time (sec): 3.97 - samples/sec: 1610.52 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:45:30,514 epoch 6 - iter 12/48 - loss 0.12381849 - time (sec): 5.91 - samples/sec: 1533.89 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:45:32,523 epoch 6 - iter 16/48 - loss 0.11636626 - time (sec): 7.91 - samples/sec: 1532.16 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:45:35,264 epoch 6 - iter 20/48 - loss 0.11487295 - time (sec): 10.66 - samples/sec: 1499.36 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:45:36,772 epoch 6 - iter 24/48 - loss 0.12472605 - time (sec): 12.16 - samples/sec: 1521.80 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:45:38,138 epoch 6 - iter 28/48 - loss 0.12552974 - time (sec): 13.53 - samples/sec: 1527.57 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:45:39,301 epoch 6 - iter 32/48 - loss 0.12340584 - time (sec): 14.69 - samples/sec: 1548.39 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:45:40,766 epoch 6 - iter 36/48 - loss 0.11845859 - time (sec): 16.16 - samples/sec: 1580.23 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:45:42,672 epoch 6 - iter 40/48 - loss 0.11916775 - time (sec): 18.06 - samples/sec: 1568.82 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:45:44,858 epoch 6 - iter 44/48 - loss 0.11527571 - time (sec): 20.25 - samples/sec: 1587.88 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:45:46,508 epoch 6 - iter 48/48 - loss 0.11522214 - time (sec): 21.90 - samples/sec: 1574.13 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:45:46,508 ----------------------------------------------------------------------------------------------------
2024-03-26 09:45:46,508 EPOCH 6 done: loss 0.1152 - lr: 0.000014
2024-03-26 09:45:47,403 DEV : loss 0.17289206385612488 - f1-score (micro avg)  0.8941
2024-03-26 09:45:47,404 saving best model
2024-03-26 09:45:47,857 ----------------------------------------------------------------------------------------------------
2024-03-26 09:45:49,548 epoch 7 - iter 4/48 - loss 0.07200689 - time (sec): 1.69 - samples/sec: 1441.68 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:45:51,129 epoch 7 - iter 8/48 - loss 0.09330083 - time (sec): 3.27 - samples/sec: 1514.70 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:45:53,214 epoch 7 - iter 12/48 - loss 0.09038192 - time (sec): 5.36 - samples/sec: 1469.64 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:45:55,204 epoch 7 - iter 16/48 - loss 0.09096213 - time (sec): 7.35 - samples/sec: 1516.73 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:45:55,835 epoch 7 - iter 20/48 - loss 0.08593354 - time (sec): 7.98 - samples/sec: 1624.73 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:45:57,400 epoch 7 - iter 24/48 - loss 0.08538935 - time (sec): 9.54 - samples/sec: 1605.80 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:46:00,234 epoch 7 - iter 28/48 - loss 0.08421075 - time (sec): 12.38 - samples/sec: 1504.87 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:46:02,989 epoch 7 - iter 32/48 - loss 0.08232354 - time (sec): 15.13 - samples/sec: 1431.85 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:46:05,703 epoch 7 - iter 36/48 - loss 0.08605663 - time (sec): 17.84 - samples/sec: 1444.74 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:46:07,663 epoch 7 - iter 40/48 - loss 0.08909139 - time (sec): 19.80 - samples/sec: 1451.56 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:46:10,190 epoch 7 - iter 44/48 - loss 0.09175837 - time (sec): 22.33 - samples/sec: 1426.44 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:46:11,923 epoch 7 - iter 48/48 - loss 0.09047578 - time (sec): 24.06 - samples/sec: 1432.51 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:46:11,923 ----------------------------------------------------------------------------------------------------
2024-03-26 09:46:11,923 EPOCH 7 done: loss 0.0905 - lr: 0.000010
2024-03-26 09:46:12,819 DEV : loss 0.17756354808807373 - f1-score (micro avg)  0.8921
2024-03-26 09:46:12,820 ----------------------------------------------------------------------------------------------------
2024-03-26 09:46:15,432 epoch 8 - iter 4/48 - loss 0.09145265 - time (sec): 2.61 - samples/sec: 1264.58 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:46:17,478 epoch 8 - iter 8/48 - loss 0.06932129 - time (sec): 4.66 - samples/sec: 1259.86 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:46:20,637 epoch 8 - iter 12/48 - loss 0.06960067 - time (sec): 7.82 - samples/sec: 1239.81 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:46:22,554 epoch 8 - iter 16/48 - loss 0.08062008 - time (sec): 9.73 - samples/sec: 1268.10 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:46:23,990 epoch 8 - iter 20/48 - loss 0.08058957 - time (sec): 11.17 - samples/sec: 1314.68 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:46:26,394 epoch 8 - iter 24/48 - loss 0.07978568 - time (sec): 13.57 - samples/sec: 1314.65 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:46:28,134 epoch 8 - iter 28/48 - loss 0.08353391 - time (sec): 15.31 - samples/sec: 1350.27 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:46:29,789 epoch 8 - iter 32/48 - loss 0.08022088 - time (sec): 16.97 - samples/sec: 1371.01 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:46:31,061 epoch 8 - iter 36/48 - loss 0.07954960 - time (sec): 18.24 - samples/sec: 1402.54 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:46:33,360 epoch 8 - iter 40/48 - loss 0.07805199 - time (sec): 20.54 - samples/sec: 1411.53 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:46:36,162 epoch 8 - iter 44/48 - loss 0.07479444 - time (sec): 23.34 - samples/sec: 1380.24 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:46:38,065 epoch 8 - iter 48/48 - loss 0.07452892 - time (sec): 25.24 - samples/sec: 1365.52 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:46:38,065 ----------------------------------------------------------------------------------------------------
2024-03-26 09:46:38,065 EPOCH 8 done: loss 0.0745 - lr: 0.000007
2024-03-26 09:46:38,956 DEV : loss 0.1639167219400406 - f1-score (micro avg)  0.9223
2024-03-26 09:46:38,957 saving best model
2024-03-26 09:46:39,402 ----------------------------------------------------------------------------------------------------
2024-03-26 09:46:41,189 epoch 9 - iter 4/48 - loss 0.08119094 - time (sec): 1.78 - samples/sec: 1593.39 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:46:43,560 epoch 9 - iter 8/48 - loss 0.06826402 - time (sec): 4.16 - samples/sec: 1475.47 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:46:45,875 epoch 9 - iter 12/48 - loss 0.08047549 - time (sec): 6.47 - samples/sec: 1426.58 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:46:47,883 epoch 9 - iter 16/48 - loss 0.07774196 - time (sec): 8.48 - samples/sec: 1426.44 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:46:49,315 epoch 9 - iter 20/48 - loss 0.06981163 - time (sec): 9.91 - samples/sec: 1487.02 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:46:50,501 epoch 9 - iter 24/48 - loss 0.06576084 - time (sec): 11.10 - samples/sec: 1535.18 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:46:52,176 epoch 9 - iter 28/48 - loss 0.06423120 - time (sec): 12.77 - samples/sec: 1548.47 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:46:54,403 epoch 9 - iter 32/48 - loss 0.06847394 - time (sec): 15.00 - samples/sec: 1533.11 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:46:57,049 epoch 9 - iter 36/48 - loss 0.06724059 - time (sec): 17.64 - samples/sec: 1480.42 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:46:59,934 epoch 9 - iter 40/48 - loss 0.06646050 - time (sec): 20.53 - samples/sec: 1435.67 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:47:01,715 epoch 9 - iter 44/48 - loss 0.06605033 - time (sec): 22.31 - samples/sec: 1451.29 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:47:02,735 epoch 9 - iter 48/48 - loss 0.06612071 - time (sec): 23.33 - samples/sec: 1477.52 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:47:02,735 ----------------------------------------------------------------------------------------------------
2024-03-26 09:47:02,735 EPOCH 9 done: loss 0.0661 - lr: 0.000004
2024-03-26 09:47:03,634 DEV : loss 0.15946133434772491 - f1-score (micro avg)  0.9256
2024-03-26 09:47:03,635 saving best model
2024-03-26 09:47:04,087 ----------------------------------------------------------------------------------------------------
2024-03-26 09:47:06,375 epoch 10 - iter 4/48 - loss 0.03203042 - time (sec): 2.29 - samples/sec: 1444.51 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:47:08,406 epoch 10 - iter 8/48 - loss 0.04820710 - time (sec): 4.32 - samples/sec: 1430.94 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:47:10,303 epoch 10 - iter 12/48 - loss 0.04809610 - time (sec): 6.21 - samples/sec: 1420.07 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:47:11,525 epoch 10 - iter 16/48 - loss 0.05153512 - time (sec): 7.44 - samples/sec: 1481.96 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:47:13,416 epoch 10 - iter 20/48 - loss 0.05993410 - time (sec): 9.33 - samples/sec: 1469.67 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:47:15,606 epoch 10 - iter 24/48 - loss 0.06493684 - time (sec): 11.52 - samples/sec: 1441.99 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:47:16,480 epoch 10 - iter 28/48 - loss 0.06441247 - time (sec): 12.39 - samples/sec: 1516.33 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:47:17,728 epoch 10 - iter 32/48 - loss 0.06219799 - time (sec): 13.64 - samples/sec: 1557.26 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:47:20,461 epoch 10 - iter 36/48 - loss 0.05975149 - time (sec): 16.37 - samples/sec: 1508.27 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:47:22,867 epoch 10 - iter 40/48 - loss 0.06004247 - time (sec): 18.78 - samples/sec: 1531.22 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:47:25,399 epoch 10 - iter 44/48 - loss 0.05988235 - time (sec): 21.31 - samples/sec: 1505.10 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:47:27,307 epoch 10 - iter 48/48 - loss 0.05921491 - time (sec): 23.22 - samples/sec: 1484.68 - lr: 0.000000 - momentum: 0.000000
2024-03-26 09:47:27,308 ----------------------------------------------------------------------------------------------------
2024-03-26 09:47:27,308 EPOCH 10 done: loss 0.0592 - lr: 0.000000
2024-03-26 09:47:28,208 DEV : loss 0.1646251529455185 - f1-score (micro avg)  0.9228
2024-03-26 09:47:28,489 ----------------------------------------------------------------------------------------------------
2024-03-26 09:47:28,490 Loading model from best epoch ...
2024-03-26 09:47:29,430 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 09:47:30,175 
Results:
- F-score (micro) 0.8969
- F-score (macro) 0.6818
- Accuracy 0.8175

By class:
              precision    recall  f1-score   support

 Unternehmen     0.9105    0.8797    0.8948       266
 Auslagerung     0.8371    0.8876    0.8616       249
         Ort     0.9565    0.9851    0.9706       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.8894    0.9045    0.8969       649
   macro avg     0.6760    0.6881    0.6818       649
weighted avg     0.8919    0.9045    0.8977       649

2024-03-26 09:47:30,176 ----------------------------------------------------------------------------------------------------
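Note: the best checkpoint saved under the base path can be loaded and applied as in the sketch below; the German example sentence is hypothetical and the label type "ner" is assumed. The entity types it predicts are those in the tag dictionary printed above (Unternehmen, Auslagerung, Ort, Software).

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint saved as "best model" during the run above.
    tagger = SequenceTagger.load("flair-co-funer-gbert_base-bs16-e10-lr3e-05-2/best-model.pt")

    # Hypothetical example sentence.
    sentence = Sentence("Die Bank lagert ihre IT-Dienstleistungen an die Musterfirma GmbH in Frankfurt aus.")
    tagger.predict(sentence)

    for entity in sentence.get_spans("ner"):
        print(entity)  # prints span text, predicted tag and confidence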