hamedkhaledi committed
Commit 816ca68
1 Parent(s): 0b5f626

Update model

Files changed (3)
  1. loss.tsv +11 -0
  2. pytorch_model.bin +3 -0
  3. training.log +522 -0
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP BAD_EPOCHS LEARNING_RATE TRAIN_LOSS
+ 1 15:21:48 0 0.1000 0.27953692015655984
+ 2 15:31:22 0 0.1000 0.15365227826273328
+ 3 15:41:06 0 0.1000 0.12001519515322241
+ 4 15:50:54 0 0.1000 0.10328522111398844
+ 5 16:00:45 0 0.1000 0.09241386713466632
+ 6 16:10:29 0 0.1000 0.08505490679055881
+ 7 16:20:25 0 0.1000 0.07861811519301767
+ 8 16:30:21 0 0.1000 0.07341135664633389
+ 9 16:40:13 0 0.1000 0.06911533349940868
+ 10 16:50:01 0 0.1000 0.06593435410093888
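
A minimal sketch of how the loss.tsv rows above could be read back and sanity-checked. It assumes the columns are whitespace-separated as they appear in this diff; the names `LOSS_TSV`, `parse_loss_table`, `records`, `is_monotone`, and `total_drop` are hypothetical and are not part of the commit.

```python
# Rows copied verbatim from the loss.tsv diff; the separator is assumed to be whitespace.
LOSS_TSV = """\
EPOCH TIMESTAMP BAD_EPOCHS LEARNING_RATE TRAIN_LOSS
1 15:21:48 0 0.1000 0.27953692015655984
2 15:31:22 0 0.1000 0.15365227826273328
3 15:41:06 0 0.1000 0.12001519515322241
4 15:50:54 0 0.1000 0.10328522111398844
5 16:00:45 0 0.1000 0.09241386713466632
6 16:10:29 0 0.1000 0.08505490679055881
7 16:20:25 0 0.1000 0.07861811519301767
8 16:30:21 0 0.1000 0.07341135664633389
9 16:40:13 0 0.1000 0.06911533349940868
10 16:50:01 0 0.1000 0.06593435410093888
"""


def parse_loss_table(text):
    """Split the header from the data rows and return one dict per epoch."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


records = parse_loss_table(LOSS_TSV)
losses = [float(r["TRAIN_LOSS"]) for r in records]

# True when every epoch has a strictly lower training loss than the one before.
is_monotone = all(prev > cur for prev, cur in zip(losses, losses[1:]))

# Relative reduction in training loss between the first and the last epoch.
total_drop = 1 - losses[-1] / losses[0]

print(f"epochs={len(records)} monotone={is_monotone} drop={total_drop:.1%}")
```

The check only covers training loss, which is all this file records; it says nothing about held-out performance.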
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:698a7ca2b501a853c807de4defc42901968d932393a86d6a636d5ff4346dc54a
+ size 494428971
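
The three lines above are a Git LFS pointer rather than the model weights themselves. As a hedged illustration, the sketch below parses such a pointer into its fields; `POINTER` and `parse_lfs_pointer` are hypothetical names introduced here, and the parser only handles the three keys shown in this diff.

```python
# Pointer text copied from the pytorch_model.bin diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:698a7ca2b501a853c807de4defc42901968d932393a86d6a636d5ff4346dc54a
size 494428971
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(POINTER)
size_mib = pointer["size"] / 2**20
print(f"{pointer['algo']} digest, {size_mib:.1f} MiB")
```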
training.log ADDED
@@ -0,0 +1,522 @@
1
+ 2022-08-06 15:12:29,180 ----------------------------------------------------------------------------------------------------
2
+ 2022-08-06 15:12:29,182 Model: "SequenceTagger(
3
+ (embeddings): TransformerWordEmbeddings(
4
+ (model): BertModel(
5
+ (embeddings): BertEmbeddings(
6
+ (word_embeddings): Embedding(42000, 768, padding_idx=0)
7
+ (position_embeddings): Embedding(512, 768)
8
+ (token_type_embeddings): Embedding(2, 768)
9
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
10
+ (dropout): Dropout(p=0.1, inplace=False)
11
+ )
12
+ (encoder): BertEncoder(
13
+ (layer): ModuleList(
14
+ (0): BertLayer(
15
+ (attention): BertAttention(
16
+ (self): BertSelfAttention(
17
+ (query): Linear(in_features=768, out_features=768, bias=True)
18
+ (key): Linear(in_features=768, out_features=768, bias=True)
19
+ (value): Linear(in_features=768, out_features=768, bias=True)
20
+ (dropout): Dropout(p=0.1, inplace=False)
21
+ )
22
+ (output): BertSelfOutput(
23
+ (dense): Linear(in_features=768, out_features=768, bias=True)
24
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
25
+ (dropout): Dropout(p=0.1, inplace=False)
26
+ )
27
+ )
28
+ (intermediate): BertIntermediate(
29
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
30
+ (intermediate_act_fn): GELUActivation()
31
+ )
32
+ (output): BertOutput(
33
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
34
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
35
+ (dropout): Dropout(p=0.1, inplace=False)
36
+ )
37
+ )
38
+ (1): BertLayer(
39
+ (attention): BertAttention(
40
+ (self): BertSelfAttention(
41
+ (query): Linear(in_features=768, out_features=768, bias=True)
42
+ (key): Linear(in_features=768, out_features=768, bias=True)
43
+ (value): Linear(in_features=768, out_features=768, bias=True)
44
+ (dropout): Dropout(p=0.1, inplace=False)
45
+ )
46
+ (output): BertSelfOutput(
47
+ (dense): Linear(in_features=768, out_features=768, bias=True)
48
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
49
+ (dropout): Dropout(p=0.1, inplace=False)
50
+ )
51
+ )
52
+ (intermediate): BertIntermediate(
53
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
54
+ (intermediate_act_fn): GELUActivation()
55
+ )
56
+ (output): BertOutput(
57
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
58
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
59
+ (dropout): Dropout(p=0.1, inplace=False)
60
+ )
61
+ )
62
+ (2): BertLayer(
63
+ (attention): BertAttention(
64
+ (self): BertSelfAttention(
65
+ (query): Linear(in_features=768, out_features=768, bias=True)
66
+ (key): Linear(in_features=768, out_features=768, bias=True)
67
+ (value): Linear(in_features=768, out_features=768, bias=True)
68
+ (dropout): Dropout(p=0.1, inplace=False)
69
+ )
70
+ (output): BertSelfOutput(
71
+ (dense): Linear(in_features=768, out_features=768, bias=True)
72
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
73
+ (dropout): Dropout(p=0.1, inplace=False)
74
+ )
75
+ )
76
+ (intermediate): BertIntermediate(
77
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
78
+ (intermediate_act_fn): GELUActivation()
79
+ )
80
+ (output): BertOutput(
81
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
82
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
83
+ (dropout): Dropout(p=0.1, inplace=False)
84
+ )
85
+ )
86
+ (3): BertLayer(
87
+ (attention): BertAttention(
88
+ (self): BertSelfAttention(
89
+ (query): Linear(in_features=768, out_features=768, bias=True)
90
+ (key): Linear(in_features=768, out_features=768, bias=True)
91
+ (value): Linear(in_features=768, out_features=768, bias=True)
92
+ (dropout): Dropout(p=0.1, inplace=False)
93
+ )
94
+ (output): BertSelfOutput(
95
+ (dense): Linear(in_features=768, out_features=768, bias=True)
96
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
97
+ (dropout): Dropout(p=0.1, inplace=False)
98
+ )
99
+ )
100
+ (intermediate): BertIntermediate(
101
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
102
+ (intermediate_act_fn): GELUActivation()
103
+ )
104
+ (output): BertOutput(
105
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
106
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
107
+ (dropout): Dropout(p=0.1, inplace=False)
108
+ )
109
+ )
110
+ (4): BertLayer(
111
+ (attention): BertAttention(
112
+ (self): BertSelfAttention(
113
+ (query): Linear(in_features=768, out_features=768, bias=True)
114
+ (key): Linear(in_features=768, out_features=768, bias=True)
115
+ (value): Linear(in_features=768, out_features=768, bias=True)
116
+ (dropout): Dropout(p=0.1, inplace=False)
117
+ )
118
+ (output): BertSelfOutput(
119
+ (dense): Linear(in_features=768, out_features=768, bias=True)
120
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
121
+ (dropout): Dropout(p=0.1, inplace=False)
122
+ )
123
+ )
124
+ (intermediate): BertIntermediate(
125
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
126
+ (intermediate_act_fn): GELUActivation()
127
+ )
128
+ (output): BertOutput(
129
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
130
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
131
+ (dropout): Dropout(p=0.1, inplace=False)
132
+ )
133
+ )
134
+ (5): BertLayer(
135
+ (attention): BertAttention(
136
+ (self): BertSelfAttention(
137
+ (query): Linear(in_features=768, out_features=768, bias=True)
138
+ (key): Linear(in_features=768, out_features=768, bias=True)
139
+ (value): Linear(in_features=768, out_features=768, bias=True)
140
+ (dropout): Dropout(p=0.1, inplace=False)
141
+ )
142
+ (output): BertSelfOutput(
143
+ (dense): Linear(in_features=768, out_features=768, bias=True)
144
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
145
+ (dropout): Dropout(p=0.1, inplace=False)
146
+ )
147
+ )
148
+ (intermediate): BertIntermediate(
149
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
150
+ (intermediate_act_fn): GELUActivation()
151
+ )
152
+ (output): BertOutput(
153
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
154
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
155
+ (dropout): Dropout(p=0.1, inplace=False)
156
+ )
157
+ )
158
+ (6): BertLayer(
159
+ (attention): BertAttention(
160
+ (self): BertSelfAttention(
161
+ (query): Linear(in_features=768, out_features=768, bias=True)
162
+ (key): Linear(in_features=768, out_features=768, bias=True)
163
+ (value): Linear(in_features=768, out_features=768, bias=True)
164
+ (dropout): Dropout(p=0.1, inplace=False)
165
+ )
166
+ (output): BertSelfOutput(
167
+ (dense): Linear(in_features=768, out_features=768, bias=True)
168
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
169
+ (dropout): Dropout(p=0.1, inplace=False)
170
+ )
171
+ )
172
+ (intermediate): BertIntermediate(
173
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
174
+ (intermediate_act_fn): GELUActivation()
175
+ )
176
+ (output): BertOutput(
177
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
178
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
179
+ (dropout): Dropout(p=0.1, inplace=False)
180
+ )
181
+ )
182
+ (7): BertLayer(
183
+ (attention): BertAttention(
184
+ (self): BertSelfAttention(
185
+ (query): Linear(in_features=768, out_features=768, bias=True)
186
+ (key): Linear(in_features=768, out_features=768, bias=True)
187
+ (value): Linear(in_features=768, out_features=768, bias=True)
188
+ (dropout): Dropout(p=0.1, inplace=False)
189
+ )
190
+ (output): BertSelfOutput(
191
+ (dense): Linear(in_features=768, out_features=768, bias=True)
192
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
193
+ (dropout): Dropout(p=0.1, inplace=False)
194
+ )
195
+ )
196
+ (intermediate): BertIntermediate(
197
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
198
+ (intermediate_act_fn): GELUActivation()
199
+ )
200
+ (output): BertOutput(
201
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
202
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
203
+ (dropout): Dropout(p=0.1, inplace=False)
204
+ )
205
+ )
206
+ (8): BertLayer(
207
+ (attention): BertAttention(
208
+ (self): BertSelfAttention(
209
+ (query): Linear(in_features=768, out_features=768, bias=True)
210
+ (key): Linear(in_features=768, out_features=768, bias=True)
211
+ (value): Linear(in_features=768, out_features=768, bias=True)
212
+ (dropout): Dropout(p=0.1, inplace=False)
213
+ )
214
+ (output): BertSelfOutput(
215
+ (dense): Linear(in_features=768, out_features=768, bias=True)
216
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
217
+ (dropout): Dropout(p=0.1, inplace=False)
218
+ )
219
+ )
220
+ (intermediate): BertIntermediate(
221
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
222
+ (intermediate_act_fn): GELUActivation()
223
+ )
224
+ (output): BertOutput(
225
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
226
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
227
+ (dropout): Dropout(p=0.1, inplace=False)
228
+ )
229
+ )
230
+ (9): BertLayer(
231
+ (attention): BertAttention(
232
+ (self): BertSelfAttention(
233
+ (query): Linear(in_features=768, out_features=768, bias=True)
234
+ (key): Linear(in_features=768, out_features=768, bias=True)
235
+ (value): Linear(in_features=768, out_features=768, bias=True)
236
+ (dropout): Dropout(p=0.1, inplace=False)
237
+ )
238
+ (output): BertSelfOutput(
239
+ (dense): Linear(in_features=768, out_features=768, bias=True)
240
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
241
+ (dropout): Dropout(p=0.1, inplace=False)
242
+ )
243
+ )
244
+ (intermediate): BertIntermediate(
245
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
246
+ (intermediate_act_fn): GELUActivation()
247
+ )
248
+ (output): BertOutput(
249
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
250
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
251
+ (dropout): Dropout(p=0.1, inplace=False)
252
+ )
253
+ )
254
+ (10): BertLayer(
255
+ (attention): BertAttention(
256
+ (self): BertSelfAttention(
257
+ (query): Linear(in_features=768, out_features=768, bias=True)
258
+ (key): Linear(in_features=768, out_features=768, bias=True)
259
+ (value): Linear(in_features=768, out_features=768, bias=True)
260
+ (dropout): Dropout(p=0.1, inplace=False)
261
+ )
262
+ (output): BertSelfOutput(
263
+ (dense): Linear(in_features=768, out_features=768, bias=True)
264
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
265
+ (dropout): Dropout(p=0.1, inplace=False)
266
+ )
267
+ )
268
+ (intermediate): BertIntermediate(
269
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
270
+ (intermediate_act_fn): GELUActivation()
271
+ )
272
+ (output): BertOutput(
273
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
274
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
275
+ (dropout): Dropout(p=0.1, inplace=False)
276
+ )
277
+ )
278
+ (11): BertLayer(
279
+ (attention): BertAttention(
280
+ (self): BertSelfAttention(
281
+ (query): Linear(in_features=768, out_features=768, bias=True)
282
+ (key): Linear(in_features=768, out_features=768, bias=True)
283
+ (value): Linear(in_features=768, out_features=768, bias=True)
284
+ (dropout): Dropout(p=0.1, inplace=False)
285
+ )
286
+ (output): BertSelfOutput(
287
+ (dense): Linear(in_features=768, out_features=768, bias=True)
288
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
289
+ (dropout): Dropout(p=0.1, inplace=False)
290
+ )
291
+ )
292
+ (intermediate): BertIntermediate(
293
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
294
+ (intermediate_act_fn): GELUActivation()
295
+ )
296
+ (output): BertOutput(
297
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
298
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
299
+ (dropout): Dropout(p=0.1, inplace=False)
300
+ )
301
+ )
302
+ )
303
+ )
304
+ (pooler): BertPooler(
305
+ (dense): Linear(in_features=768, out_features=768, bias=True)
306
+ (activation): Tanh()
307
+ )
308
+ )
309
+ )
310
+ (word_dropout): WordDropout(p=0.05)
311
+ (locked_dropout): LockedDropout(p=0.5)
312
+ (rnn): LSTM(768, 512, batch_first=True, bidirectional=True)
313
+ (linear): Linear(in_features=1024, out_features=30, bias=True)
314
+ (beta): 1.0
315
+ (weights): None
316
+ (weight_tensor) None
317
+ )"
+ 2022-08-06 15:12:29,182 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:12:29,183 Corpus: "Corpus: 24000 train + 3000 dev + 3000 test sentences"
+ 2022-08-06 15:12:29,183 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:12:29,183 Parameters:
+ 2022-08-06 15:12:29,183 - learning_rate: "0.1"
+ 2022-08-06 15:12:29,183 - mini_batch_size: "8"
+ 2022-08-06 15:12:29,183 - patience: "3"
+ 2022-08-06 15:12:29,183 - anneal_factor: "0.5"
+ 2022-08-06 15:12:29,183 - max_epochs: "10"
+ 2022-08-06 15:12:29,183 - shuffle: "True"
+ 2022-08-06 15:12:29,183 - train_with_dev: "True"
+ 2022-08-06 15:12:29,183 - batch_growth_annealing: "False"
+ 2022-08-06 15:12:29,183 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:12:29,183 Model training base path: "data/pos-Uppsala/model"
+ 2022-08-06 15:12:29,183 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:12:29,183 Device: cuda:0
+ 2022-08-06 15:12:29,183 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:12:29,184 Embeddings storage mode: gpu
+ 2022-08-06 15:12:29,185 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:13:18,972 epoch 1 - iter 337/3375 - loss 0.74289984 - samples/sec: 54.18 - lr: 0.100000
+ 2022-08-06 15:14:15,036 epoch 1 - iter 674/3375 - loss 0.53599298 - samples/sec: 48.11 - lr: 0.100000
+ 2022-08-06 15:15:12,610 epoch 1 - iter 1011/3375 - loss 0.45754038 - samples/sec: 46.85 - lr: 0.100000
+ 2022-08-06 15:16:09,043 epoch 1 - iter 1348/3375 - loss 0.40111208 - samples/sec: 47.79 - lr: 0.100000
+ 2022-08-06 15:17:04,137 epoch 1 - iter 1685/3375 - loss 0.36712663 - samples/sec: 48.96 - lr: 0.100000
+ 2022-08-06 15:17:58,402 epoch 1 - iter 2022/3375 - loss 0.34049225 - samples/sec: 49.70 - lr: 0.100000
+ 2022-08-06 15:18:55,276 epoch 1 - iter 2359/3375 - loss 0.32076226 - samples/sec: 47.42 - lr: 0.100000
+ 2022-08-06 15:19:49,979 epoch 1 - iter 2696/3375 - loss 0.31015506 - samples/sec: 49.31 - lr: 0.100000
+ 2022-08-06 15:20:48,410 epoch 1 - iter 3033/3375 - loss 0.29391699 - samples/sec: 46.16 - lr: 0.100000
+ 2022-08-06 15:21:47,572 epoch 1 - iter 3370/3375 - loss 0.27989028 - samples/sec: 45.59 - lr: 0.100000
+ 2022-08-06 15:21:48,555 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:21:48,555 EPOCH 1 done: loss 0.2795 - lr 0.1000000
+ 2022-08-06 15:21:48,555 BAD EPOCHS (no improvement): 0
+ 2022-08-06 15:21:48,555 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:22:45,590 epoch 2 - iter 337/3375 - loss 0.18085661 - samples/sec: 47.29 - lr: 0.100000
+ 2022-08-06 15:23:42,698 epoch 2 - iter 674/3375 - loss 0.17216272 - samples/sec: 47.23 - lr: 0.100000
+ 2022-08-06 15:24:38,534 epoch 2 - iter 1011/3375 - loss 0.16694117 - samples/sec: 48.31 - lr: 0.100000
+ 2022-08-06 15:25:36,464 epoch 2 - iter 1348/3375 - loss 0.16500505 - samples/sec: 46.56 - lr: 0.100000
+ 2022-08-06 15:26:32,174 epoch 2 - iter 1685/3375 - loss 0.16167195 - samples/sec: 48.42 - lr: 0.100000
+ 2022-08-06 15:27:28,418 epoch 2 - iter 2022/3375 - loss 0.15991464 - samples/sec: 47.96 - lr: 0.100000
+ 2022-08-06 15:28:30,730 epoch 2 - iter 2359/3375 - loss 0.15942296 - samples/sec: 43.29 - lr: 0.100000
+ 2022-08-06 15:29:27,444 epoch 2 - iter 2696/3375 - loss 0.15779417 - samples/sec: 47.56 - lr: 0.100000
+ 2022-08-06 15:30:25,187 epoch 2 - iter 3033/3375 - loss 0.15553239 - samples/sec: 46.71 - lr: 0.100000
+ 2022-08-06 15:31:21,714 epoch 2 - iter 3370/3375 - loss 0.15352182 - samples/sec: 47.72 - lr: 0.100000
+ 2022-08-06 15:31:22,712 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:31:22,712 EPOCH 2 done: loss 0.1537 - lr 0.1000000
+ 2022-08-06 15:31:22,712 BAD EPOCHS (no improvement): 0
+ 2022-08-06 15:31:22,712 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:32:23,790 epoch 3 - iter 337/3375 - loss 0.11867195 - samples/sec: 44.16 - lr: 0.100000
+ 2022-08-06 15:33:21,161 epoch 3 - iter 674/3375 - loss 0.11878234 - samples/sec: 47.02 - lr: 0.100000
+ 2022-08-06 15:34:20,702 epoch 3 - iter 1011/3375 - loss 0.11942785 - samples/sec: 45.31 - lr: 0.100000
+ 2022-08-06 15:35:18,259 epoch 3 - iter 1348/3375 - loss 0.11958903 - samples/sec: 46.86 - lr: 0.100000
+ 2022-08-06 15:36:16,967 epoch 3 - iter 1685/3375 - loss 0.11914369 - samples/sec: 45.94 - lr: 0.100000
+ 2022-08-06 15:37:13,560 epoch 3 - iter 2022/3375 - loss 0.11916365 - samples/sec: 47.66 - lr: 0.100000
+ 2022-08-06 15:38:10,624 epoch 3 - iter 2359/3375 - loss 0.12096981 - samples/sec: 47.27 - lr: 0.100000
+ 2022-08-06 15:39:10,034 epoch 3 - iter 2696/3375 - loss 0.11987245 - samples/sec: 45.40 - lr: 0.100000
+ 2022-08-06 15:40:07,877 epoch 3 - iter 3033/3375 - loss 0.11973164 - samples/sec: 46.63 - lr: 0.100000
+ 2022-08-06 15:41:05,610 epoch 3 - iter 3370/3375 - loss 0.12003917 - samples/sec: 46.72 - lr: 0.100000
+ 2022-08-06 15:41:06,450 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:41:06,450 EPOCH 3 done: loss 0.1200 - lr 0.1000000
+ 2022-08-06 15:41:06,450 BAD EPOCHS (no improvement): 0
+ 2022-08-06 15:41:06,451 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:42:04,442 epoch 4 - iter 337/3375 - loss 0.09805702 - samples/sec: 46.51 - lr: 0.100000
+ 2022-08-06 15:43:05,164 epoch 4 - iter 674/3375 - loss 0.09888569 - samples/sec: 44.42 - lr: 0.100000
+ 2022-08-06 15:44:02,546 epoch 4 - iter 1011/3375 - loss 0.10053644 - samples/sec: 47.01 - lr: 0.100000
+ 2022-08-06 15:45:01,384 epoch 4 - iter 1348/3375 - loss 0.10119574 - samples/sec: 45.84 - lr: 0.100000
+ 2022-08-06 15:46:00,229 epoch 4 - iter 1685/3375 - loss 0.10374826 - samples/sec: 45.84 - lr: 0.100000
+ 2022-08-06 15:46:59,791 epoch 4 - iter 2022/3375 - loss 0.10405522 - samples/sec: 45.28 - lr: 0.100000
+ 2022-08-06 15:47:57,607 epoch 4 - iter 2359/3375 - loss 0.10411718 - samples/sec: 46.65 - lr: 0.100000
+ 2022-08-06 15:48:55,410 epoch 4 - iter 2696/3375 - loss 0.10394934 - samples/sec: 46.66 - lr: 0.100000
+ 2022-08-06 15:49:56,783 epoch 4 - iter 3033/3375 - loss 0.10374714 - samples/sec: 43.95 - lr: 0.100000
+ 2022-08-06 15:50:54,113 epoch 4 - iter 3370/3375 - loss 0.10333066 - samples/sec: 47.05 - lr: 0.100000
+ 2022-08-06 15:50:54,961 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:50:54,961 EPOCH 4 done: loss 0.1033 - lr 0.1000000
+ 2022-08-06 15:50:54,961 BAD EPOCHS (no improvement): 0
+ 2022-08-06 15:50:54,961 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 15:51:52,151 epoch 5 - iter 337/3375 - loss 0.08744228 - samples/sec: 47.17 - lr: 0.100000
+ 2022-08-06 15:52:49,910 epoch 5 - iter 674/3375 - loss 0.08896766 - samples/sec: 46.70 - lr: 0.100000
+ 2022-08-06 15:53:50,861 epoch 5 - iter 1011/3375 - loss 0.09000325 - samples/sec: 44.25 - lr: 0.100000
+ 2022-08-06 15:54:48,357 epoch 5 - iter 1348/3375 - loss 0.09103779 - samples/sec: 46.91 - lr: 0.100000
+ 2022-08-06 15:55:48,122 epoch 5 - iter 1685/3375 - loss 0.09107958 - samples/sec: 45.13 - lr: 0.100000
+ 2022-08-06 15:56:49,324 epoch 5 - iter 2022/3375 - loss 0.09135469 - samples/sec: 44.07 - lr: 0.100000
+ 2022-08-06 15:57:47,393 epoch 5 - iter 2359/3375 - loss 0.09172710 - samples/sec: 46.45 - lr: 0.100000
+ 2022-08-06 15:58:45,694 epoch 5 - iter 2696/3375 - loss 0.09238154 - samples/sec: 46.27 - lr: 0.100000
+ 2022-08-06 15:59:42,885 epoch 5 - iter 3033/3375 - loss 0.09253470 - samples/sec: 47.16 - lr: 0.100000
+ 2022-08-06 16:00:44,492 epoch 5 - iter 3370/3375 - loss 0.09240350 - samples/sec: 43.78 - lr: 0.100000
+ 2022-08-06 16:00:45,327 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:00:45,328 EPOCH 5 done: loss 0.0924 - lr 0.1000000
+ 2022-08-06 16:00:45,328 BAD EPOCHS (no improvement): 0
+ 2022-08-06 16:00:45,328 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:01:42,167 epoch 6 - iter 337/3375 - loss 0.08075428 - samples/sec: 47.46 - lr: 0.100000
+ 2022-08-06 16:02:39,509 epoch 6 - iter 674/3375 - loss 0.08099115 - samples/sec: 47.04 - lr: 0.100000
+ 2022-08-06 16:03:37,688 epoch 6 - iter 1011/3375 - loss 0.08140463 - samples/sec: 46.36 - lr: 0.100000
+ 2022-08-06 16:04:38,640 epoch 6 - iter 1348/3375 - loss 0.08175190 - samples/sec: 44.25 - lr: 0.100000
+ 2022-08-06 16:05:35,459 epoch 6 - iter 1685/3375 - loss 0.08233525 - samples/sec: 47.47 - lr: 0.100000
+ 2022-08-06 16:06:33,941 epoch 6 - iter 2022/3375 - loss 0.08333964 - samples/sec: 46.12 - lr: 0.100000
+ 2022-08-06 16:07:34,247 epoch 6 - iter 2359/3375 - loss 0.08370656 - samples/sec: 44.73 - lr: 0.100000
+ 2022-08-06 16:08:32,546 epoch 6 - iter 2696/3375 - loss 0.08503503 - samples/sec: 46.27 - lr: 0.100000
+ 2022-08-06 16:09:30,447 epoch 6 - iter 3033/3375 - loss 0.08526801 - samples/sec: 46.58 - lr: 0.100000
+ 2022-08-06 16:10:29,216 epoch 6 - iter 3370/3375 - loss 0.08506276 - samples/sec: 45.90 - lr: 0.100000
+ 2022-08-06 16:10:29,946 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:10:29,947 EPOCH 6 done: loss 0.0851 - lr 0.1000000
+ 2022-08-06 16:10:29,947 BAD EPOCHS (no improvement): 0
+ 2022-08-06 16:10:29,947 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:11:31,042 epoch 7 - iter 337/3375 - loss 0.07328964 - samples/sec: 44.15 - lr: 0.100000
+ 2022-08-06 16:12:31,218 epoch 7 - iter 674/3375 - loss 0.07556648 - samples/sec: 44.82 - lr: 0.100000
+ 2022-08-06 16:13:28,468 epoch 7 - iter 1011/3375 - loss 0.07578294 - samples/sec: 47.11 - lr: 0.100000
+ 2022-08-06 16:14:28,318 epoch 7 - iter 1348/3375 - loss 0.07581855 - samples/sec: 45.07 - lr: 0.100000
+ 2022-08-06 16:15:27,119 epoch 7 - iter 1685/3375 - loss 0.07674717 - samples/sec: 45.87 - lr: 0.100000
+ 2022-08-06 16:16:25,205 epoch 7 - iter 2022/3375 - loss 0.07800463 - samples/sec: 46.44 - lr: 0.100000
+ 2022-08-06 16:17:25,635 epoch 7 - iter 2359/3375 - loss 0.07788540 - samples/sec: 44.64 - lr: 0.100000
+ 2022-08-06 16:18:25,934 epoch 7 - iter 2696/3375 - loss 0.07823310 - samples/sec: 44.73 - lr: 0.100000
+ 2022-08-06 16:19:25,742 epoch 7 - iter 3033/3375 - loss 0.07862489 - samples/sec: 45.10 - lr: 0.100000
+ 2022-08-06 16:20:24,514 epoch 7 - iter 3370/3375 - loss 0.07864779 - samples/sec: 45.89 - lr: 0.100000
+ 2022-08-06 16:20:25,316 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:20:25,317 EPOCH 7 done: loss 0.0786 - lr 0.1000000
+ 2022-08-06 16:20:25,317 BAD EPOCHS (no improvement): 0
+ 2022-08-06 16:20:25,317 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:21:23,040 epoch 8 - iter 337/3375 - loss 0.06876001 - samples/sec: 46.73 - lr: 0.100000
+ 2022-08-06 16:22:25,028 epoch 8 - iter 674/3375 - loss 0.06867038 - samples/sec: 43.51 - lr: 0.100000
+ 2022-08-06 16:23:25,046 epoch 8 - iter 1011/3375 - loss 0.07011779 - samples/sec: 44.94 - lr: 0.100000
+ 2022-08-06 16:24:23,287 epoch 8 - iter 1348/3375 - loss 0.07118411 - samples/sec: 46.31 - lr: 0.100000
+ 2022-08-06 16:25:24,939 epoch 8 - iter 1685/3375 - loss 0.07159055 - samples/sec: 43.75 - lr: 0.100000
+ 2022-08-06 16:26:23,316 epoch 8 - iter 2022/3375 - loss 0.07167687 - samples/sec: 46.21 - lr: 0.100000
+ 2022-08-06 16:27:22,234 epoch 8 - iter 2359/3375 - loss 0.07190781 - samples/sec: 45.78 - lr: 0.100000
+ 2022-08-06 16:28:20,921 epoch 8 - iter 2696/3375 - loss 0.07263123 - samples/sec: 45.96 - lr: 0.100000
+ 2022-08-06 16:29:21,637 epoch 8 - iter 3033/3375 - loss 0.07345723 - samples/sec: 44.42 - lr: 0.100000
+ 2022-08-06 16:30:20,403 epoch 8 - iter 3370/3375 - loss 0.07338627 - samples/sec: 45.90 - lr: 0.100000
+ 2022-08-06 16:30:21,375 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:30:21,375 EPOCH 8 done: loss 0.0734 - lr 0.1000000
+ 2022-08-06 16:30:21,375 BAD EPOCHS (no improvement): 0
+ 2022-08-06 16:30:21,376 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:31:18,803 epoch 9 - iter 337/3375 - loss 0.06314787 - samples/sec: 46.97 - lr: 0.100000
+ 2022-08-06 16:32:16,661 epoch 9 - iter 674/3375 - loss 0.06638022 - samples/sec: 46.62 - lr: 0.100000
+ 2022-08-06 16:33:15,745 epoch 9 - iter 1011/3375 - loss 0.06547021 - samples/sec: 45.65 - lr: 0.100000
+ 2022-08-06 16:34:14,632 epoch 9 - iter 1348/3375 - loss 0.06593581 - samples/sec: 45.81 - lr: 0.100000
+ 2022-08-06 16:35:13,668 epoch 9 - iter 1685/3375 - loss 0.06772817 - samples/sec: 45.69 - lr: 0.100000
+ 2022-08-06 16:36:15,567 epoch 9 - iter 2022/3375 - loss 0.06808051 - samples/sec: 43.58 - lr: 0.100000
+ 2022-08-06 16:37:16,651 epoch 9 - iter 2359/3375 - loss 0.06796916 - samples/sec: 44.16 - lr: 0.100000
+ 2022-08-06 16:38:14,513 epoch 9 - iter 2696/3375 - loss 0.06906572 - samples/sec: 46.62 - lr: 0.100000
+ 2022-08-06 16:39:13,107 epoch 9 - iter 3033/3375 - loss 0.06917054 - samples/sec: 46.03 - lr: 0.100000
+ 2022-08-06 16:40:12,475 epoch 9 - iter 3370/3375 - loss 0.06913866 - samples/sec: 45.43 - lr: 0.100000
+ 2022-08-06 16:40:13,344 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:40:13,344 EPOCH 9 done: loss 0.0691 - lr 0.1000000
+ 2022-08-06 16:40:13,344 BAD EPOCHS (no improvement): 0
+ 2022-08-06 16:40:13,345 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:41:11,629 epoch 10 - iter 337/3375 - loss 0.05727560 - samples/sec: 46.28 - lr: 0.100000
+ 2022-08-06 16:42:09,047 epoch 10 - iter 674/3375 - loss 0.06063155 - samples/sec: 46.98 - lr: 0.100000
+ 2022-08-06 16:43:09,515 epoch 10 - iter 1011/3375 - loss 0.06369582 - samples/sec: 44.61 - lr: 0.100000
+ 2022-08-06 16:44:07,978 epoch 10 - iter 1348/3375 - loss 0.06421773 - samples/sec: 46.14 - lr: 0.100000
+ 2022-08-06 16:45:07,015 epoch 10 - iter 1685/3375 - loss 0.06397856 - samples/sec: 45.69 - lr: 0.100000
+ 2022-08-06 16:46:05,736 epoch 10 - iter 2022/3375 - loss 0.06424947 - samples/sec: 45.93 - lr: 0.100000
+ 2022-08-06 16:47:06,945 epoch 10 - iter 2359/3375 - loss 0.06511606 - samples/sec: 44.07 - lr: 0.100000
+ 2022-08-06 16:48:05,819 epoch 10 - iter 2696/3375 - loss 0.06574495 - samples/sec: 45.82 - lr: 0.100000
+ 2022-08-06 16:49:03,924 epoch 10 - iter 3033/3375 - loss 0.06552271 - samples/sec: 46.42 - lr: 0.100000
+ 2022-08-06 16:50:00,641 epoch 10 - iter 3370/3375 - loss 0.06594147 - samples/sec: 47.56 - lr: 0.100000
+ 2022-08-06 16:50:01,493 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:50:01,493 EPOCH 10 done: loss 0.0659 - lr 0.1000000
+ 2022-08-06 16:50:01,493 BAD EPOCHS (no improvement): 0
+ 2022-08-06 16:50:02,708 ----------------------------------------------------------------------------------------------------
+ 2022-08-06 16:50:02,709 Testing using last state of model ...
+ 2022-08-06 16:53:40,214 0.9632 0.9632 0.9632 0.9632
+ 2022-08-06 16:53:40,215
+ Results:
+ - F-score (micro) 0.9632
+ - F-score (macro) 0.9031
+ - Accuracy 0.9632
+
+ By class:
+               precision    recall  f1-score   support
+
+       N_SING     0.9691    0.9565    0.9627     30553
+            P     0.9560    0.9937    0.9745      9951
+         DELM     0.9936    0.9906    0.9921      8122
+          ADJ     0.9205    0.9152    0.9179      7466
+          CON     0.9892    0.9799    0.9845      6823
+         N_PL     0.9476    0.9642    0.9558      5163
+         V_PA     0.9729    0.9746    0.9737      2873
+        V_PRS     0.9825    0.9898    0.9861      2841
+          PRO     0.9656    0.9455    0.9555      2258
+          NUM     0.9937    0.9933    0.9935      2232
+          DET     0.9423    0.9698    0.9559      1853
+       CLITIC     0.9992    1.0000    0.9996      1259
+         V_PP     0.9699    0.9741    0.9720      1158
+        V_SUB     0.9620    0.9573    0.9596      1031
+          ADV     0.7784    0.8182    0.7978       880
+     ADV_TIME     0.9126    0.9611    0.9363       489
+        V_AUX     0.9869    0.9974    0.9921       379
+      ADJ_SUP     0.9851    0.9815    0.9833       270
+     ADJ_CMPR     0.9246    0.9534    0.9388       193
+      ADJ_INO     0.7294    0.7381    0.7337       168
+      ADV_NEG     0.9034    0.8792    0.8912       149
+        ADV_I     0.8926    0.7714    0.8276       140
+           FW     0.6893    0.5772    0.6283       123
+     ADV_COMP     0.8267    0.8158    0.8212        76
+      ADV_LOC     0.9722    0.9589    0.9655        73
+        V_IMP     0.7292    0.6250    0.6731        56
+         PREV     0.9286    0.8125    0.8667        32
+          INT     0.9231    0.5000    0.6486        24
+
+    micro avg     0.9632    0.9632    0.9632     86635
+    macro avg     0.9195    0.8926    0.9031     86635
+ weighted avg     0.9633    0.9632    0.9631     86635
+  samples avg     0.9632    0.9632    0.9632     86635
+
+ 2022-08-06 16:53:40,215 ----------------------------------------------------------------------------------------------------
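
The summary figures in the log can be cross-checked from the numbers it prints. The sketch below is only a consistency check, not part of the commit: it recomputes the macro F-score as the unweighted mean of the per-class f1-scores, sums the per-class support, and derives the iterations per epoch under the assumption that, with train_with_dev "True", the 24000 train and 3000 dev sentences are batched together at mini_batch_size 8. The names `BY_CLASS`, `macro_f1`, `total_support`, and `iters_per_epoch` are hypothetical.

```python
import math

# (label, f1-score, support) copied from the "By class" table in training.log.
BY_CLASS = [
    ("N_SING", 0.9627, 30553), ("P", 0.9745, 9951), ("DELM", 0.9921, 8122),
    ("ADJ", 0.9179, 7466), ("CON", 0.9845, 6823), ("N_PL", 0.9558, 5163),
    ("V_PA", 0.9737, 2873), ("V_PRS", 0.9861, 2841), ("PRO", 0.9555, 2258),
    ("NUM", 0.9935, 2232), ("DET", 0.9559, 1853), ("CLITIC", 0.9996, 1259),
    ("V_PP", 0.9720, 1158), ("V_SUB", 0.9596, 1031), ("ADV", 0.7978, 880),
    ("ADV_TIME", 0.9363, 489), ("V_AUX", 0.9921, 379), ("ADJ_SUP", 0.9833, 270),
    ("ADJ_CMPR", 0.9388, 193), ("ADJ_INO", 0.7337, 168), ("ADV_NEG", 0.8912, 149),
    ("ADV_I", 0.8276, 140), ("FW", 0.6283, 123), ("ADV_COMP", 0.8212, 76),
    ("ADV_LOC", 0.9655, 73), ("V_IMP", 0.6731, 56), ("PREV", 0.8667, 32),
    ("INT", 0.6486, 24),
]

# Macro F-score: every class counts equally, regardless of its support.
macro_f1 = sum(f1 for _, f1, _ in BY_CLASS) / len(BY_CLASS)

# Total number of evaluated tokens across all classes.
total_support = sum(support for _, _, support in BY_CLASS)

# Assumption: train and dev sentences are combined before batching.
iters_per_epoch = math.ceil((24000 + 3000) / 8)

print(f"macro F1 = {macro_f1:.4f}, support = {total_support}, "
      f"iterations per epoch = {iters_per_epoch}")
```

The gap between the micro and macro F-scores reported in the log comes from the low-support classes such as FW, V_IMP, and INT, which weigh as much as N_SING in the macro average.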