stefan-it committed on
Commit 16cb29c
1 Parent(s): e03c84e

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +261 -0
training.log ADDED
@@ -0,0 +1,261 @@
+ 2023-10-19 02:35:16,034 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,035 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(31103, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=81, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-19 02:35:16,035 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,035 Corpus: 6900 train + 1576 dev + 1833 test sentences
+ 2023-10-19 02:35:16,035 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,035 Train: 6900 sentences
+ 2023-10-19 02:35:16,036 (train_with_dev=False, train_with_test=False)
+ 2023-10-19 02:35:16,036 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,036 Training Params:
+ 2023-10-19 02:35:16,036 - learning_rate: "5e-05"
+ 2023-10-19 02:35:16,036 - mini_batch_size: "16"
+ 2023-10-19 02:35:16,036 - max_epochs: "10"
+ 2023-10-19 02:35:16,036 - shuffle: "True"
+ 2023-10-19 02:35:16,036 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,036 Plugins:
+ 2023-10-19 02:35:16,036 - TensorboardLogger
+ 2023-10-19 02:35:16,036 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-19 02:35:16,036 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,036 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-19 02:35:16,036 - metric: "('micro avg', 'f1-score')"
+ 2023-10-19 02:35:16,036 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,036 Computation:
+ 2023-10-19 02:35:16,036 - compute on device: cuda:0
+ 2023-10-19 02:35:16,036 - embedding storage: none
+ 2023-10-19 02:35:16,036 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,037 Model training base path: "autotrain-flair-mobie-gbert_base-bs16-e10-lr5e-05-4"
+ 2023-10-19 02:35:16,037 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,037 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:35:16,037 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-19 02:35:30,501 epoch 1 - iter 43/432 - loss 4.71793452 - time (sec): 14.46 - samples/sec: 451.91 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 02:35:45,140 epoch 1 - iter 86/432 - loss 3.56588335 - time (sec): 29.10 - samples/sec: 432.16 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 02:35:59,499 epoch 1 - iter 129/432 - loss 2.94353643 - time (sec): 43.46 - samples/sec: 428.24 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 02:36:14,178 epoch 1 - iter 172/432 - loss 2.57618338 - time (sec): 58.14 - samples/sec: 429.10 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 02:36:29,937 epoch 1 - iter 215/432 - loss 2.31404215 - time (sec): 73.90 - samples/sec: 420.18 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 02:36:43,961 epoch 1 - iter 258/432 - loss 2.08985425 - time (sec): 87.92 - samples/sec: 423.81 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 02:36:58,508 epoch 1 - iter 301/432 - loss 1.90122408 - time (sec): 102.47 - samples/sec: 423.32 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-19 02:37:13,202 epoch 1 - iter 344/432 - loss 1.75799120 - time (sec): 117.16 - samples/sec: 423.69 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-19 02:37:27,381 epoch 1 - iter 387/432 - loss 1.64106351 - time (sec): 131.34 - samples/sec: 422.49 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-19 02:37:42,690 epoch 1 - iter 430/432 - loss 1.53540212 - time (sec): 146.65 - samples/sec: 420.47 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-19 02:37:43,317 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:37:43,317 EPOCH 1 done: loss 1.5323 - lr: 0.000050
+ 2023-10-19 02:37:56,983 DEV : loss 0.4536784291267395 - f1-score (micro avg) 0.732
+ 2023-10-19 02:37:57,007 saving best model
+ 2023-10-19 02:37:57,475 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:38:11,866 epoch 2 - iter 43/432 - loss 0.49805725 - time (sec): 14.39 - samples/sec: 413.14 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-19 02:38:25,989 epoch 2 - iter 86/432 - loss 0.49067153 - time (sec): 28.51 - samples/sec: 441.62 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-19 02:38:41,083 epoch 2 - iter 129/432 - loss 0.46877508 - time (sec): 43.61 - samples/sec: 419.98 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-19 02:38:55,498 epoch 2 - iter 172/432 - loss 0.45344862 - time (sec): 58.02 - samples/sec: 421.19 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-19 02:39:10,199 epoch 2 - iter 215/432 - loss 0.44579460 - time (sec): 72.72 - samples/sec: 418.60 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-19 02:39:25,223 epoch 2 - iter 258/432 - loss 0.43685439 - time (sec): 87.75 - samples/sec: 419.45 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-19 02:39:41,013 epoch 2 - iter 301/432 - loss 0.42520139 - time (sec): 103.54 - samples/sec: 414.52 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-19 02:39:56,695 epoch 2 - iter 344/432 - loss 0.41678397 - time (sec): 119.22 - samples/sec: 408.54 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-19 02:40:13,095 epoch 2 - iter 387/432 - loss 0.40679354 - time (sec): 135.62 - samples/sec: 406.53 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-19 02:40:28,122 epoch 2 - iter 430/432 - loss 0.39818144 - time (sec): 150.65 - samples/sec: 409.30 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-19 02:40:28,692 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:40:28,693 EPOCH 2 done: loss 0.3981 - lr: 0.000044
+ 2023-10-19 02:40:42,056 DEV : loss 0.3231566250324249 - f1-score (micro avg) 0.7892
+ 2023-10-19 02:40:42,080 saving best model
+ 2023-10-19 02:40:43,381 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:40:58,096 epoch 3 - iter 43/432 - loss 0.25451405 - time (sec): 14.71 - samples/sec: 422.81 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-19 02:41:12,217 epoch 3 - iter 86/432 - loss 0.24314495 - time (sec): 28.83 - samples/sec: 428.82 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-19 02:41:27,101 epoch 3 - iter 129/432 - loss 0.24063026 - time (sec): 43.72 - samples/sec: 421.24 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-19 02:41:42,160 epoch 3 - iter 172/432 - loss 0.24182864 - time (sec): 58.78 - samples/sec: 421.72 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-19 02:41:57,511 epoch 3 - iter 215/432 - loss 0.24316250 - time (sec): 74.13 - samples/sec: 415.84 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-19 02:42:12,451 epoch 3 - iter 258/432 - loss 0.24456626 - time (sec): 89.07 - samples/sec: 415.21 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-19 02:42:28,393 epoch 3 - iter 301/432 - loss 0.24484776 - time (sec): 105.01 - samples/sec: 412.79 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-19 02:42:43,735 epoch 3 - iter 344/432 - loss 0.24662885 - time (sec): 120.35 - samples/sec: 410.98 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-19 02:42:59,091 epoch 3 - iter 387/432 - loss 0.24622643 - time (sec): 135.71 - samples/sec: 410.98 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-19 02:43:13,132 epoch 3 - iter 430/432 - loss 0.24525886 - time (sec): 149.75 - samples/sec: 411.59 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-19 02:43:13,680 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:43:13,680 EPOCH 3 done: loss 0.2450 - lr: 0.000039
+ 2023-10-19 02:43:27,003 DEV : loss 0.30216994881629944 - f1-score (micro avg) 0.8187
+ 2023-10-19 02:43:27,027 saving best model
+ 2023-10-19 02:43:28,322 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:43:42,926 epoch 4 - iter 43/432 - loss 0.17397582 - time (sec): 14.60 - samples/sec: 414.72 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-19 02:43:58,788 epoch 4 - iter 86/432 - loss 0.18289248 - time (sec): 30.46 - samples/sec: 397.85 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-19 02:44:13,897 epoch 4 - iter 129/432 - loss 0.18392678 - time (sec): 45.57 - samples/sec: 401.13 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-19 02:44:29,294 epoch 4 - iter 172/432 - loss 0.18556978 - time (sec): 60.97 - samples/sec: 399.38 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-19 02:44:43,232 epoch 4 - iter 215/432 - loss 0.18331243 - time (sec): 74.91 - samples/sec: 405.80 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-19 02:44:58,645 epoch 4 - iter 258/432 - loss 0.18158645 - time (sec): 90.32 - samples/sec: 400.76 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-19 02:45:13,347 epoch 4 - iter 301/432 - loss 0.17820410 - time (sec): 105.02 - samples/sec: 406.14 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-19 02:45:28,906 epoch 4 - iter 344/432 - loss 0.17684603 - time (sec): 120.58 - samples/sec: 409.95 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-19 02:45:44,187 epoch 4 - iter 387/432 - loss 0.17775296 - time (sec): 135.86 - samples/sec: 407.99 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-19 02:45:58,637 epoch 4 - iter 430/432 - loss 0.17798160 - time (sec): 150.31 - samples/sec: 410.10 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-19 02:45:59,219 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:45:59,220 EPOCH 4 done: loss 0.1782 - lr: 0.000033
+ 2023-10-19 02:46:12,571 DEV : loss 0.30217471718788147 - f1-score (micro avg) 0.8235
+ 2023-10-19 02:46:12,595 saving best model
+ 2023-10-19 02:46:13,892 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:46:28,339 epoch 5 - iter 43/432 - loss 0.11679026 - time (sec): 14.45 - samples/sec: 412.45 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-19 02:46:42,977 epoch 5 - iter 86/432 - loss 0.11734061 - time (sec): 29.08 - samples/sec: 418.83 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-19 02:46:57,663 epoch 5 - iter 129/432 - loss 0.12729745 - time (sec): 43.77 - samples/sec: 428.70 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-19 02:47:12,672 epoch 5 - iter 172/432 - loss 0.12477897 - time (sec): 58.78 - samples/sec: 426.72 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-19 02:47:28,128 epoch 5 - iter 215/432 - loss 0.12434555 - time (sec): 74.24 - samples/sec: 413.09 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-19 02:47:42,287 epoch 5 - iter 258/432 - loss 0.12362897 - time (sec): 88.39 - samples/sec: 413.90 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 02:47:56,715 epoch 5 - iter 301/432 - loss 0.12465833 - time (sec): 102.82 - samples/sec: 416.26 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 02:48:12,495 epoch 5 - iter 344/432 - loss 0.12779382 - time (sec): 118.60 - samples/sec: 414.22 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 02:48:27,371 epoch 5 - iter 387/432 - loss 0.12918879 - time (sec): 133.48 - samples/sec: 414.75 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 02:48:42,352 epoch 5 - iter 430/432 - loss 0.12891599 - time (sec): 148.46 - samples/sec: 415.21 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 02:48:42,872 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:48:42,872 EPOCH 5 done: loss 0.1291 - lr: 0.000028
+ 2023-10-19 02:48:54,964 DEV : loss 0.3222440779209137 - f1-score (micro avg) 0.8314
+ 2023-10-19 02:48:54,988 saving best model
+ 2023-10-19 02:48:56,284 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:49:09,855 epoch 6 - iter 43/432 - loss 0.09462246 - time (sec): 13.57 - samples/sec: 464.65 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 02:49:23,342 epoch 6 - iter 86/432 - loss 0.09120079 - time (sec): 27.06 - samples/sec: 463.98 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 02:49:37,940 epoch 6 - iter 129/432 - loss 0.08754151 - time (sec): 41.65 - samples/sec: 453.11 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 02:49:51,924 epoch 6 - iter 172/432 - loss 0.08707151 - time (sec): 55.64 - samples/sec: 452.29 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 02:50:05,558 epoch 6 - iter 215/432 - loss 0.08872203 - time (sec): 69.27 - samples/sec: 452.71 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 02:50:18,749 epoch 6 - iter 258/432 - loss 0.09178993 - time (sec): 82.46 - samples/sec: 449.21 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 02:50:31,972 epoch 6 - iter 301/432 - loss 0.09361707 - time (sec): 95.69 - samples/sec: 450.51 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 02:50:45,324 epoch 6 - iter 344/432 - loss 0.09605405 - time (sec): 109.04 - samples/sec: 452.64 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 02:50:58,578 epoch 6 - iter 387/432 - loss 0.09794359 - time (sec): 122.29 - samples/sec: 453.12 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 02:51:11,851 epoch 6 - iter 430/432 - loss 0.09911853 - time (sec): 135.57 - samples/sec: 454.89 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 02:51:12,551 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:51:12,551 EPOCH 6 done: loss 0.0992 - lr: 0.000022
+ 2023-10-19 02:51:24,641 DEV : loss 0.341653436422348 - f1-score (micro avg) 0.8264
+ 2023-10-19 02:51:24,666 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:51:37,807 epoch 7 - iter 43/432 - loss 0.06758819 - time (sec): 13.14 - samples/sec: 475.32 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 02:51:51,410 epoch 7 - iter 86/432 - loss 0.07314013 - time (sec): 26.74 - samples/sec: 455.60 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 02:52:06,003 epoch 7 - iter 129/432 - loss 0.07144795 - time (sec): 41.34 - samples/sec: 448.25 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 02:52:19,135 epoch 7 - iter 172/432 - loss 0.07183481 - time (sec): 54.47 - samples/sec: 449.24 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 02:52:32,443 epoch 7 - iter 215/432 - loss 0.07341790 - time (sec): 67.78 - samples/sec: 446.53 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 02:52:46,676 epoch 7 - iter 258/432 - loss 0.07333719 - time (sec): 82.01 - samples/sec: 444.19 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 02:53:01,228 epoch 7 - iter 301/432 - loss 0.07256051 - time (sec): 96.56 - samples/sec: 443.94 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 02:53:15,259 epoch 7 - iter 344/432 - loss 0.07318786 - time (sec): 110.59 - samples/sec: 441.26 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 02:53:29,458 epoch 7 - iter 387/432 - loss 0.07390957 - time (sec): 124.79 - samples/sec: 443.27 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 02:53:44,406 epoch 7 - iter 430/432 - loss 0.07553520 - time (sec): 139.74 - samples/sec: 441.18 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 02:53:44,877 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:53:44,878 EPOCH 7 done: loss 0.0759 - lr: 0.000017
+ 2023-10-19 02:53:57,808 DEV : loss 0.3510221242904663 - f1-score (micro avg) 0.8318
+ 2023-10-19 02:53:57,844 saving best model
+ 2023-10-19 02:53:59,183 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:54:13,218 epoch 8 - iter 43/432 - loss 0.06618778 - time (sec): 14.03 - samples/sec: 460.75 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 02:54:27,166 epoch 8 - iter 86/432 - loss 0.06330724 - time (sec): 27.98 - samples/sec: 461.69 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 02:54:42,151 epoch 8 - iter 129/432 - loss 0.05910976 - time (sec): 42.97 - samples/sec: 445.97 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 02:54:56,679 epoch 8 - iter 172/432 - loss 0.05755682 - time (sec): 57.50 - samples/sec: 433.46 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 02:55:11,485 epoch 8 - iter 215/432 - loss 0.05604373 - time (sec): 72.30 - samples/sec: 434.16 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 02:55:26,413 epoch 8 - iter 258/432 - loss 0.05518064 - time (sec): 87.23 - samples/sec: 435.18 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 02:55:40,955 epoch 8 - iter 301/432 - loss 0.05387243 - time (sec): 101.77 - samples/sec: 428.81 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 02:55:56,874 epoch 8 - iter 344/432 - loss 0.05296670 - time (sec): 117.69 - samples/sec: 419.48 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 02:56:12,201 epoch 8 - iter 387/432 - loss 0.05451018 - time (sec): 133.02 - samples/sec: 417.31 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 02:56:27,397 epoch 8 - iter 430/432 - loss 0.05443317 - time (sec): 148.21 - samples/sec: 416.29 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 02:56:27,923 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:56:27,924 EPOCH 8 done: loss 0.0544 - lr: 0.000011
+ 2023-10-19 02:56:41,680 DEV : loss 0.37782010436058044 - f1-score (micro avg) 0.839
+ 2023-10-19 02:56:41,704 saving best model
+ 2023-10-19 02:56:43,007 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:56:56,698 epoch 9 - iter 43/432 - loss 0.03577487 - time (sec): 13.69 - samples/sec: 441.62 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 02:57:13,125 epoch 9 - iter 86/432 - loss 0.03920506 - time (sec): 30.12 - samples/sec: 393.06 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 02:57:27,828 epoch 9 - iter 129/432 - loss 0.04567232 - time (sec): 44.82 - samples/sec: 395.69 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 02:57:42,760 epoch 9 - iter 172/432 - loss 0.04411163 - time (sec): 59.75 - samples/sec: 397.14 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 02:57:57,684 epoch 9 - iter 215/432 - loss 0.04223026 - time (sec): 74.68 - samples/sec: 400.69 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 02:58:13,414 epoch 9 - iter 258/432 - loss 0.04167944 - time (sec): 90.41 - samples/sec: 398.53 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 02:58:28,035 epoch 9 - iter 301/432 - loss 0.04175253 - time (sec): 105.03 - samples/sec: 402.79 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 02:58:41,647 epoch 9 - iter 344/432 - loss 0.04009740 - time (sec): 118.64 - samples/sec: 410.90 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 02:58:55,026 epoch 9 - iter 387/432 - loss 0.04027926 - time (sec): 132.02 - samples/sec: 418.43 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 02:59:08,652 epoch 9 - iter 430/432 - loss 0.04130173 - time (sec): 145.64 - samples/sec: 423.11 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 02:59:09,081 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:59:09,081 EPOCH 9 done: loss 0.0412 - lr: 0.000006
+ 2023-10-19 02:59:21,114 DEV : loss 0.41709104180336 - f1-score (micro avg) 0.8413
+ 2023-10-19 02:59:21,138 saving best model
+ 2023-10-19 02:59:22,432 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 02:59:36,187 epoch 10 - iter 43/432 - loss 0.03959866 - time (sec): 13.75 - samples/sec: 477.71 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 02:59:50,368 epoch 10 - iter 86/432 - loss 0.03400023 - time (sec): 27.93 - samples/sec: 443.69 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 03:00:03,803 epoch 10 - iter 129/432 - loss 0.03597026 - time (sec): 41.37 - samples/sec: 451.88 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 03:00:17,392 epoch 10 - iter 172/432 - loss 0.03332945 - time (sec): 54.96 - samples/sec: 453.09 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 03:00:31,438 epoch 10 - iter 215/432 - loss 0.03275216 - time (sec): 69.00 - samples/sec: 450.42 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 03:00:44,289 epoch 10 - iter 258/432 - loss 0.03302964 - time (sec): 81.86 - samples/sec: 451.13 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 03:00:57,604 epoch 10 - iter 301/432 - loss 0.03268473 - time (sec): 95.17 - samples/sec: 448.65 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 03:01:11,615 epoch 10 - iter 344/432 - loss 0.03345149 - time (sec): 109.18 - samples/sec: 448.84 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 03:01:25,592 epoch 10 - iter 387/432 - loss 0.03394635 - time (sec): 123.16 - samples/sec: 447.46 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 03:01:39,565 epoch 10 - iter 430/432 - loss 0.03475794 - time (sec): 137.13 - samples/sec: 450.08 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 03:01:40,005 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 03:01:40,005 EPOCH 10 done: loss 0.0347 - lr: 0.000000
+ 2023-10-19 03:01:52,142 DEV : loss 0.4283430576324463 - f1-score (micro avg) 0.8419
+ 2023-10-19 03:01:52,167 saving best model
+ 2023-10-19 03:01:54,294 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 03:01:54,295 Loading model from best epoch ...
+ 2023-10-19 03:01:56,523 SequenceTagger predicts: Dictionary with 81 tags: O, S-location-route, B-location-route, E-location-route, I-location-route, S-location-stop, B-location-stop, E-location-stop, I-location-stop, S-trigger, B-trigger, E-trigger, I-trigger, S-organization-company, B-organization-company, E-organization-company, I-organization-company, S-location-city, B-location-city, E-location-city, I-location-city, S-location, B-location, E-location, I-location, S-event-cause, B-event-cause, E-event-cause, I-event-cause, S-location-street, B-location-street, E-location-street, I-location-street, S-time, B-time, E-time, I-time, S-date, B-date, E-date, I-date, S-number, B-number, E-number, I-number, S-duration, B-duration, E-duration, I-duration, S-organization
+ 2023-10-19 03:02:13,037
+ Results:
+ - F-score (micro) 0.7766
+ - F-score (macro) 0.5901
+ - Accuracy 0.6791
+
+ By class:
+                       precision    recall  f1-score   support
+
+              trigger     0.7289    0.6002    0.6583       833
+        location-stop     0.8575    0.8418    0.8496       765
+             location     0.8194    0.8256    0.8225       665
+        location-city     0.8127    0.8816    0.8458       566
+                 date     0.8868    0.8553    0.8708       394
+      location-street     0.9290    0.8808    0.9043       386
+                 time     0.7747    0.8867    0.8270       256
+       location-route     0.8504    0.7606    0.8030       284
+ organization-company     0.8373    0.6944    0.7592       252
+             distance     1.0000    1.0000    1.0000       167
+               number     0.6910    0.8255    0.7523       149
+             duration     0.3709    0.3436    0.3567       163
+          event-cause     0.0000    0.0000    0.0000         0
+        disaster-type     0.9118    0.4493    0.6019        69
+         organization     0.5357    0.5357    0.5357        28
+               person     0.4545    1.0000    0.6250        10
+                  set     0.0000    0.0000    0.0000         0
+         org-position     0.0000    0.0000    0.0000         1
+                money     0.0000    0.0000    0.0000         0
+
+            micro avg     0.7736    0.7797    0.7766      4988
+            macro avg     0.6032    0.5990    0.5901      4988
+         weighted avg     0.8099    0.7797    0.7913      4988
+
+ 2023-10-19 03:02:13,038 ----------------------------------------------------------------------------------------------------
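The per-iteration `lr` values in this log follow the `LinearScheduler | warmup_fraction: '0.1'` plugin: the learning rate ramps linearly from 0 to 5e-05 over the first 10% of the 4,320 total optimizer steps (432 batches × 10 epochs), then decays linearly back to 0. A minimal sketch of that schedule, assuming the standard linear warm-up/decay shape (as in `transformers.get_linear_schedule_with_warmup`); the step counts and base rate are taken from the log above, everything else is illustrative:

```python
def linear_schedule_lr(step, base_lr=5e-5, total_steps=4320, warmup_fraction=0.1):
    """Linear warm-up to base_lr, then linear decay to 0 (assumed scheduler shape)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 432 steps for this run
    if step < warmup_steps:
        return base_lr * step / warmup_steps           # warm-up ramp
    # linear decay over the remaining steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Reproduce three reporting points from the log:
print(f"{linear_schedule_lr(43):.6f}")    # epoch 1,  iter 43  -> 0.000005
print(f"{linear_schedule_lr(475):.6f}")   # epoch 2,  iter 43  -> 0.000049
print(f"{linear_schedule_lr(4318):.6f}")  # epoch 10, iter 430 -> 0.000000
```

The computed values agree with the logged `lr` column at the matching global steps, which is a quick way to sanity-check a scheduler configuration against a training log.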