stefan-it committed on
Commit
07f6bbc
·
1 Parent(s): 02949bd

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +240 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce1e3e0914bd3f4c481089177c175a7c8949cdee79cbe0b9be6a6a1f79936472
+ size 443311111
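The checkpoint above is stored as a Git LFS pointer file, not the ~443 MB binary itself. A minimal sketch of reading such a pointer (the `parse_lfs_pointer` helper is hypothetical, not part of `huggingface_hub`; the pointer text is the one shown above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file (spec v1): one 'key value' pair per line."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:ce1e3e0914bd3f4c481089177c175a7c8949cdee79cbe0b9be6a6a1f79936472
size 443311111
"""

info = parse_lfs_pointer(pointer)
algo, digest = info["oid"].split(":", 1)   # hash algorithm and hex digest
size_mb = int(info["size"]) / 1e6          # actual blob size in megabytes
```

The `oid` digest is what LFS uses to address the real blob on the storage backend, so it can also serve as an integrity check after download.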
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 20:18:02 0.0000 0.2704 0.1200 0.5291 0.7586 0.6234 0.4595
+ 2 20:21:02 0.0000 0.0953 0.1275 0.5193 0.7986 0.6294 0.4672
+ 3 20:23:59 0.0000 0.0727 0.2329 0.5383 0.7471 0.6258 0.4654
+ 4 20:26:48 0.0000 0.0524 0.2961 0.4937 0.8032 0.6115 0.4497
+ 5 20:29:38 0.0000 0.0367 0.3010 0.5472 0.7300 0.6255 0.4623
+ 6 20:32:31 0.0000 0.0254 0.3524 0.5419 0.7471 0.6282 0.4678
+ 7 20:35:23 0.0000 0.0180 0.3972 0.5427 0.7414 0.6267 0.4639
+ 8 20:38:16 0.0000 0.0108 0.4172 0.5368 0.7677 0.6318 0.4722
+ 9 20:41:11 0.0000 0.0075 0.4234 0.5411 0.7609 0.6324 0.4723
+ 10 20:44:08 0.0000 0.0055 0.4311 0.5464 0.7609 0.6361 0.4747
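Flair keeps `best-model.pt` in sync with the best dev score seen so far, and in this table TRAIN_LOSS falls monotonically while DEV_LOSS climbs after epoch 1, so dev loss alone would pick the wrong checkpoint. A quick sketch (hypothetical helper code, using the rows above) that recovers which epoch the saved model comes from:

```python
# The loss.tsv content from this commit, whitespace-separated per row.
loss_tsv = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 20:18:02 0.0000 0.2704 0.1200 0.5291 0.7586 0.6234 0.4595
2 20:21:02 0.0000 0.0953 0.1275 0.5193 0.7986 0.6294 0.4672
3 20:23:59 0.0000 0.0727 0.2329 0.5383 0.7471 0.6258 0.4654
4 20:26:48 0.0000 0.0524 0.2961 0.4937 0.8032 0.6115 0.4497
5 20:29:38 0.0000 0.0367 0.3010 0.5472 0.7300 0.6255 0.4623
6 20:32:31 0.0000 0.0254 0.3524 0.5419 0.7471 0.6282 0.4678
7 20:35:23 0.0000 0.0180 0.3972 0.5427 0.7414 0.6267 0.4639
8 20:38:16 0.0000 0.0108 0.4172 0.5368 0.7677 0.6318 0.4722
9 20:41:11 0.0000 0.0075 0.4234 0.5411 0.7609 0.6324 0.4723
10 20:44:08 0.0000 0.0055 0.4311 0.5464 0.7609 0.6361 0.4747
"""

# Parse header + rows into dicts keyed by column name.
header, *rows = [line.split() for line in loss_tsv.strip().splitlines()]
epochs = [dict(zip(header, row)) for row in rows]

# Model selection is driven by the dev micro-F1 column.
best = max(epochs, key=lambda e: float(e["DEV_F1"]))
```

Here `best` is the epoch-10 row (DEV_F1 0.6361), matching the final "saving best model" entry in training.log below.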
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,240 @@
+ 2023-10-14 20:15:14,574 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,575 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,575 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+  - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+ 2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,575 Train: 14465 sentences
+ 2023-10-14 20:15:14,576 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,576 Training Params:
+ 2023-10-14 20:15:14,576 - learning_rate: "3e-05"
+ 2023-10-14 20:15:14,576 - mini_batch_size: "4"
+ 2023-10-14 20:15:14,576 - max_epochs: "10"
+ 2023-10-14 20:15:14,576 - shuffle: "True"
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,576 Plugins:
+ 2023-10-14 20:15:14,576 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,576 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 20:15:14,576 - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,576 Computation:
+ 2023-10-14 20:15:14,576 - compute on device: cuda:0
+ 2023-10-14 20:15:14,576 - embedding storage: none
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,576 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:15:30,762 epoch 1 - iter 361/3617 - loss 1.47398781 - time (sec): 16.18 - samples/sec: 2325.55 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 20:15:47,116 epoch 1 - iter 722/3617 - loss 0.83970179 - time (sec): 32.54 - samples/sec: 2319.29 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 20:16:03,219 epoch 1 - iter 1083/3617 - loss 0.62059022 - time (sec): 48.64 - samples/sec: 2290.84 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 20:16:19,390 epoch 1 - iter 1444/3617 - loss 0.50223969 - time (sec): 64.81 - samples/sec: 2291.76 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 20:16:35,953 epoch 1 - iter 1805/3617 - loss 0.42436958 - time (sec): 81.38 - samples/sec: 2313.82 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 20:16:52,015 epoch 1 - iter 2166/3617 - loss 0.37437638 - time (sec): 97.44 - samples/sec: 2319.20 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 20:17:07,780 epoch 1 - iter 2527/3617 - loss 0.33741262 - time (sec): 113.20 - samples/sec: 2348.50 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 20:17:23,796 epoch 1 - iter 2888/3617 - loss 0.31042790 - time (sec): 129.22 - samples/sec: 2349.52 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 20:17:39,981 epoch 1 - iter 3249/3617 - loss 0.28794148 - time (sec): 145.40 - samples/sec: 2345.55 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 20:17:56,334 epoch 1 - iter 3610/3617 - loss 0.27061320 - time (sec): 161.76 - samples/sec: 2344.14 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 20:17:56,632 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:17:56,633 EPOCH 1 done: loss 0.2704 - lr: 0.000030
+ 2023-10-14 20:18:02,040 DEV : loss 0.11999412626028061 - f1-score (micro avg) 0.6234
+ 2023-10-14 20:18:02,080 saving best model
+ 2023-10-14 20:18:02,475 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:18:21,558 epoch 2 - iter 361/3617 - loss 0.09782984 - time (sec): 19.08 - samples/sec: 2009.32 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-14 20:18:39,215 epoch 2 - iter 722/3617 - loss 0.09421131 - time (sec): 36.74 - samples/sec: 2056.11 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 20:18:55,988 epoch 2 - iter 1083/3617 - loss 0.09577052 - time (sec): 53.51 - samples/sec: 2106.61 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 20:19:13,211 epoch 2 - iter 1444/3617 - loss 0.09595965 - time (sec): 70.73 - samples/sec: 2158.23 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-14 20:19:29,358 epoch 2 - iter 1805/3617 - loss 0.09590618 - time (sec): 86.88 - samples/sec: 2185.24 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 20:19:46,839 epoch 2 - iter 2166/3617 - loss 0.09403509 - time (sec): 104.36 - samples/sec: 2192.52 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 20:20:03,705 epoch 2 - iter 2527/3617 - loss 0.09439687 - time (sec): 121.23 - samples/sec: 2211.93 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-14 20:20:20,276 epoch 2 - iter 2888/3617 - loss 0.09538483 - time (sec): 137.80 - samples/sec: 2216.77 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 20:20:36,508 epoch 2 - iter 3249/3617 - loss 0.09475508 - time (sec): 154.03 - samples/sec: 2215.51 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 20:20:55,584 epoch 2 - iter 3610/3617 - loss 0.09527647 - time (sec): 173.11 - samples/sec: 2191.11 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-14 20:20:55,956 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:20:55,956 EPOCH 2 done: loss 0.0953 - lr: 0.000027
+ 2023-10-14 20:21:02,854 DEV : loss 0.12750780582427979 - f1-score (micro avg) 0.6294
+ 2023-10-14 20:21:02,888 saving best model
+ 2023-10-14 20:21:03,598 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:21:22,467 epoch 3 - iter 361/3617 - loss 0.05413118 - time (sec): 18.87 - samples/sec: 1959.23 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 20:21:41,460 epoch 3 - iter 722/3617 - loss 0.06427074 - time (sec): 37.86 - samples/sec: 1973.69 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 20:21:57,897 epoch 3 - iter 1083/3617 - loss 0.07334315 - time (sec): 54.30 - samples/sec: 2076.57 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-14 20:22:14,087 epoch 3 - iter 1444/3617 - loss 0.07359621 - time (sec): 70.49 - samples/sec: 2131.15 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 20:22:30,386 epoch 3 - iter 1805/3617 - loss 0.07222582 - time (sec): 86.79 - samples/sec: 2165.52 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 20:22:46,657 epoch 3 - iter 2166/3617 - loss 0.07167594 - time (sec): 103.06 - samples/sec: 2195.87 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-14 20:23:03,231 epoch 3 - iter 2527/3617 - loss 0.07168900 - time (sec): 119.63 - samples/sec: 2219.22 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 20:23:19,692 epoch 3 - iter 2888/3617 - loss 0.07149544 - time (sec): 136.09 - samples/sec: 2230.31 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 20:23:36,017 epoch 3 - iter 3249/3617 - loss 0.07253118 - time (sec): 152.42 - samples/sec: 2240.40 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-14 20:23:52,572 epoch 3 - iter 3610/3617 - loss 0.07273065 - time (sec): 168.97 - samples/sec: 2244.48 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 20:23:52,880 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:23:52,880 EPOCH 3 done: loss 0.0727 - lr: 0.000023
+ 2023-10-14 20:23:59,342 DEV : loss 0.23288682103157043 - f1-score (micro avg) 0.6258
+ 2023-10-14 20:23:59,373 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:24:15,633 epoch 4 - iter 361/3617 - loss 0.05027835 - time (sec): 16.26 - samples/sec: 2260.02 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 20:24:32,140 epoch 4 - iter 722/3617 - loss 0.05104940 - time (sec): 32.77 - samples/sec: 2293.15 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-14 20:24:48,509 epoch 4 - iter 1083/3617 - loss 0.04972765 - time (sec): 49.13 - samples/sec: 2297.62 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 20:25:04,840 epoch 4 - iter 1444/3617 - loss 0.04948816 - time (sec): 65.47 - samples/sec: 2300.32 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 20:25:21,353 epoch 4 - iter 1805/3617 - loss 0.04959231 - time (sec): 81.98 - samples/sec: 2316.09 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-14 20:25:37,665 epoch 4 - iter 2166/3617 - loss 0.05039825 - time (sec): 98.29 - samples/sec: 2325.84 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 20:25:53,888 epoch 4 - iter 2527/3617 - loss 0.05098071 - time (sec): 114.51 - samples/sec: 2324.83 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 20:26:09,988 epoch 4 - iter 2888/3617 - loss 0.05275888 - time (sec): 130.61 - samples/sec: 2333.25 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-14 20:26:26,056 epoch 4 - iter 3249/3617 - loss 0.05280906 - time (sec): 146.68 - samples/sec: 2333.90 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 20:26:42,216 epoch 4 - iter 3610/3617 - loss 0.05251226 - time (sec): 162.84 - samples/sec: 2328.47 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 20:26:42,519 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:26:42,519 EPOCH 4 done: loss 0.0524 - lr: 0.000020
+ 2023-10-14 20:26:48,265 DEV : loss 0.29611918330192566 - f1-score (micro avg) 0.6115
+ 2023-10-14 20:26:48,298 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:27:05,568 epoch 5 - iter 361/3617 - loss 0.04203166 - time (sec): 17.27 - samples/sec: 2138.47 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-14 20:27:21,925 epoch 5 - iter 722/3617 - loss 0.03700812 - time (sec): 33.63 - samples/sec: 2268.87 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 20:27:38,355 epoch 5 - iter 1083/3617 - loss 0.03554194 - time (sec): 50.06 - samples/sec: 2282.52 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 20:27:54,786 epoch 5 - iter 1444/3617 - loss 0.03570610 - time (sec): 66.49 - samples/sec: 2278.40 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-14 20:28:11,254 epoch 5 - iter 1805/3617 - loss 0.03486095 - time (sec): 82.95 - samples/sec: 2272.79 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 20:28:27,973 epoch 5 - iter 2166/3617 - loss 0.03550990 - time (sec): 99.67 - samples/sec: 2291.81 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 20:28:44,336 epoch 5 - iter 2527/3617 - loss 0.03627469 - time (sec): 116.04 - samples/sec: 2296.38 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-14 20:29:00,528 epoch 5 - iter 2888/3617 - loss 0.03591614 - time (sec): 132.23 - samples/sec: 2303.16 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 20:29:16,672 epoch 5 - iter 3249/3617 - loss 0.03711630 - time (sec): 148.37 - samples/sec: 2304.33 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 20:29:32,867 epoch 5 - iter 3610/3617 - loss 0.03676579 - time (sec): 164.57 - samples/sec: 2303.59 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-14 20:29:33,174 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:29:33,174 EPOCH 5 done: loss 0.0367 - lr: 0.000017
+ 2023-10-14 20:29:38,931 DEV : loss 0.30095481872558594 - f1-score (micro avg) 0.6255
+ 2023-10-14 20:29:38,964 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:29:55,577 epoch 6 - iter 361/3617 - loss 0.02879097 - time (sec): 16.61 - samples/sec: 2327.19 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 20:30:11,985 epoch 6 - iter 722/3617 - loss 0.02283689 - time (sec): 33.02 - samples/sec: 2300.73 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 20:30:28,425 epoch 6 - iter 1083/3617 - loss 0.02449134 - time (sec): 49.46 - samples/sec: 2308.91 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-14 20:30:44,812 epoch 6 - iter 1444/3617 - loss 0.02459281 - time (sec): 65.85 - samples/sec: 2288.93 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 20:31:01,172 epoch 6 - iter 1805/3617 - loss 0.02646233 - time (sec): 82.21 - samples/sec: 2285.13 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 20:31:17,640 epoch 6 - iter 2166/3617 - loss 0.02632674 - time (sec): 98.67 - samples/sec: 2283.10 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-14 20:31:34,002 epoch 6 - iter 2527/3617 - loss 0.02565887 - time (sec): 115.04 - samples/sec: 2284.73 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 20:31:50,346 epoch 6 - iter 2888/3617 - loss 0.02507191 - time (sec): 131.38 - samples/sec: 2294.77 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 20:32:06,871 epoch 6 - iter 3249/3617 - loss 0.02538127 - time (sec): 147.91 - samples/sec: 2298.40 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-14 20:32:23,470 epoch 6 - iter 3610/3617 - loss 0.02548542 - time (sec): 164.51 - samples/sec: 2305.67 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 20:32:23,772 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:32:23,772 EPOCH 6 done: loss 0.0254 - lr: 0.000013
+ 2023-10-14 20:32:31,112 DEV : loss 0.35236480832099915 - f1-score (micro avg) 0.6282
+ 2023-10-14 20:32:31,150 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:32:48,700 epoch 7 - iter 361/3617 - loss 0.01714858 - time (sec): 17.55 - samples/sec: 2204.32 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 20:33:05,924 epoch 7 - iter 722/3617 - loss 0.01781415 - time (sec): 34.77 - samples/sec: 2200.52 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-14 20:33:22,908 epoch 7 - iter 1083/3617 - loss 0.01660375 - time (sec): 51.76 - samples/sec: 2206.15 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 20:33:38,636 epoch 7 - iter 1444/3617 - loss 0.01611126 - time (sec): 67.49 - samples/sec: 2258.38 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 20:33:55,037 epoch 7 - iter 1805/3617 - loss 0.01766198 - time (sec): 83.89 - samples/sec: 2270.09 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-14 20:34:11,293 epoch 7 - iter 2166/3617 - loss 0.01772828 - time (sec): 100.14 - samples/sec: 2272.39 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 20:34:27,657 epoch 7 - iter 2527/3617 - loss 0.01786773 - time (sec): 116.51 - samples/sec: 2275.25 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 20:34:44,293 epoch 7 - iter 2888/3617 - loss 0.01874021 - time (sec): 133.14 - samples/sec: 2289.48 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-14 20:35:00,684 epoch 7 - iter 3249/3617 - loss 0.01801121 - time (sec): 149.53 - samples/sec: 2286.88 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 20:35:17,050 epoch 7 - iter 3610/3617 - loss 0.01800545 - time (sec): 165.90 - samples/sec: 2287.35 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 20:35:17,351 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:35:17,352 EPOCH 7 done: loss 0.0180 - lr: 0.000010
+ 2023-10-14 20:35:23,896 DEV : loss 0.3972169756889343 - f1-score (micro avg) 0.6267
+ 2023-10-14 20:35:23,934 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:35:42,130 epoch 8 - iter 361/3617 - loss 0.00969253 - time (sec): 18.19 - samples/sec: 2088.14 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-14 20:35:58,861 epoch 8 - iter 722/3617 - loss 0.00894942 - time (sec): 34.93 - samples/sec: 2194.36 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 20:36:15,061 epoch 8 - iter 1083/3617 - loss 0.00849593 - time (sec): 51.12 - samples/sec: 2217.24 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 20:36:30,777 epoch 8 - iter 1444/3617 - loss 0.00914655 - time (sec): 66.84 - samples/sec: 2270.51 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-14 20:36:46,752 epoch 8 - iter 1805/3617 - loss 0.00874731 - time (sec): 82.82 - samples/sec: 2299.67 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 20:37:03,211 epoch 8 - iter 2166/3617 - loss 0.00988221 - time (sec): 99.28 - samples/sec: 2296.48 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 20:37:19,499 epoch 8 - iter 2527/3617 - loss 0.01064127 - time (sec): 115.56 - samples/sec: 2299.50 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-14 20:37:35,851 epoch 8 - iter 2888/3617 - loss 0.01050074 - time (sec): 131.92 - samples/sec: 2306.37 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 20:37:52,088 epoch 8 - iter 3249/3617 - loss 0.01062696 - time (sec): 148.15 - samples/sec: 2305.83 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 20:38:08,575 epoch 8 - iter 3610/3617 - loss 0.01078664 - time (sec): 164.64 - samples/sec: 2304.14 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-14 20:38:08,880 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:38:08,880 EPOCH 8 done: loss 0.0108 - lr: 0.000007
+ 2023-10-14 20:38:16,090 DEV : loss 0.41721734404563904 - f1-score (micro avg) 0.6318
+ 2023-10-14 20:38:16,127 saving best model
+ 2023-10-14 20:38:16,663 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:38:35,739 epoch 9 - iter 361/3617 - loss 0.01118969 - time (sec): 19.07 - samples/sec: 1994.33 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 20:38:54,004 epoch 9 - iter 722/3617 - loss 0.00841180 - time (sec): 37.34 - samples/sec: 2045.19 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 20:39:10,764 epoch 9 - iter 1083/3617 - loss 0.00775291 - time (sec): 54.10 - samples/sec: 2142.25 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-14 20:39:27,106 epoch 9 - iter 1444/3617 - loss 0.00707364 - time (sec): 70.44 - samples/sec: 2169.75 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 20:39:43,532 epoch 9 - iter 1805/3617 - loss 0.00759031 - time (sec): 86.87 - samples/sec: 2184.85 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 20:39:59,860 epoch 9 - iter 2166/3617 - loss 0.00773152 - time (sec): 103.19 - samples/sec: 2200.69 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-14 20:40:16,264 epoch 9 - iter 2527/3617 - loss 0.00812156 - time (sec): 119.60 - samples/sec: 2214.86 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 20:40:32,665 epoch 9 - iter 2888/3617 - loss 0.00788977 - time (sec): 136.00 - samples/sec: 2226.94 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 20:40:49,193 epoch 9 - iter 3249/3617 - loss 0.00758141 - time (sec): 152.53 - samples/sec: 2235.85 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-14 20:41:05,613 epoch 9 - iter 3610/3617 - loss 0.00754676 - time (sec): 168.95 - samples/sec: 2245.32 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 20:41:05,923 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:41:05,923 EPOCH 9 done: loss 0.0075 - lr: 0.000003
+ 2023-10-14 20:41:11,659 DEV : loss 0.4234275221824646 - f1-score (micro avg) 0.6324
+ 2023-10-14 20:41:11,697 saving best model
+ 2023-10-14 20:41:12,297 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:41:31,527 epoch 10 - iter 361/3617 - loss 0.00680316 - time (sec): 19.23 - samples/sec: 1971.51 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 20:41:50,552 epoch 10 - iter 722/3617 - loss 0.00602000 - time (sec): 38.25 - samples/sec: 1971.41 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-14 20:42:08,808 epoch 10 - iter 1083/3617 - loss 0.00486451 - time (sec): 56.51 - samples/sec: 2016.08 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 20:42:25,580 epoch 10 - iter 1444/3617 - loss 0.00458704 - time (sec): 73.28 - samples/sec: 2060.75 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 20:42:41,997 epoch 10 - iter 1805/3617 - loss 0.00443742 - time (sec): 89.70 - samples/sec: 2113.84 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-14 20:42:58,053 epoch 10 - iter 2166/3617 - loss 0.00538436 - time (sec): 105.75 - samples/sec: 2138.96 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 20:43:14,025 epoch 10 - iter 2527/3617 - loss 0.00544432 - time (sec): 121.72 - samples/sec: 2168.45 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 20:43:30,436 epoch 10 - iter 2888/3617 - loss 0.00544999 - time (sec): 138.14 - samples/sec: 2196.22 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-14 20:43:46,505 epoch 10 - iter 3249/3617 - loss 0.00524360 - time (sec): 154.20 - samples/sec: 2213.00 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 20:44:02,713 epoch 10 - iter 3610/3617 - loss 0.00547411 - time (sec): 170.41 - samples/sec: 2225.38 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-14 20:44:03,016 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:44:03,016 EPOCH 10 done: loss 0.0055 - lr: 0.000000
+ 2023-10-14 20:44:08,786 DEV : loss 0.43111804127693176 - f1-score (micro avg) 0.6361
+ 2023-10-14 20:44:08,840 saving best model
+ 2023-10-14 20:44:09,784 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 20:44:09,785 Loading model from best epoch ...
+ 2023-10-14 20:44:11,351 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+ 2023-10-14 20:44:20,229
+ Results:
+ - F-score (micro) 0.6471
+ - F-score (macro) 0.5082
+ - Accuracy 0.4927
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.6190    0.7834    0.6916       591
+         pers     0.5736    0.7423    0.6471       357
+          org     0.2400    0.1519    0.1860        79
+
+    micro avg     0.5873    0.7205    0.6471      1027
+    macro avg     0.4775    0.5592    0.5082      1027
+ weighted avg     0.5741    0.7205    0.6372      1027
+
+ 2023-10-14 20:44:20,230 ----------------------------------------------------------------------------------------------------
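As a sanity check, the macro and weighted F1 averages in the final test table can be recomputed from the per-class rows (a minimal sketch; the numbers are taken directly from the "By class" table above):

```python
# Per-class test F1 and support from the final evaluation table.
per_class = {
    "loc":  {"f1": 0.6916, "support": 591},
    "pers": {"f1": 0.6471, "support": 357},
    "org":  {"f1": 0.1860, "support": 79},
}

total = sum(c["support"] for c in per_class.values())  # 1027 test entities

# Macro average: unweighted mean over classes.
macro = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: mean over classes weighted by support.
weighted = sum(c["f1"] * c["support"] for c in per_class.values()) / total

print(round(macro, 4), round(weighted, 4))
```

The gap between them (0.5082 macro vs. 0.6372 weighted) reflects the small, poorly predicted `org` class: macro averaging gives its 0.1860 F1 the same weight as `loc` and `pers`, while weighted averaging discounts it by its 79-entity support.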