Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +240 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ce1e3e0914bd3f4c481089177c175a7c8949cdee79cbe0b9be6a6a1f79936472
+size 443311111
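The checkpoint is stored under Git LFS, so the repository itself only holds the three-line pointer shown above (the ~443 MB weights live in LFS storage, addressed by the `oid`). Pointer files are plain `key value` lines, so they can be inspected without any LFS tooling. A minimal sketch, assuming nothing beyond the pointer text itself (the helper and variable names here are ours, not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into a {key: value} dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content of best-model.pt, as committed above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:ce1e3e0914bd3f4c481089177c175a7c8949cdee79cbe0b9be6a6a1f79936472
size 443311111"""

fields = parse_lfs_pointer(pointer)
print(fields["size"])                    # 443311111 (bytes, ~443 MB)
print(fields["oid"].startswith("sha256:"))  # True
```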
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 20:18:02 0.0000 0.2704 0.1200 0.5291 0.7586 0.6234 0.4595
+2 20:21:02 0.0000 0.0953 0.1275 0.5193 0.7986 0.6294 0.4672
+3 20:23:59 0.0000 0.0727 0.2329 0.5383 0.7471 0.6258 0.4654
+4 20:26:48 0.0000 0.0524 0.2961 0.4937 0.8032 0.6115 0.4497
+5 20:29:38 0.0000 0.0367 0.3010 0.5472 0.7300 0.6255 0.4623
+6 20:32:31 0.0000 0.0254 0.3524 0.5419 0.7471 0.6282 0.4678
+7 20:35:23 0.0000 0.0180 0.3972 0.5427 0.7414 0.6267 0.4639
+8 20:38:16 0.0000 0.0108 0.4172 0.5368 0.7677 0.6318 0.4722
+9 20:41:11 0.0000 0.0075 0.4234 0.5411 0.7609 0.6324 0.4723
+10 20:44:08 0.0000 0.0055 0.4311 0.5464 0.7609 0.6361 0.4747
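loss.tsv holds one row per epoch. Note that DEV_LOSS rises steadily after epoch 1 while the train loss keeps falling (the model is overfitting the training set), yet DEV_F1 still creeps upward; per the training.log below, model selection uses the `('micro avg', 'f1-score')` metric, not dev loss. A self-contained sketch of picking that best epoch, with the rows copied inline so it runs without the file:

```python
HEADER = ["EPOCH", "TIMESTAMP", "LEARNING_RATE", "TRAIN_LOSS", "DEV_LOSS",
          "DEV_PRECISION", "DEV_RECALL", "DEV_F1", "DEV_ACCURACY"]

# Data rows of loss.tsv, copied from the table above.
DATA = """\
1 20:18:02 0.0000 0.2704 0.1200 0.5291 0.7586 0.6234 0.4595
2 20:21:02 0.0000 0.0953 0.1275 0.5193 0.7986 0.6294 0.4672
3 20:23:59 0.0000 0.0727 0.2329 0.5383 0.7471 0.6258 0.4654
4 20:26:48 0.0000 0.0524 0.2961 0.4937 0.8032 0.6115 0.4497
5 20:29:38 0.0000 0.0367 0.3010 0.5472 0.7300 0.6255 0.4623
6 20:32:31 0.0000 0.0254 0.3524 0.5419 0.7471 0.6282 0.4678
7 20:35:23 0.0000 0.0180 0.3972 0.5427 0.7414 0.6267 0.4639
8 20:38:16 0.0000 0.0108 0.4172 0.5368 0.7677 0.6318 0.4722
9 20:41:11 0.0000 0.0075 0.4234 0.5411 0.7609 0.6324 0.4723
10 20:44:08 0.0000 0.0055 0.4311 0.5464 0.7609 0.6361 0.4747
"""

# One dict per epoch, keyed by the column names.
epochs = [dict(zip(HEADER, line.split())) for line in DATA.splitlines()]

# Highest dev micro-F1 wins; this is the checkpoint kept as best-model.pt.
best = max(epochs, key=lambda row: float(row["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # 10 0.6361
```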
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,240 @@
+2023-10-14 20:15:14,574 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,575 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,575 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+2023-10-14 20:15:14,575 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,575 Train: 14465 sentences
+2023-10-14 20:15:14,576 (train_with_dev=False, train_with_test=False)
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,576 Training Params:
+2023-10-14 20:15:14,576 - learning_rate: "3e-05"
+2023-10-14 20:15:14,576 - mini_batch_size: "4"
+2023-10-14 20:15:14,576 - max_epochs: "10"
+2023-10-14 20:15:14,576 - shuffle: "True"
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,576 Plugins:
+2023-10-14 20:15:14,576 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,576 Final evaluation on model from best epoch (best-model.pt)
+2023-10-14 20:15:14,576 - metric: "('micro avg', 'f1-score')"
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,576 Computation:
+2023-10-14 20:15:14,576 - compute on device: cuda:0
+2023-10-14 20:15:14,576 - embedding storage: none
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,576 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:14,576 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:15:30,762 epoch 1 - iter 361/3617 - loss 1.47398781 - time (sec): 16.18 - samples/sec: 2325.55 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 20:15:47,116 epoch 1 - iter 722/3617 - loss 0.83970179 - time (sec): 32.54 - samples/sec: 2319.29 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 20:16:03,219 epoch 1 - iter 1083/3617 - loss 0.62059022 - time (sec): 48.64 - samples/sec: 2290.84 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 20:16:19,390 epoch 1 - iter 1444/3617 - loss 0.50223969 - time (sec): 64.81 - samples/sec: 2291.76 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 20:16:35,953 epoch 1 - iter 1805/3617 - loss 0.42436958 - time (sec): 81.38 - samples/sec: 2313.82 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 20:16:52,015 epoch 1 - iter 2166/3617 - loss 0.37437638 - time (sec): 97.44 - samples/sec: 2319.20 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 20:17:07,780 epoch 1 - iter 2527/3617 - loss 0.33741262 - time (sec): 113.20 - samples/sec: 2348.50 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 20:17:23,796 epoch 1 - iter 2888/3617 - loss 0.31042790 - time (sec): 129.22 - samples/sec: 2349.52 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 20:17:39,981 epoch 1 - iter 3249/3617 - loss 0.28794148 - time (sec): 145.40 - samples/sec: 2345.55 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 20:17:56,334 epoch 1 - iter 3610/3617 - loss 0.27061320 - time (sec): 161.76 - samples/sec: 2344.14 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 20:17:56,632 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:17:56,633 EPOCH 1 done: loss 0.2704 - lr: 0.000030
+2023-10-14 20:18:02,040 DEV : loss 0.11999412626028061 - f1-score (micro avg) 0.6234
+2023-10-14 20:18:02,080 saving best model
+2023-10-14 20:18:02,475 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:18:21,558 epoch 2 - iter 361/3617 - loss 0.09782984 - time (sec): 19.08 - samples/sec: 2009.32 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 20:18:39,215 epoch 2 - iter 722/3617 - loss 0.09421131 - time (sec): 36.74 - samples/sec: 2056.11 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 20:18:55,988 epoch 2 - iter 1083/3617 - loss 0.09577052 - time (sec): 53.51 - samples/sec: 2106.61 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 20:19:13,211 epoch 2 - iter 1444/3617 - loss 0.09595965 - time (sec): 70.73 - samples/sec: 2158.23 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 20:19:29,358 epoch 2 - iter 1805/3617 - loss 0.09590618 - time (sec): 86.88 - samples/sec: 2185.24 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 20:19:46,839 epoch 2 - iter 2166/3617 - loss 0.09403509 - time (sec): 104.36 - samples/sec: 2192.52 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 20:20:03,705 epoch 2 - iter 2527/3617 - loss 0.09439687 - time (sec): 121.23 - samples/sec: 2211.93 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 20:20:20,276 epoch 2 - iter 2888/3617 - loss 0.09538483 - time (sec): 137.80 - samples/sec: 2216.77 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 20:20:36,508 epoch 2 - iter 3249/3617 - loss 0.09475508 - time (sec): 154.03 - samples/sec: 2215.51 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 20:20:55,584 epoch 2 - iter 3610/3617 - loss 0.09527647 - time (sec): 173.11 - samples/sec: 2191.11 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 20:20:55,956 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:20:55,956 EPOCH 2 done: loss 0.0953 - lr: 0.000027
+2023-10-14 20:21:02,854 DEV : loss 0.12750780582427979 - f1-score (micro avg) 0.6294
+2023-10-14 20:21:02,888 saving best model
+2023-10-14 20:21:03,598 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:21:22,467 epoch 3 - iter 361/3617 - loss 0.05413118 - time (sec): 18.87 - samples/sec: 1959.23 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 20:21:41,460 epoch 3 - iter 722/3617 - loss 0.06427074 - time (sec): 37.86 - samples/sec: 1973.69 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 20:21:57,897 epoch 3 - iter 1083/3617 - loss 0.07334315 - time (sec): 54.30 - samples/sec: 2076.57 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 20:22:14,087 epoch 3 - iter 1444/3617 - loss 0.07359621 - time (sec): 70.49 - samples/sec: 2131.15 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 20:22:30,386 epoch 3 - iter 1805/3617 - loss 0.07222582 - time (sec): 86.79 - samples/sec: 2165.52 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 20:22:46,657 epoch 3 - iter 2166/3617 - loss 0.07167594 - time (sec): 103.06 - samples/sec: 2195.87 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 20:23:03,231 epoch 3 - iter 2527/3617 - loss 0.07168900 - time (sec): 119.63 - samples/sec: 2219.22 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 20:23:19,692 epoch 3 - iter 2888/3617 - loss 0.07149544 - time (sec): 136.09 - samples/sec: 2230.31 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 20:23:36,017 epoch 3 - iter 3249/3617 - loss 0.07253118 - time (sec): 152.42 - samples/sec: 2240.40 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 20:23:52,572 epoch 3 - iter 3610/3617 - loss 0.07273065 - time (sec): 168.97 - samples/sec: 2244.48 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 20:23:52,880 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:23:52,880 EPOCH 3 done: loss 0.0727 - lr: 0.000023
+2023-10-14 20:23:59,342 DEV : loss 0.23288682103157043 - f1-score (micro avg) 0.6258
+2023-10-14 20:23:59,373 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:24:15,633 epoch 4 - iter 361/3617 - loss 0.05027835 - time (sec): 16.26 - samples/sec: 2260.02 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 20:24:32,140 epoch 4 - iter 722/3617 - loss 0.05104940 - time (sec): 32.77 - samples/sec: 2293.15 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 20:24:48,509 epoch 4 - iter 1083/3617 - loss 0.04972765 - time (sec): 49.13 - samples/sec: 2297.62 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 20:25:04,840 epoch 4 - iter 1444/3617 - loss 0.04948816 - time (sec): 65.47 - samples/sec: 2300.32 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 20:25:21,353 epoch 4 - iter 1805/3617 - loss 0.04959231 - time (sec): 81.98 - samples/sec: 2316.09 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 20:25:37,665 epoch 4 - iter 2166/3617 - loss 0.05039825 - time (sec): 98.29 - samples/sec: 2325.84 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 20:25:53,888 epoch 4 - iter 2527/3617 - loss 0.05098071 - time (sec): 114.51 - samples/sec: 2324.83 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 20:26:09,988 epoch 4 - iter 2888/3617 - loss 0.05275888 - time (sec): 130.61 - samples/sec: 2333.25 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 20:26:26,056 epoch 4 - iter 3249/3617 - loss 0.05280906 - time (sec): 146.68 - samples/sec: 2333.90 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 20:26:42,216 epoch 4 - iter 3610/3617 - loss 0.05251226 - time (sec): 162.84 - samples/sec: 2328.47 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 20:26:42,519 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:26:42,519 EPOCH 4 done: loss 0.0524 - lr: 0.000020
+2023-10-14 20:26:48,265 DEV : loss 0.29611918330192566 - f1-score (micro avg) 0.6115
+2023-10-14 20:26:48,298 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:27:05,568 epoch 5 - iter 361/3617 - loss 0.04203166 - time (sec): 17.27 - samples/sec: 2138.47 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 20:27:21,925 epoch 5 - iter 722/3617 - loss 0.03700812 - time (sec): 33.63 - samples/sec: 2268.87 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 20:27:38,355 epoch 5 - iter 1083/3617 - loss 0.03554194 - time (sec): 50.06 - samples/sec: 2282.52 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 20:27:54,786 epoch 5 - iter 1444/3617 - loss 0.03570610 - time (sec): 66.49 - samples/sec: 2278.40 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 20:28:11,254 epoch 5 - iter 1805/3617 - loss 0.03486095 - time (sec): 82.95 - samples/sec: 2272.79 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 20:28:27,973 epoch 5 - iter 2166/3617 - loss 0.03550990 - time (sec): 99.67 - samples/sec: 2291.81 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 20:28:44,336 epoch 5 - iter 2527/3617 - loss 0.03627469 - time (sec): 116.04 - samples/sec: 2296.38 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 20:29:00,528 epoch 5 - iter 2888/3617 - loss 0.03591614 - time (sec): 132.23 - samples/sec: 2303.16 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 20:29:16,672 epoch 5 - iter 3249/3617 - loss 0.03711630 - time (sec): 148.37 - samples/sec: 2304.33 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 20:29:32,867 epoch 5 - iter 3610/3617 - loss 0.03676579 - time (sec): 164.57 - samples/sec: 2303.59 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 20:29:33,174 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:29:33,174 EPOCH 5 done: loss 0.0367 - lr: 0.000017
+2023-10-14 20:29:38,931 DEV : loss 0.30095481872558594 - f1-score (micro avg) 0.6255
+2023-10-14 20:29:38,964 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:29:55,577 epoch 6 - iter 361/3617 - loss 0.02879097 - time (sec): 16.61 - samples/sec: 2327.19 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 20:30:11,985 epoch 6 - iter 722/3617 - loss 0.02283689 - time (sec): 33.02 - samples/sec: 2300.73 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 20:30:28,425 epoch 6 - iter 1083/3617 - loss 0.02449134 - time (sec): 49.46 - samples/sec: 2308.91 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 20:30:44,812 epoch 6 - iter 1444/3617 - loss 0.02459281 - time (sec): 65.85 - samples/sec: 2288.93 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 20:31:01,172 epoch 6 - iter 1805/3617 - loss 0.02646233 - time (sec): 82.21 - samples/sec: 2285.13 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 20:31:17,640 epoch 6 - iter 2166/3617 - loss 0.02632674 - time (sec): 98.67 - samples/sec: 2283.10 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 20:31:34,002 epoch 6 - iter 2527/3617 - loss 0.02565887 - time (sec): 115.04 - samples/sec: 2284.73 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 20:31:50,346 epoch 6 - iter 2888/3617 - loss 0.02507191 - time (sec): 131.38 - samples/sec: 2294.77 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 20:32:06,871 epoch 6 - iter 3249/3617 - loss 0.02538127 - time (sec): 147.91 - samples/sec: 2298.40 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 20:32:23,470 epoch 6 - iter 3610/3617 - loss 0.02548542 - time (sec): 164.51 - samples/sec: 2305.67 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 20:32:23,772 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:32:23,772 EPOCH 6 done: loss 0.0254 - lr: 0.000013
+2023-10-14 20:32:31,112 DEV : loss 0.35236480832099915 - f1-score (micro avg) 0.6282
+2023-10-14 20:32:31,150 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:32:48,700 epoch 7 - iter 361/3617 - loss 0.01714858 - time (sec): 17.55 - samples/sec: 2204.32 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 20:33:05,924 epoch 7 - iter 722/3617 - loss 0.01781415 - time (sec): 34.77 - samples/sec: 2200.52 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 20:33:22,908 epoch 7 - iter 1083/3617 - loss 0.01660375 - time (sec): 51.76 - samples/sec: 2206.15 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 20:33:38,636 epoch 7 - iter 1444/3617 - loss 0.01611126 - time (sec): 67.49 - samples/sec: 2258.38 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 20:33:55,037 epoch 7 - iter 1805/3617 - loss 0.01766198 - time (sec): 83.89 - samples/sec: 2270.09 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 20:34:11,293 epoch 7 - iter 2166/3617 - loss 0.01772828 - time (sec): 100.14 - samples/sec: 2272.39 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 20:34:27,657 epoch 7 - iter 2527/3617 - loss 0.01786773 - time (sec): 116.51 - samples/sec: 2275.25 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 20:34:44,293 epoch 7 - iter 2888/3617 - loss 0.01874021 - time (sec): 133.14 - samples/sec: 2289.48 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 20:35:00,684 epoch 7 - iter 3249/3617 - loss 0.01801121 - time (sec): 149.53 - samples/sec: 2286.88 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 20:35:17,050 epoch 7 - iter 3610/3617 - loss 0.01800545 - time (sec): 165.90 - samples/sec: 2287.35 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 20:35:17,351 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:35:17,352 EPOCH 7 done: loss 0.0180 - lr: 0.000010
+2023-10-14 20:35:23,896 DEV : loss 0.3972169756889343 - f1-score (micro avg) 0.6267
+2023-10-14 20:35:23,934 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:35:42,130 epoch 8 - iter 361/3617 - loss 0.00969253 - time (sec): 18.19 - samples/sec: 2088.14 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 20:35:58,861 epoch 8 - iter 722/3617 - loss 0.00894942 - time (sec): 34.93 - samples/sec: 2194.36 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 20:36:15,061 epoch 8 - iter 1083/3617 - loss 0.00849593 - time (sec): 51.12 - samples/sec: 2217.24 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 20:36:30,777 epoch 8 - iter 1444/3617 - loss 0.00914655 - time (sec): 66.84 - samples/sec: 2270.51 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 20:36:46,752 epoch 8 - iter 1805/3617 - loss 0.00874731 - time (sec): 82.82 - samples/sec: 2299.67 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 20:37:03,211 epoch 8 - iter 2166/3617 - loss 0.00988221 - time (sec): 99.28 - samples/sec: 2296.48 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 20:37:19,499 epoch 8 - iter 2527/3617 - loss 0.01064127 - time (sec): 115.56 - samples/sec: 2299.50 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 20:37:35,851 epoch 8 - iter 2888/3617 - loss 0.01050074 - time (sec): 131.92 - samples/sec: 2306.37 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 20:37:52,088 epoch 8 - iter 3249/3617 - loss 0.01062696 - time (sec): 148.15 - samples/sec: 2305.83 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 20:38:08,575 epoch 8 - iter 3610/3617 - loss 0.01078664 - time (sec): 164.64 - samples/sec: 2304.14 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 20:38:08,880 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:38:08,880 EPOCH 8 done: loss 0.0108 - lr: 0.000007
+2023-10-14 20:38:16,090 DEV : loss 0.41721734404563904 - f1-score (micro avg) 0.6318
+2023-10-14 20:38:16,127 saving best model
+2023-10-14 20:38:16,663 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:38:35,739 epoch 9 - iter 361/3617 - loss 0.01118969 - time (sec): 19.07 - samples/sec: 1994.33 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 20:38:54,004 epoch 9 - iter 722/3617 - loss 0.00841180 - time (sec): 37.34 - samples/sec: 2045.19 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 20:39:10,764 epoch 9 - iter 1083/3617 - loss 0.00775291 - time (sec): 54.10 - samples/sec: 2142.25 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 20:39:27,106 epoch 9 - iter 1444/3617 - loss 0.00707364 - time (sec): 70.44 - samples/sec: 2169.75 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 20:39:43,532 epoch 9 - iter 1805/3617 - loss 0.00759031 - time (sec): 86.87 - samples/sec: 2184.85 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 20:39:59,860 epoch 9 - iter 2166/3617 - loss 0.00773152 - time (sec): 103.19 - samples/sec: 2200.69 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 20:40:16,264 epoch 9 - iter 2527/3617 - loss 0.00812156 - time (sec): 119.60 - samples/sec: 2214.86 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 20:40:32,665 epoch 9 - iter 2888/3617 - loss 0.00788977 - time (sec): 136.00 - samples/sec: 2226.94 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 20:40:49,193 epoch 9 - iter 3249/3617 - loss 0.00758141 - time (sec): 152.53 - samples/sec: 2235.85 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 20:41:05,613 epoch 9 - iter 3610/3617 - loss 0.00754676 - time (sec): 168.95 - samples/sec: 2245.32 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 20:41:05,923 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:41:05,923 EPOCH 9 done: loss 0.0075 - lr: 0.000003
+2023-10-14 20:41:11,659 DEV : loss 0.4234275221824646 - f1-score (micro avg) 0.6324
+2023-10-14 20:41:11,697 saving best model
+2023-10-14 20:41:12,297 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:41:31,527 epoch 10 - iter 361/3617 - loss 0.00680316 - time (sec): 19.23 - samples/sec: 1971.51 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 20:41:50,552 epoch 10 - iter 722/3617 - loss 0.00602000 - time (sec): 38.25 - samples/sec: 1971.41 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 20:42:08,808 epoch 10 - iter 1083/3617 - loss 0.00486451 - time (sec): 56.51 - samples/sec: 2016.08 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 20:42:25,580 epoch 10 - iter 1444/3617 - loss 0.00458704 - time (sec): 73.28 - samples/sec: 2060.75 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 20:42:41,997 epoch 10 - iter 1805/3617 - loss 0.00443742 - time (sec): 89.70 - samples/sec: 2113.84 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 20:42:58,053 epoch 10 - iter 2166/3617 - loss 0.00538436 - time (sec): 105.75 - samples/sec: 2138.96 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 20:43:14,025 epoch 10 - iter 2527/3617 - loss 0.00544432 - time (sec): 121.72 - samples/sec: 2168.45 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 20:43:30,436 epoch 10 - iter 2888/3617 - loss 0.00544999 - time (sec): 138.14 - samples/sec: 2196.22 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 20:43:46,505 epoch 10 - iter 3249/3617 - loss 0.00524360 - time (sec): 154.20 - samples/sec: 2213.00 - lr: 0.000000 - momentum: 0.000000
+2023-10-14 20:44:02,713 epoch 10 - iter 3610/3617 - loss 0.00547411 - time (sec): 170.41 - samples/sec: 2225.38 - lr: 0.000000 - momentum: 0.000000
+2023-10-14 20:44:03,016 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:44:03,016 EPOCH 10 done: loss 0.0055 - lr: 0.000000
+2023-10-14 20:44:08,786 DEV : loss 0.43111804127693176 - f1-score (micro avg) 0.6361
+2023-10-14 20:44:08,840 saving best model
+2023-10-14 20:44:09,784 ----------------------------------------------------------------------------------------------------
+2023-10-14 20:44:09,785 Loading model from best epoch ...
+2023-10-14 20:44:11,351 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+2023-10-14 20:44:20,229
+Results:
+- F-score (micro) 0.6471
+- F-score (macro) 0.5082
+- Accuracy 0.4927
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.6190    0.7834    0.6916       591
+        pers     0.5736    0.7423    0.6471       357
+         org     0.2400    0.1519    0.1860        79
+
+   micro avg     0.5873    0.7205    0.6471      1027
+   macro avg     0.4775    0.5592    0.5082      1027
+weighted avg     0.5741    0.7205    0.6372      1027
+
+2023-10-14 20:44:20,230 ----------------------------------------------------------------------------------------------------
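The headline scores in the final report follow directly from the per-class table: micro-averaged F1 is the harmonic mean of the micro-averaged precision and recall (which pool true/false positives across all classes), while macro F1 is the unweighted mean of the per-class F1 scores, so the weak `org` class drags it well below the micro score. A quick sanity check against the numbers above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Per-class f1-scores from the report: loc, pers, org.
per_class_f1 = [0.6916, 0.6471, 0.1860]

micro = f1(0.5873, 0.7205)                     # micro-avg precision, recall
macro = sum(per_class_f1) / len(per_class_f1)  # unweighted mean over classes

print(round(micro, 4))  # 0.6471 -- matches "F-score (micro)"
print(round(macro, 4))  # 0.5082 -- matches "F-score (macro)"
```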