Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +245 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b729dae517f14a725d79176e3903472182a5d84b3033165391c9962c0104c6fa
+size 443335879
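The three added lines for best-model.pt are a Git LFS pointer, not the model weights themselves. As a minimal sketch (the function name `parse_lfs_pointer` is hypothetical and not part of any library), the pointer can be read as plain key/value lines:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:b729dae517f14a725d79176e3903472182a5d84b3033165391c9962c0104c6fa
size 443335879
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size_bytes"])
```

The `size` field is the byte size of the real file stored in LFS, so the checkpoint is roughly 443 MB.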
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      11:37:57   0.0000         0.7797      0.1834    0.6461         0.6067      0.6258  0.4709
+2      11:38:33   0.0000         0.1637      0.1276    0.6577         0.7287      0.6914  0.5469
+3      11:39:13   0.0000         0.0835      0.1253    0.7248         0.7475      0.7360  0.6035
+4      11:39:50   0.0000         0.0484      0.1411    0.7235         0.7834      0.7523  0.6208
+5      11:40:30   0.0000         0.0331      0.1728    0.7615         0.7889      0.7750  0.6468
+6      11:41:09   0.0000         0.0233      0.1889    0.7589         0.7873      0.7728  0.6451
+7      11:41:48   0.0000         0.0151      0.2018    0.7480         0.8030      0.7745  0.6508
+8      11:42:26   0.0000         0.0112      0.2058    0.7655         0.7991      0.7819  0.6577
+9      11:43:05   0.0000         0.0066      0.2230    0.7719         0.7936      0.7826  0.6599
+10     11:43:43   0.0000         0.0052      0.2202    0.7805         0.8006      0.7904  0.6710
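The DEV_F1 column in loss.tsv should be the harmonic mean of DEV_PRECISION and DEV_RECALL. The sketch below checks that against values copied by hand from the table (the tuple layout and variable names are illustrative assumptions), and also shows that the lowest dev loss and the best dev F1 fall in different epochs:

```python
# (epoch, train_loss, dev_loss, dev_precision, dev_recall, dev_f1), copied from loss.tsv
ROWS = [
    (1, 0.7797, 0.1834, 0.6461, 0.6067, 0.6258),
    (2, 0.1637, 0.1276, 0.6577, 0.7287, 0.6914),
    (3, 0.0835, 0.1253, 0.7248, 0.7475, 0.7360),
    (4, 0.0484, 0.1411, 0.7235, 0.7834, 0.7523),
    (5, 0.0331, 0.1728, 0.7615, 0.7889, 0.7750),
    (6, 0.0233, 0.1889, 0.7589, 0.7873, 0.7728),
    (7, 0.0151, 0.2018, 0.7480, 0.8030, 0.7745),
    (8, 0.0112, 0.2058, 0.7655, 0.7991, 0.7819),
    (9, 0.0066, 0.2230, 0.7719, 0.7936, 0.7826),
    (10, 0.0052, 0.2202, 0.7805, 0.8006, 0.7904),
]


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Largest gap between the recomputed and the reported F1 (rounding noise only).
max_gap = max(abs(f1(p, r) - reported) for _, _, _, p, r, reported in ROWS)

best_f1_epoch = max(ROWS, key=lambda row: row[5])[0]
lowest_dev_loss_epoch = min(ROWS, key=lambda row: row[2])[0]
print(max_gap, best_f1_epoch, lowest_dev_loss_epoch)
```

Dev loss bottoms out early while dev F1 keeps improving, which is why the final evaluation selects the checkpoint by micro F1 and not by loss.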
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,245 @@
+2023-10-13 11:37:23,100 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,101 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=21, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-13 11:37:23,101 ----------------------------------------------------------------------------------------------------
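The printed module tree is enough to estimate the number of trainable parameters. The following is a rough sketch using only the layer shapes shown in the log; the helper names are hypothetical, and the comparison with the checkpoint size assumes 32-bit floats, which the log does not state:

```python
def linear(n_in: int, n_out: int, bias: bool = True) -> int:
    """Parameter count of a fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


def layer_norm(dim: int) -> int:
    """LayerNorm with elementwise_affine=True has a weight and a bias vector."""
    return 2 * dim


# Shapes read off the module printout above.
HIDDEN, FFN, LAYERS = 768, 3072, 12
VOCAB, POSITIONS, TOKEN_TYPES, TAGS = 32001, 512, 2, 21

embeddings = (VOCAB + POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)
attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)  # query, key, value, output dense
feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN) + layer_norm(HIDDEN)
encoder = LAYERS * (attention + feed_forward)
pooler = linear(HIDDEN, HIDDEN)
tag_head = linear(HIDDEN, TAGS)  # the final (linear) layer of the SequenceTagger

total = embeddings + encoder + pooler + tag_head
approx_fp32_bytes = 4 * total  # assumption: weights stored as 32-bit floats
print(total, approx_fp32_bytes)
```

Under that assumption the weights alone account for about 442.5 MB, slightly below the 443335879 bytes recorded in the LFS pointer; the remainder is presumably serialization overhead and non-weight state stored in the checkpoint.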
+2023-10-13 11:37:23,101 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
+ - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
+2023-10-13 11:37:23,101 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,101 Train: 3575 sentences
+2023-10-13 11:37:23,101 (train_with_dev=False, train_with_test=False)
+2023-10-13 11:37:23,101 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,101 Training Params:
+2023-10-13 11:37:23,101 - learning_rate: "3e-05"
+2023-10-13 11:37:23,101 - mini_batch_size: "8"
+2023-10-13 11:37:23,101 - max_epochs: "10"
+2023-10-13 11:37:23,101 - shuffle: "True"
+2023-10-13 11:37:23,101 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,101 Plugins:
+2023-10-13 11:37:23,101 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-13 11:37:23,101 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,101 Final evaluation on model from best epoch (best-model.pt)
+2023-10-13 11:37:23,101 - metric: "('micro avg', 'f1-score')"
+2023-10-13 11:37:23,101 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,101 Computation:
+2023-10-13 11:37:23,101 - compute on device: cuda:0
+2023-10-13 11:37:23,102 - embedding storage: none
+2023-10-13 11:37:23,102 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,102 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-13 11:37:23,102 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:23,102 ----------------------------------------------------------------------------------------------------
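The training parameters and the `LinearScheduler` plugin with `warmup_fraction: '0.1'` suggest a linear warmup to the peak learning rate followed by linear decay to zero. The sketch below reconstructs such a schedule from the logged settings; it is an assumption about the schedule's shape, not Flair's actual implementation, and the exact step at which Flair samples the logged `lr` may differ by one:

```python
import math

# Settings taken from the "Training Params" and "Plugins" sections of the log.
PEAK_LR = 3e-05
BATCH_SIZE = 8
TRAIN_SENTENCES = 3575
EPOCHS = 10
WARMUP_FRACTION = 0.1

steps_per_epoch = math.ceil(TRAIN_SENTENCES / BATCH_SIZE)  # the "iter N/447" denominator
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(total_steps * WARMUP_FRACTION)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps (assumed linear warmup and decay)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


for step in (44, 220, warmup_steps, total_steps):
    print(step, f"{lr_at(step):.6f}")
```

With these numbers the warmup spans exactly the first epoch (447 of 4470 steps), which is consistent with the `lr` column rising throughout epoch 1 and falling afterwards. It also explains why loss.tsv shows LEARNING_RATE as 0.0000: values on the order of 1e-05 vanish at four decimal places.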
+2023-10-13 11:37:27,202 epoch 1 - iter 44/447 - loss 3.14196282 - time (sec): 4.10 - samples/sec: 2319.10 - lr: 0.000003 - momentum: 0.000000
+2023-10-13 11:37:29,895 epoch 1 - iter 88/447 - loss 2.47803589 - time (sec): 6.79 - samples/sec: 2559.07 - lr: 0.000006 - momentum: 0.000000
+2023-10-13 11:37:32,432 epoch 1 - iter 132/447 - loss 1.88263805 - time (sec): 9.33 - samples/sec: 2683.04 - lr: 0.000009 - momentum: 0.000000
+2023-10-13 11:37:35,166 epoch 1 - iter 176/447 - loss 1.52213191 - time (sec): 12.06 - samples/sec: 2772.84 - lr: 0.000012 - momentum: 0.000000
+2023-10-13 11:37:37,858 epoch 1 - iter 220/447 - loss 1.30514887 - time (sec): 14.75 - samples/sec: 2816.89 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 11:37:40,687 epoch 1 - iter 264/447 - loss 1.13508816 - time (sec): 17.58 - samples/sec: 2865.93 - lr: 0.000018 - momentum: 0.000000
+2023-10-13 11:37:43,389 epoch 1 - iter 308/447 - loss 1.01636293 - time (sec): 20.29 - samples/sec: 2907.63 - lr: 0.000021 - momentum: 0.000000
+2023-10-13 11:37:46,313 epoch 1 - iter 352/447 - loss 0.91617799 - time (sec): 23.21 - samples/sec: 2931.05 - lr: 0.000024 - momentum: 0.000000
+2023-10-13 11:37:49,109 epoch 1 - iter 396/447 - loss 0.84517290 - time (sec): 26.01 - samples/sec: 2934.25 - lr: 0.000027 - momentum: 0.000000
+2023-10-13 11:37:52,076 epoch 1 - iter 440/447 - loss 0.78671462 - time (sec): 28.97 - samples/sec: 2948.06 - lr: 0.000029 - momentum: 0.000000
+2023-10-13 11:37:52,494 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:37:52,495 EPOCH 1 done: loss 0.7797 - lr: 0.000029
+2023-10-13 11:37:57,330 DEV : loss 0.18337313830852509 - f1-score (micro avg) 0.6258
+2023-10-13 11:37:57,355 saving best model
+2023-10-13 11:37:57,656 ----------------------------------------------------------------------------------------------------
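The per-iteration lines follow a fixed layout, so they can be pulled into structured records for plotting or comparison across runs. A minimal sketch, assuming the line format stays exactly as shown in this log (the regular expression and the function name `parse_iter_line` are illustrative, not part of Flair):

```python
import re

# Matches lines such as:
# "... epoch 1 - iter 44/447 - loss 3.14196282 - time (sec): 4.10 - samples/sec: 2319.10 - lr: 0.000003 - ..."
ITER_LINE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<seconds>[\d.]+) - "
    r"samples/sec: (?P<rate>[\d.]+) - lr: (?P<lr>[\d.]+)"
)


def parse_iter_line(line: str):
    """Return a dict for a per-iteration log line, or None for any other line."""
    match = ITER_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "iter": int(match["iter"]),
        "total_iters": int(match["total"]),
        "loss": float(match["loss"]),
        "seconds": float(match["seconds"]),
        "samples_per_sec": float(match["rate"]),
        "lr": float(match["lr"]),
    }


SAMPLE = (
    "2023-10-13 11:37:27,202 epoch 1 - iter 44/447 - loss 3.14196282 - "
    "time (sec): 4.10 - samples/sec: 2319.10 - lr: 0.000003 - momentum: 0.000000"
)
record = parse_iter_line(SAMPLE)
other = parse_iter_line("2023-10-13 11:37:57,355 saving best model")
print(record, other)
```

Lines that do not describe an iteration (separators, `EPOCH n done`, `DEV`, `saving best model`) return `None` and can be filtered out.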
+2023-10-13 11:38:00,311 epoch 2 - iter 44/447 - loss 0.18448565 - time (sec): 2.65 - samples/sec: 3222.29 - lr: 0.000030 - momentum: 0.000000
+2023-10-13 11:38:03,060 epoch 2 - iter 88/447 - loss 0.20795048 - time (sec): 5.40 - samples/sec: 3154.56 - lr: 0.000029 - momentum: 0.000000
+2023-10-13 11:38:05,773 epoch 2 - iter 132/447 - loss 0.19839698 - time (sec): 8.12 - samples/sec: 3166.50 - lr: 0.000029 - momentum: 0.000000
+2023-10-13 11:38:08,634 epoch 2 - iter 176/447 - loss 0.19080864 - time (sec): 10.98 - samples/sec: 3105.64 - lr: 0.000029 - momentum: 0.000000
+2023-10-13 11:38:11,210 epoch 2 - iter 220/447 - loss 0.18472852 - time (sec): 13.55 - samples/sec: 3096.46 - lr: 0.000028 - momentum: 0.000000
+2023-10-13 11:38:14,043 epoch 2 - iter 264/447 - loss 0.17431429 - time (sec): 16.39 - samples/sec: 3085.73 - lr: 0.000028 - momentum: 0.000000
+2023-10-13 11:38:16,896 epoch 2 - iter 308/447 - loss 0.17213378 - time (sec): 19.24 - samples/sec: 3110.09 - lr: 0.000028 - momentum: 0.000000
+2023-10-13 11:38:19,523 epoch 2 - iter 352/447 - loss 0.17060590 - time (sec): 21.86 - samples/sec: 3102.96 - lr: 0.000027 - momentum: 0.000000
+2023-10-13 11:38:22,075 epoch 2 - iter 396/447 - loss 0.16910987 - time (sec): 24.42 - samples/sec: 3108.60 - lr: 0.000027 - momentum: 0.000000
+2023-10-13 11:38:25,003 epoch 2 - iter 440/447 - loss 0.16480794 - time (sec): 27.34 - samples/sec: 3120.88 - lr: 0.000027 - momentum: 0.000000
+2023-10-13 11:38:25,414 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:38:25,415 EPOCH 2 done: loss 0.1637 - lr: 0.000027
+2023-10-13 11:38:33,914 DEV : loss 0.1275636851787567 - f1-score (micro avg) 0.6914
+2023-10-13 11:38:33,954 saving best model
+2023-10-13 11:38:34,472 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:38:37,334 epoch 3 - iter 44/447 - loss 0.09103184 - time (sec): 2.86 - samples/sec: 2697.32 - lr: 0.000026 - momentum: 0.000000
+2023-10-13 11:38:40,169 epoch 3 - iter 88/447 - loss 0.08197685 - time (sec): 5.69 - samples/sec: 2802.74 - lr: 0.000026 - momentum: 0.000000
+2023-10-13 11:38:43,026 epoch 3 - iter 132/447 - loss 0.08754244 - time (sec): 8.55 - samples/sec: 2808.32 - lr: 0.000026 - momentum: 0.000000
+2023-10-13 11:38:46,171 epoch 3 - iter 176/447 - loss 0.08248814 - time (sec): 11.70 - samples/sec: 2810.03 - lr: 0.000025 - momentum: 0.000000
+2023-10-13 11:38:49,406 epoch 3 - iter 220/447 - loss 0.08252264 - time (sec): 14.93 - samples/sec: 2808.58 - lr: 0.000025 - momentum: 0.000000
+2023-10-13 11:38:52,266 epoch 3 - iter 264/447 - loss 0.07964287 - time (sec): 17.79 - samples/sec: 2840.47 - lr: 0.000025 - momentum: 0.000000
+2023-10-13 11:38:55,261 epoch 3 - iter 308/447 - loss 0.08228040 - time (sec): 20.79 - samples/sec: 2845.47 - lr: 0.000024 - momentum: 0.000000
+2023-10-13 11:38:58,300 epoch 3 - iter 352/447 - loss 0.08248586 - time (sec): 23.83 - samples/sec: 2845.71 - lr: 0.000024 - momentum: 0.000000
+2023-10-13 11:39:01,073 epoch 3 - iter 396/447 - loss 0.08367732 - time (sec): 26.60 - samples/sec: 2862.02 - lr: 0.000024 - momentum: 0.000000
+2023-10-13 11:39:04,286 epoch 3 - iter 440/447 - loss 0.08345599 - time (sec): 29.81 - samples/sec: 2866.59 - lr: 0.000023 - momentum: 0.000000
+2023-10-13 11:39:04,678 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:39:04,679 EPOCH 3 done: loss 0.0835 - lr: 0.000023
+2023-10-13 11:39:13,300 DEV : loss 0.12531331181526184 - f1-score (micro avg) 0.736
+2023-10-13 11:39:13,334 saving best model
+2023-10-13 11:39:13,819 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:39:16,553 epoch 4 - iter 44/447 - loss 0.06081512 - time (sec): 2.73 - samples/sec: 3280.70 - lr: 0.000023 - momentum: 0.000000
+2023-10-13 11:39:19,128 epoch 4 - iter 88/447 - loss 0.06130093 - time (sec): 5.31 - samples/sec: 3203.74 - lr: 0.000023 - momentum: 0.000000
+2023-10-13 11:39:22,055 epoch 4 - iter 132/447 - loss 0.05727770 - time (sec): 8.23 - samples/sec: 3163.36 - lr: 0.000022 - momentum: 0.000000
+2023-10-13 11:39:25,097 epoch 4 - iter 176/447 - loss 0.05437708 - time (sec): 11.28 - samples/sec: 3164.29 - lr: 0.000022 - momentum: 0.000000
+2023-10-13 11:39:27,953 epoch 4 - iter 220/447 - loss 0.05094653 - time (sec): 14.13 - samples/sec: 3138.28 - lr: 0.000022 - momentum: 0.000000
+2023-10-13 11:39:30,798 epoch 4 - iter 264/447 - loss 0.05163494 - time (sec): 16.98 - samples/sec: 3120.94 - lr: 0.000021 - momentum: 0.000000
+2023-10-13 11:39:33,386 epoch 4 - iter 308/447 - loss 0.05153137 - time (sec): 19.57 - samples/sec: 3133.65 - lr: 0.000021 - momentum: 0.000000
+2023-10-13 11:39:36,130 epoch 4 - iter 352/447 - loss 0.05076236 - time (sec): 22.31 - samples/sec: 3123.62 - lr: 0.000021 - momentum: 0.000000
+2023-10-13 11:39:38,564 epoch 4 - iter 396/447 - loss 0.04840169 - time (sec): 24.74 - samples/sec: 3106.57 - lr: 0.000020 - momentum: 0.000000
+2023-10-13 11:39:41,393 epoch 4 - iter 440/447 - loss 0.04853069 - time (sec): 27.57 - samples/sec: 3096.09 - lr: 0.000020 - momentum: 0.000000
+2023-10-13 11:39:41,794 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:39:41,794 EPOCH 4 done: loss 0.0484 - lr: 0.000020
+2023-10-13 11:39:50,889 DEV : loss 0.1411086916923523 - f1-score (micro avg) 0.7523
+2023-10-13 11:39:50,925 saving best model
+2023-10-13 11:39:51,411 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:39:54,906 epoch 5 - iter 44/447 - loss 0.03860730 - time (sec): 3.49 - samples/sec: 2760.43 - lr: 0.000020 - momentum: 0.000000
+2023-10-13 11:39:57,725 epoch 5 - iter 88/447 - loss 0.03556237 - time (sec): 6.31 - samples/sec: 2793.51 - lr: 0.000019 - momentum: 0.000000
+2023-10-13 11:40:00,832 epoch 5 - iter 132/447 - loss 0.03253877 - time (sec): 9.42 - samples/sec: 2787.23 - lr: 0.000019 - momentum: 0.000000
+2023-10-13 11:40:03,721 epoch 5 - iter 176/447 - loss 0.03527827 - time (sec): 12.31 - samples/sec: 2792.55 - lr: 0.000019 - momentum: 0.000000
+2023-10-13 11:40:06,971 epoch 5 - iter 220/447 - loss 0.03375370 - time (sec): 15.56 - samples/sec: 2788.49 - lr: 0.000018 - momentum: 0.000000
+2023-10-13 11:40:09,908 epoch 5 - iter 264/447 - loss 0.03389079 - time (sec): 18.50 - samples/sec: 2812.29 - lr: 0.000018 - momentum: 0.000000
+2023-10-13 11:40:12,805 epoch 5 - iter 308/447 - loss 0.03159410 - time (sec): 21.39 - samples/sec: 2808.51 - lr: 0.000018 - momentum: 0.000000
+2023-10-13 11:40:15,843 epoch 5 - iter 352/447 - loss 0.03133658 - time (sec): 24.43 - samples/sec: 2814.90 - lr: 0.000017 - momentum: 0.000000
+2023-10-13 11:40:18,798 epoch 5 - iter 396/447 - loss 0.03185176 - time (sec): 27.38 - samples/sec: 2799.64 - lr: 0.000017 - momentum: 0.000000
+2023-10-13 11:40:21,613 epoch 5 - iter 440/447 - loss 0.03293362 - time (sec): 30.20 - samples/sec: 2823.75 - lr: 0.000017 - momentum: 0.000000
+2023-10-13 11:40:22,058 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:40:22,058 EPOCH 5 done: loss 0.0331 - lr: 0.000017
+2023-10-13 11:40:30,534 DEV : loss 0.17277590930461884 - f1-score (micro avg) 0.775
+2023-10-13 11:40:30,562 saving best model
+2023-10-13 11:40:31,156 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:40:34,245 epoch 6 - iter 44/447 - loss 0.01641274 - time (sec): 3.09 - samples/sec: 2782.37 - lr: 0.000016 - momentum: 0.000000
+2023-10-13 11:40:36,970 epoch 6 - iter 88/447 - loss 0.02088310 - time (sec): 5.81 - samples/sec: 2784.85 - lr: 0.000016 - momentum: 0.000000
+2023-10-13 11:40:40,026 epoch 6 - iter 132/447 - loss 0.02059010 - time (sec): 8.87 - samples/sec: 2823.72 - lr: 0.000016 - momentum: 0.000000
+2023-10-13 11:40:43,022 epoch 6 - iter 176/447 - loss 0.02145822 - time (sec): 11.87 - samples/sec: 2862.37 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 11:40:45,676 epoch 6 - iter 220/447 - loss 0.02167903 - time (sec): 14.52 - samples/sec: 2861.66 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 11:40:48,507 epoch 6 - iter 264/447 - loss 0.02226561 - time (sec): 17.35 - samples/sec: 2858.77 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 11:40:51,255 epoch 6 - iter 308/447 - loss 0.02320899 - time (sec): 20.10 - samples/sec: 2855.54 - lr: 0.000014 - momentum: 0.000000
+2023-10-13 11:40:54,001 epoch 6 - iter 352/447 - loss 0.02387174 - time (sec): 22.84 - samples/sec: 2888.48 - lr: 0.000014 - momentum: 0.000000
+2023-10-13 11:40:57,475 epoch 6 - iter 396/447 - loss 0.02377590 - time (sec): 26.32 - samples/sec: 2894.39 - lr: 0.000014 - momentum: 0.000000
+2023-10-13 11:41:00,559 epoch 6 - iter 440/447 - loss 0.02324946 - time (sec): 29.40 - samples/sec: 2898.65 - lr: 0.000013 - momentum: 0.000000
+2023-10-13 11:41:00,995 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:41:00,995 EPOCH 6 done: loss 0.0233 - lr: 0.000013
+2023-10-13 11:41:09,443 DEV : loss 0.18894143402576447 - f1-score (micro avg) 0.7728
+2023-10-13 11:41:09,472 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:41:12,375 epoch 7 - iter 44/447 - loss 0.02022617 - time (sec): 2.90 - samples/sec: 3011.72 - lr: 0.000013 - momentum: 0.000000
+2023-10-13 11:41:15,181 epoch 7 - iter 88/447 - loss 0.01788001 - time (sec): 5.71 - samples/sec: 2957.18 - lr: 0.000013 - momentum: 0.000000
+2023-10-13 11:41:18,744 epoch 7 - iter 132/447 - loss 0.01554472 - time (sec): 9.27 - samples/sec: 2910.05 - lr: 0.000012 - momentum: 0.000000
+2023-10-13 11:41:21,757 epoch 7 - iter 176/447 - loss 0.01521381 - time (sec): 12.28 - samples/sec: 2875.62 - lr: 0.000012 - momentum: 0.000000
+2023-10-13 11:41:24,670 epoch 7 - iter 220/447 - loss 0.01618997 - time (sec): 15.20 - samples/sec: 2899.60 - lr: 0.000012 - momentum: 0.000000
+2023-10-13 11:41:27,422 epoch 7 - iter 264/447 - loss 0.01631437 - time (sec): 17.95 - samples/sec: 2899.23 - lr: 0.000011 - momentum: 0.000000
+2023-10-13 11:41:30,376 epoch 7 - iter 308/447 - loss 0.01443128 - time (sec): 20.90 - samples/sec: 2878.30 - lr: 0.000011 - momentum: 0.000000
+2023-10-13 11:41:33,306 epoch 7 - iter 352/447 - loss 0.01478556 - time (sec): 23.83 - samples/sec: 2873.89 - lr: 0.000011 - momentum: 0.000000
+2023-10-13 11:41:36,079 epoch 7 - iter 396/447 - loss 0.01588076 - time (sec): 26.61 - samples/sec: 2862.24 - lr: 0.000010 - momentum: 0.000000
+2023-10-13 11:41:38,818 epoch 7 - iter 440/447 - loss 0.01554793 - time (sec): 29.35 - samples/sec: 2872.11 - lr: 0.000010 - momentum: 0.000000
+2023-10-13 11:41:39,547 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:41:39,548 EPOCH 7 done: loss 0.0151 - lr: 0.000010
+2023-10-13 11:41:48,012 DEV : loss 0.20179197192192078 - f1-score (micro avg) 0.7745
+2023-10-13 11:41:48,038 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:41:51,122 epoch 8 - iter 44/447 - loss 0.01174812 - time (sec): 3.08 - samples/sec: 2779.76 - lr: 0.000010 - momentum: 0.000000
+2023-10-13 11:41:54,327 epoch 8 - iter 88/447 - loss 0.01125552 - time (sec): 6.29 - samples/sec: 2797.47 - lr: 0.000009 - momentum: 0.000000
+2023-10-13 11:41:57,342 epoch 8 - iter 132/447 - loss 0.01066657 - time (sec): 9.30 - samples/sec: 2866.41 - lr: 0.000009 - momentum: 0.000000
+2023-10-13 11:42:00,586 epoch 8 - iter 176/447 - loss 0.00942978 - time (sec): 12.55 - samples/sec: 2878.82 - lr: 0.000009 - momentum: 0.000000
+2023-10-13 11:42:03,430 epoch 8 - iter 220/447 - loss 0.01169514 - time (sec): 15.39 - samples/sec: 2852.35 - lr: 0.000008 - momentum: 0.000000
+2023-10-13 11:42:06,472 epoch 8 - iter 264/447 - loss 0.01206510 - time (sec): 18.43 - samples/sec: 2820.51 - lr: 0.000008 - momentum: 0.000000
+2023-10-13 11:42:09,416 epoch 8 - iter 308/447 - loss 0.01170503 - time (sec): 21.38 - samples/sec: 2850.99 - lr: 0.000008 - momentum: 0.000000
+2023-10-13 11:42:12,188 epoch 8 - iter 352/447 - loss 0.01173590 - time (sec): 24.15 - samples/sec: 2867.64 - lr: 0.000007 - momentum: 0.000000
+2023-10-13 11:42:15,076 epoch 8 - iter 396/447 - loss 0.01118873 - time (sec): 27.04 - samples/sec: 2862.49 - lr: 0.000007 - momentum: 0.000000
+2023-10-13 11:42:17,941 epoch 8 - iter 440/447 - loss 0.01135881 - time (sec): 29.90 - samples/sec: 2852.70 - lr: 0.000007 - momentum: 0.000000
+2023-10-13 11:42:18,361 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:42:18,361 EPOCH 8 done: loss 0.0112 - lr: 0.000007
+2023-10-13 11:42:26,959 DEV : loss 0.20578144490718842 - f1-score (micro avg) 0.7819
+2023-10-13 11:42:26,984 saving best model
+2023-10-13 11:42:27,452 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:42:30,327 epoch 9 - iter 44/447 - loss 0.01161954 - time (sec): 2.87 - samples/sec: 2837.49 - lr: 0.000006 - momentum: 0.000000
+2023-10-13 11:42:33,371 epoch 9 - iter 88/447 - loss 0.00638656 - time (sec): 5.92 - samples/sec: 2937.51 - lr: 0.000006 - momentum: 0.000000
+2023-10-13 11:42:36,454 epoch 9 - iter 132/447 - loss 0.00666210 - time (sec): 9.00 - samples/sec: 2858.86 - lr: 0.000006 - momentum: 0.000000
+2023-10-13 11:42:39,560 epoch 9 - iter 176/447 - loss 0.00568668 - time (sec): 12.10 - samples/sec: 2879.22 - lr: 0.000005 - momentum: 0.000000
+2023-10-13 11:42:42,889 epoch 9 - iter 220/447 - loss 0.00591880 - time (sec): 15.43 - samples/sec: 2829.31 - lr: 0.000005 - momentum: 0.000000
+2023-10-13 11:42:45,648 epoch 9 - iter 264/447 - loss 0.00762558 - time (sec): 18.19 - samples/sec: 2851.85 - lr: 0.000005 - momentum: 0.000000
+2023-10-13 11:42:48,749 epoch 9 - iter 308/447 - loss 0.00714813 - time (sec): 21.29 - samples/sec: 2886.12 - lr: 0.000004 - momentum: 0.000000
+2023-10-13 11:42:51,467 epoch 9 - iter 352/447 - loss 0.00676372 - time (sec): 24.01 - samples/sec: 2893.55 - lr: 0.000004 - momentum: 0.000000
+2023-10-13 11:42:54,092 epoch 9 - iter 396/447 - loss 0.00645925 - time (sec): 26.64 - samples/sec: 2905.73 - lr: 0.000004 - momentum: 0.000000
+2023-10-13 11:42:56,930 epoch 9 - iter 440/447 - loss 0.00653268 - time (sec): 29.47 - samples/sec: 2894.96 - lr: 0.000003 - momentum: 0.000000
+2023-10-13 11:42:57,338 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:42:57,338 EPOCH 9 done: loss 0.0066 - lr: 0.000003
+2023-10-13 11:43:05,394 DEV : loss 0.22298528254032135 - f1-score (micro avg) 0.7826
+2023-10-13 11:43:05,420 saving best model
+2023-10-13 11:43:05,878 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:43:08,761 epoch 10 - iter 44/447 - loss 0.00685104 - time (sec): 2.88 - samples/sec: 3017.48 - lr: 0.000003 - momentum: 0.000000
+2023-10-13 11:43:11,462 epoch 10 - iter 88/447 - loss 0.00441715 - time (sec): 5.58 - samples/sec: 2963.15 - lr: 0.000003 - momentum: 0.000000
+2023-10-13 11:43:14,101 epoch 10 - iter 132/447 - loss 0.00480244 - time (sec): 8.22 - samples/sec: 3045.37 - lr: 0.000002 - momentum: 0.000000
+2023-10-13 11:43:16,895 epoch 10 - iter 176/447 - loss 0.00463435 - time (sec): 11.01 - samples/sec: 3056.83 - lr: 0.000002 - momentum: 0.000000
+2023-10-13 11:43:19,970 epoch 10 - iter 220/447 - loss 0.00571713 - time (sec): 14.09 - samples/sec: 3038.35 - lr: 0.000002 - momentum: 0.000000
+2023-10-13 11:43:23,223 epoch 10 - iter 264/447 - loss 0.00547250 - time (sec): 17.34 - samples/sec: 2971.42 - lr: 0.000001 - momentum: 0.000000
+2023-10-13 11:43:26,251 epoch 10 - iter 308/447 - loss 0.00523613 - time (sec): 20.37 - samples/sec: 2963.30 - lr: 0.000001 - momentum: 0.000000
+2023-10-13 11:43:28,858 epoch 10 - iter 352/447 - loss 0.00531243 - time (sec): 22.98 - samples/sec: 2974.47 - lr: 0.000001 - momentum: 0.000000
+2023-10-13 11:43:31,533 epoch 10 - iter 396/447 - loss 0.00545590 - time (sec): 25.65 - samples/sec: 2979.26 - lr: 0.000000 - momentum: 0.000000
+2023-10-13 11:43:34,543 epoch 10 - iter 440/447 - loss 0.00532315 - time (sec): 28.66 - samples/sec: 2963.33 - lr: 0.000000 - momentum: 0.000000
+2023-10-13 11:43:35,033 ----------------------------------------------------------------------------------------------------
+2023-10-13 11:43:35,034 EPOCH 10 done: loss 0.0052 - lr: 0.000000
+2023-10-13 11:43:43,137 DEV : loss 0.2202497124671936 - f1-score (micro avg) 0.7904
+2023-10-13 11:43:43,163 saving best model
+2023-10-13 11:43:44,012 ----------------------------------------------------------------------------------------------------
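Across the ten epochs, `saving best model` appears only after some of the `DEV` lines. That pattern matches a simple rule: keep a checkpoint whenever the dev micro F1 beats the best value seen so far. The sketch below replays that rule on the logged scores; the strict "greater than" comparison and the function name are assumptions, since the log does not show how Flair compares scores:

```python
# Dev micro-F1 per epoch, copied from the "DEV :" lines of the log.
DEV_F1 = [0.6258, 0.6914, 0.7360, 0.7523, 0.7750, 0.7728, 0.7745, 0.7819, 0.7826, 0.7904]


def epochs_that_save(scores):
    """Return the 1-based epochs whose score strictly improves on the running best."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            saved.append(epoch)
    return saved


saved_epochs = epochs_that_save(DEV_F1)
print(saved_epochs)
```

Epochs 6 and 7 fall short of the epoch 5 score, which agrees with the missing `saving best model` lines after those two epochs. The final best-model.pt therefore comes from epoch 10.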
+2023-10-13 11:43:44,013 Loading model from best epoch ...
+2023-10-13 11:43:45,811 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
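The 21 tags are the BIOES encoding of five entity types plus the outside tag `O` (4 x 5 + 1 = 21), which is also the output size of the tagger's final linear layer. As an illustrative sketch (the decoder below is a simplified stand-in, not Flair's own span extraction), the dictionary can be rebuilt and a tag sequence turned back into entity spans:

```python
ENTITY_TYPES = ["loc", "pers", "org", "prod", "time"]

# Rebuild the tag dictionary in the order printed by the log.
TAGS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in "SBEI"]


def decode_bioes(labels):
    """Convert a BIOES label sequence into (start, end_exclusive, type) spans.

    Ill-formed sequences (for example an E- tag without a matching B-) are dropped.
    """
    spans = []
    open_span = None  # (start index, entity type) of a span begun by a B- tag
    for i, label in enumerate(labels):
        prefix, _, etype = label.partition("-")
        if prefix == "S":
            spans.append((i, i + 1, etype))
            open_span = None
        elif prefix == "B":
            open_span = (i, etype)
        elif prefix == "I":
            if open_span is None or open_span[1] != etype:
                open_span = None
        elif prefix == "E":
            if open_span is not None and open_span[1] == etype:
                spans.append((open_span[0], i + 1, etype))
            open_span = None
        else:  # "O"
            open_span = None
    return spans


example = ["O", "B-pers", "I-pers", "E-pers", "O", "S-loc"]
spans = decode_bioes(example)
print(len(TAGS), spans)
```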
+2023-10-13 11:43:50,442
+Results:
+- F-score (micro) 0.7564
+- F-score (macro) 0.6816
+- Accuracy 0.6279
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.8413    0.8540    0.8476       596
+        pers     0.6805    0.7868    0.7298       333
+         org     0.4885    0.4848    0.4867       132
+        prod     0.6852    0.5606    0.6167        66
+        time     0.7200    0.7347    0.7273        49
+
+   micro avg     0.7412    0.7721    0.7564      1176
+   macro avg     0.6831    0.6842    0.6816      1176
+weighted avg     0.7424    0.7721    0.7558      1176
+
+2023-10-13 11:43:50,442 ----------------------------------------------------------------------------------------------------
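The macro and weighted averages in the final report can be recomputed from the per-class rows: the macro average is the unweighted mean over the five classes, and the weighted average weights each class by its support. A small sketch using the values from the test-set table (the dictionary layout is an illustrative choice):

```python
# class -> (precision, recall, f1, support), copied from the "By class" table
REPORT = {
    "loc": (0.8413, 0.8540, 0.8476, 596),
    "pers": (0.6805, 0.7868, 0.7298, 333),
    "org": (0.4885, 0.4848, 0.4867, 132),
    "prod": (0.6852, 0.5606, 0.6167, 66),
    "time": (0.7200, 0.7347, 0.7273, 49),
}

total_support = sum(support for _, _, _, support in REPORT.values())

# Macro average: every class counts equally, regardless of how often it occurs.
macro_f1 = sum(f1 for _, _, f1, _ in REPORT.values()) / len(REPORT)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in REPORT.values()) / total_support

print(total_support, round(macro_f1, 4), round(weighted_f1, 4))
```

The macro F1 (0.6816) sits well below the micro F1 (0.7564) because the low-support `org` and `prod` classes score poorly, while the frequent `loc` class dominates the micro average. The micro average itself cannot be recovered from this table alone, since it needs the raw true-positive, false-positive, and false-negative counts.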