Upload folder using huggingface_hub

Files changed:
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697675670.46dc0c540dd0.3802.8 +3 -0
- test.tsv +0 -0
- training.log +246 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:579d4039d87d4a309bd48c8f69a495d4779f4a01c96e7a947d19116c423c7686
+size 19045922
dev.tsv ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+1	00:35:27	0.0000	0.6964	0.1841	0.2006	0.1510	0.1723	0.0973
+2	00:36:28	0.0000	0.1886	0.1665	0.3351	0.4416	0.3810	0.2446
+3	00:37:28	0.0000	0.1594	0.1696	0.3783	0.3661	0.3721	0.2356
+4	00:38:29	0.0000	0.1470	0.1681	0.3941	0.5618	0.4632	0.3127
+5	00:39:29	0.0000	0.1339	0.1747	0.4094	0.5400	0.4657	0.3157
+6	00:40:30	0.0000	0.1257	0.1802	0.4145	0.5767	0.4823	0.3298
+7	00:41:31	0.0000	0.1186	0.1852	0.4338	0.5664	0.4913	0.3363
+8	00:42:31	0.0000	0.1137	0.1906	0.4294	0.5847	0.4952	0.3395
+9	00:43:32	0.0000	0.1109	0.1927	0.4227	0.6167	0.5016	0.3468
+10	00:44:33	0.0000	0.1090	0.1961	0.4253	0.6121	0.5019	0.3470
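A file like the loss.tsv above is easy to inspect programmatically. Below is a minimal sketch, assuming the file is tab-separated as the `.tsv` extension suggests; the `best_epoch` helper and the inlined sample rows (copied from the table, first and last epoch only) are illustrative, not part of the repository.

```python
import csv
import io

# Two rows copied from the loss.tsv above; a real run would read the file
# with open("loss.tsv") instead of this inline string.
LOSS_TSV = (
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\t"
    "DEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY\n"
    "1\t00:35:27\t0.0000\t0.6964\t0.1841\t0.2006\t0.1510\t0.1723\t0.0973\n"
    "10\t00:44:33\t0.0000\t0.1090\t0.1961\t0.4253\t0.6121\t0.5019\t0.3470\n"
)

def best_epoch(tsv_text: str) -> tuple[int, float]:
    """Return (epoch, dev_f1) for the epoch with the highest DEV_F1."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    best = max(rows, key=lambda r: float(r["DEV_F1"]))
    return int(best["EPOCH"]), float(best["DEV_F1"])

epoch, f1 = best_epoch(LOSS_TSV)
print(epoch, f1)  # 10 0.5019, matching the final "saving best model" entry
```

On the full table this selects epoch 10 (dev F1 0.5019), consistent with the training log below, where the best model is saved again after the last epoch.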
runs/events.out.tfevents.1697675670.46dc0c540dd0.3802.8 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2af6452c2a00f3d0ea625ab1136942ca9cba9810d723d82c645deb29ceeebfa0
+size 2030580
test.tsv ADDED
The diff for this file is too large to render. See raw diff.
training.log ADDED
@@ -0,0 +1,246 @@
+2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,183 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 128)
+        (position_embeddings): Embedding(512, 128)
+        (token_type_embeddings): Embedding(2, 128)
+        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-1): 2 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=128, out_features=128, bias=True)
+                (key): Linear(in_features=128, out_features=128, bias=True)
+                (value): Linear(in_features=128, out_features=128, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=128, out_features=128, bias=True)
+                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=128, out_features=512, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=512, out_features=128, bias=True)
+              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=128, out_features=128, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=128, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,183 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,183 Train: 14465 sentences
+2023-10-19 00:34:30,183 (train_with_dev=False, train_with_test=False)
+2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,183 Training Params:
+2023-10-19 00:34:30,183 - learning_rate: "3e-05"
+2023-10-19 00:34:30,183 - mini_batch_size: "4"
+2023-10-19 00:34:30,183 - max_epochs: "10"
+2023-10-19 00:34:30,184 - shuffle: "True"
+2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,184 Plugins:
+2023-10-19 00:34:30,184 - TensorboardLogger
+2023-10-19 00:34:30,184 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,184 Final evaluation on model from best epoch (best-model.pt)
+2023-10-19 00:34:30,184 - metric: "('micro avg', 'f1-score')"
+2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,184 Computation:
+2023-10-19 00:34:30,184 - compute on device: cuda:0
+2023-10-19 00:34:30,184 - embedding storage: none
+2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,184 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:34:30,184 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-19 00:34:35,918 epoch 1 - iter 361/3617 - loss 2.91805740 - time (sec): 5.73 - samples/sec: 6815.20 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 00:34:41,682 epoch 1 - iter 722/3617 - loss 2.25303714 - time (sec): 11.50 - samples/sec: 6622.87 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 00:34:47,406 epoch 1 - iter 1083/3617 - loss 1.66388576 - time (sec): 17.22 - samples/sec: 6640.71 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 00:34:53,049 epoch 1 - iter 1444/3617 - loss 1.33761956 - time (sec): 22.86 - samples/sec: 6653.83 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 00:34:58,249 epoch 1 - iter 1805/3617 - loss 1.12626075 - time (sec): 28.06 - samples/sec: 6840.05 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 00:35:03,842 epoch 1 - iter 2166/3617 - loss 0.98677877 - time (sec): 33.66 - samples/sec: 6847.55 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 00:35:09,452 epoch 1 - iter 2527/3617 - loss 0.89109170 - time (sec): 39.27 - samples/sec: 6783.21 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 00:35:14,571 epoch 1 - iter 2888/3617 - loss 0.81168511 - time (sec): 44.39 - samples/sec: 6859.56 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 00:35:19,754 epoch 1 - iter 3249/3617 - loss 0.74884197 - time (sec): 49.57 - samples/sec: 6883.09 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 00:35:25,488 epoch 1 - iter 3610/3617 - loss 0.69704141 - time (sec): 55.30 - samples/sec: 6861.62 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 00:35:25,588 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:35:25,588 EPOCH 1 done: loss 0.6964 - lr: 0.000030
+2023-10-19 00:35:27,894 DEV : loss 0.18414448201656342 - f1-score (micro avg) 0.1723
+2023-10-19 00:35:27,923 saving best model
+2023-10-19 00:35:27,957 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:35:33,423 epoch 2 - iter 361/3617 - loss 0.20767271 - time (sec): 5.46 - samples/sec: 6898.30 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 00:35:39,165 epoch 2 - iter 722/3617 - loss 0.20857724 - time (sec): 11.21 - samples/sec: 6785.54 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 00:35:44,839 epoch 2 - iter 1083/3617 - loss 0.20034477 - time (sec): 16.88 - samples/sec: 6781.60 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 00:35:50,478 epoch 2 - iter 1444/3617 - loss 0.19662744 - time (sec): 22.52 - samples/sec: 6670.61 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 00:35:56,146 epoch 2 - iter 1805/3617 - loss 0.19587540 - time (sec): 28.19 - samples/sec: 6615.87 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 00:36:01,731 epoch 2 - iter 2166/3617 - loss 0.19490222 - time (sec): 33.77 - samples/sec: 6691.46 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 00:36:07,501 epoch 2 - iter 2527/3617 - loss 0.19331286 - time (sec): 39.54 - samples/sec: 6672.70 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 00:36:13,181 epoch 2 - iter 2888/3617 - loss 0.19208740 - time (sec): 45.22 - samples/sec: 6649.22 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 00:36:18,875 epoch 2 - iter 3249/3617 - loss 0.18952749 - time (sec): 50.92 - samples/sec: 6673.34 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 00:36:24,598 epoch 2 - iter 3610/3617 - loss 0.18852611 - time (sec): 56.64 - samples/sec: 6696.51 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 00:36:24,705 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:36:24,706 EPOCH 2 done: loss 0.1886 - lr: 0.000027
+2023-10-19 00:36:28,635 DEV : loss 0.1664983630180359 - f1-score (micro avg) 0.381
+2023-10-19 00:36:28,663 saving best model
+2023-10-19 00:36:28,696 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:36:34,434 epoch 3 - iter 361/3617 - loss 0.15091050 - time (sec): 5.74 - samples/sec: 6577.68 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 00:36:40,116 epoch 3 - iter 722/3617 - loss 0.15277438 - time (sec): 11.42 - samples/sec: 6633.98 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 00:36:45,517 epoch 3 - iter 1083/3617 - loss 0.15890046 - time (sec): 16.82 - samples/sec: 6773.25 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 00:36:51,398 epoch 3 - iter 1444/3617 - loss 0.16323482 - time (sec): 22.70 - samples/sec: 6688.62 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 00:36:57,123 epoch 3 - iter 1805/3617 - loss 0.15977729 - time (sec): 28.43 - samples/sec: 6707.84 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 00:37:02,800 epoch 3 - iter 2166/3617 - loss 0.16133560 - time (sec): 34.10 - samples/sec: 6683.91 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 00:37:08,622 epoch 3 - iter 2527/3617 - loss 0.16171229 - time (sec): 39.92 - samples/sec: 6680.19 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 00:37:14,112 epoch 3 - iter 2888/3617 - loss 0.16077563 - time (sec): 45.41 - samples/sec: 6699.91 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 00:37:19,821 epoch 3 - iter 3249/3617 - loss 0.15940779 - time (sec): 51.12 - samples/sec: 6685.32 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 00:37:25,531 epoch 3 - iter 3610/3617 - loss 0.15948787 - time (sec): 56.83 - samples/sec: 6671.03 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 00:37:25,641 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:37:25,641 EPOCH 3 done: loss 0.1594 - lr: 0.000023
+2023-10-19 00:37:28,812 DEV : loss 0.16962358355522156 - f1-score (micro avg) 0.3721
+2023-10-19 00:37:28,839 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:37:34,732 epoch 4 - iter 361/3617 - loss 0.14232155 - time (sec): 5.89 - samples/sec: 6302.73 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 00:37:40,531 epoch 4 - iter 722/3617 - loss 0.14496426 - time (sec): 11.69 - samples/sec: 6505.66 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 00:37:46,264 epoch 4 - iter 1083/3617 - loss 0.15214303 - time (sec): 17.42 - samples/sec: 6523.89 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 00:37:52,081 epoch 4 - iter 1444/3617 - loss 0.14974326 - time (sec): 23.24 - samples/sec: 6539.88 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 00:37:57,503 epoch 4 - iter 1805/3617 - loss 0.15035754 - time (sec): 28.66 - samples/sec: 6641.02 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 00:38:02,896 epoch 4 - iter 2166/3617 - loss 0.15039641 - time (sec): 34.06 - samples/sec: 6679.55 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 00:38:08,575 epoch 4 - iter 2527/3617 - loss 0.14933721 - time (sec): 39.73 - samples/sec: 6636.79 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 00:38:14,295 epoch 4 - iter 2888/3617 - loss 0.14723742 - time (sec): 45.46 - samples/sec: 6659.06 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 00:38:19,742 epoch 4 - iter 3249/3617 - loss 0.14686544 - time (sec): 50.90 - samples/sec: 6721.83 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 00:38:25,385 epoch 4 - iter 3610/3617 - loss 0.14710156 - time (sec): 56.55 - samples/sec: 6702.85 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 00:38:25,499 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:38:25,499 EPOCH 4 done: loss 0.1470 - lr: 0.000020
+2023-10-19 00:38:29,396 DEV : loss 0.16811420023441315 - f1-score (micro avg) 0.4632
+2023-10-19 00:38:29,424 saving best model
+2023-10-19 00:38:29,457 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:38:35,207 epoch 5 - iter 361/3617 - loss 0.14721800 - time (sec): 5.75 - samples/sec: 6158.21 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 00:38:41,041 epoch 5 - iter 722/3617 - loss 0.14279723 - time (sec): 11.58 - samples/sec: 6442.74 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 00:38:46,517 epoch 5 - iter 1083/3617 - loss 0.13300252 - time (sec): 17.06 - samples/sec: 6532.03 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 00:38:52,413 epoch 5 - iter 1444/3617 - loss 0.13201950 - time (sec): 22.95 - samples/sec: 6528.86 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 00:38:58,140 epoch 5 - iter 1805/3617 - loss 0.13224927 - time (sec): 28.68 - samples/sec: 6504.15 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 00:39:03,858 epoch 5 - iter 2166/3617 - loss 0.13286286 - time (sec): 34.40 - samples/sec: 6560.32 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 00:39:09,676 epoch 5 - iter 2527/3617 - loss 0.13266831 - time (sec): 40.22 - samples/sec: 6601.60 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 00:39:15,284 epoch 5 - iter 2888/3617 - loss 0.13352438 - time (sec): 45.83 - samples/sec: 6614.43 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 00:39:20,653 epoch 5 - iter 3249/3617 - loss 0.13325418 - time (sec): 51.20 - samples/sec: 6681.12 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 00:39:26,465 epoch 5 - iter 3610/3617 - loss 0.13392825 - time (sec): 57.01 - samples/sec: 6652.32 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 00:39:26,599 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:39:26,600 EPOCH 5 done: loss 0.1339 - lr: 0.000017
+2023-10-19 00:39:29,796 DEV : loss 0.17465609312057495 - f1-score (micro avg) 0.4657
+2023-10-19 00:39:29,825 saving best model
+2023-10-19 00:39:29,859 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:39:35,592 epoch 6 - iter 361/3617 - loss 0.12127706 - time (sec): 5.73 - samples/sec: 6721.20 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 00:39:41,260 epoch 6 - iter 722/3617 - loss 0.11778469 - time (sec): 11.40 - samples/sec: 6597.92 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 00:39:47,020 epoch 6 - iter 1083/3617 - loss 0.11656336 - time (sec): 17.16 - samples/sec: 6622.72 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 00:39:52,551 epoch 6 - iter 1444/3617 - loss 0.12094593 - time (sec): 22.69 - samples/sec: 6627.71 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 00:39:58,056 epoch 6 - iter 1805/3617 - loss 0.12103503 - time (sec): 28.20 - samples/sec: 6695.63 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 00:40:03,764 epoch 6 - iter 2166/3617 - loss 0.12382929 - time (sec): 33.90 - samples/sec: 6676.97 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 00:40:09,505 epoch 6 - iter 2527/3617 - loss 0.12571836 - time (sec): 39.65 - samples/sec: 6688.00 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 00:40:15,239 epoch 6 - iter 2888/3617 - loss 0.12660519 - time (sec): 45.38 - samples/sec: 6656.01 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 00:40:21,099 epoch 6 - iter 3249/3617 - loss 0.12660343 - time (sec): 51.24 - samples/sec: 6643.48 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 00:40:26,893 epoch 6 - iter 3610/3617 - loss 0.12570085 - time (sec): 57.03 - samples/sec: 6643.19 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 00:40:27,007 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:40:27,008 EPOCH 6 done: loss 0.1257 - lr: 0.000013
+2023-10-19 00:40:30,241 DEV : loss 0.18021412193775177 - f1-score (micro avg) 0.4823
+2023-10-19 00:40:30,269 saving best model
+2023-10-19 00:40:30,301 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:40:36,131 epoch 7 - iter 361/3617 - loss 0.12149979 - time (sec): 5.83 - samples/sec: 6622.41 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 00:40:41,749 epoch 7 - iter 722/3617 - loss 0.11531578 - time (sec): 11.45 - samples/sec: 6667.10 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 00:40:47,463 epoch 7 - iter 1083/3617 - loss 0.11569164 - time (sec): 17.16 - samples/sec: 6666.03 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 00:40:52,785 epoch 7 - iter 1444/3617 - loss 0.11579758 - time (sec): 22.48 - samples/sec: 6774.95 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 00:40:58,121 epoch 7 - iter 1805/3617 - loss 0.11667595 - time (sec): 27.82 - samples/sec: 6819.17 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 00:41:03,894 epoch 7 - iter 2166/3617 - loss 0.12102352 - time (sec): 33.59 - samples/sec: 6764.86 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 00:41:09,701 epoch 7 - iter 2527/3617 - loss 0.12119387 - time (sec): 39.40 - samples/sec: 6738.92 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 00:41:15,459 epoch 7 - iter 2888/3617 - loss 0.12026071 - time (sec): 45.16 - samples/sec: 6692.51 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 00:41:21,204 epoch 7 - iter 3249/3617 - loss 0.11944450 - time (sec): 50.90 - samples/sec: 6680.71 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 00:41:27,019 epoch 7 - iter 3610/3617 - loss 0.11841485 - time (sec): 56.72 - samples/sec: 6679.74 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 00:41:27,134 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:41:27,135 EPOCH 7 done: loss 0.1186 - lr: 0.000010
+2023-10-19 00:41:31,039 DEV : loss 0.1851833611726761 - f1-score (micro avg) 0.4913
+2023-10-19 00:41:31,067 saving best model
+2023-10-19 00:41:31,106 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:41:36,914 epoch 8 - iter 361/3617 - loss 0.11416187 - time (sec): 5.81 - samples/sec: 6707.95 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 00:41:42,734 epoch 8 - iter 722/3617 - loss 0.10800957 - time (sec): 11.63 - samples/sec: 6724.86 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 00:41:48,476 epoch 8 - iter 1083/3617 - loss 0.11095705 - time (sec): 17.37 - samples/sec: 6711.31 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 00:41:54,248 epoch 8 - iter 1444/3617 - loss 0.10817209 - time (sec): 23.14 - samples/sec: 6666.37 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 00:42:00,062 epoch 8 - iter 1805/3617 - loss 0.11279132 - time (sec): 28.96 - samples/sec: 6669.85 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 00:42:05,767 epoch 8 - iter 2166/3617 - loss 0.11285781 - time (sec): 34.66 - samples/sec: 6670.09 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 00:42:11,517 epoch 8 - iter 2527/3617 - loss 0.11301960 - time (sec): 40.41 - samples/sec: 6634.57 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 00:42:17,117 epoch 8 - iter 2888/3617 - loss 0.11249704 - time (sec): 46.01 - samples/sec: 6632.10 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 00:42:22,519 epoch 8 - iter 3249/3617 - loss 0.11359347 - time (sec): 51.41 - samples/sec: 6671.77 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 00:42:28,116 epoch 8 - iter 3610/3617 - loss 0.11389018 - time (sec): 57.01 - samples/sec: 6649.67 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 00:42:28,224 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:42:28,224 EPOCH 8 done: loss 0.1137 - lr: 0.000007
+2023-10-19 00:42:31,463 DEV : loss 0.19062528014183044 - f1-score (micro avg) 0.4952
+2023-10-19 00:42:31,491 saving best model
+2023-10-19 00:42:31,528 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:42:37,260 epoch 9 - iter 361/3617 - loss 0.10534011 - time (sec): 5.73 - samples/sec: 6685.34 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 00:42:42,983 epoch 9 - iter 722/3617 - loss 0.11077406 - time (sec): 11.45 - samples/sec: 6598.82 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 00:42:48,253 epoch 9 - iter 1083/3617 - loss 0.10676885 - time (sec): 16.72 - samples/sec: 6791.81 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 00:42:54,088 epoch 9 - iter 1444/3617 - loss 0.10622678 - time (sec): 22.56 - samples/sec: 6690.49 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 00:42:59,808 epoch 9 - iter 1805/3617 - loss 0.10776910 - time (sec): 28.28 - samples/sec: 6707.15 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 00:43:05,595 epoch 9 - iter 2166/3617 - loss 0.11026578 - time (sec): 34.07 - samples/sec: 6654.23 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 00:43:11,285 epoch 9 - iter 2527/3617 - loss 0.11034727 - time (sec): 39.76 - samples/sec: 6657.01 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 00:43:17,141 epoch 9 - iter 2888/3617 - loss 0.11070044 - time (sec): 45.61 - samples/sec: 6632.25 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 00:43:22,779 epoch 9 - iter 3249/3617 - loss 0.10960559 - time (sec): 51.25 - samples/sec: 6625.34 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 00:43:28,625 epoch 9 - iter 3610/3617 - loss 0.11083827 - time (sec): 57.10 - samples/sec: 6642.00 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 00:43:28,729 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:43:28,729 EPOCH 9 done: loss 0.1109 - lr: 0.000003
+2023-10-19 00:43:31,979 DEV : loss 0.19269128143787384 - f1-score (micro avg) 0.5016
+2023-10-19 00:43:32,008 saving best model
+2023-10-19 00:43:32,041 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:43:38,495 epoch 10 - iter 361/3617 - loss 0.10547873 - time (sec): 6.45 - samples/sec: 5966.48 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 00:43:44,265 epoch 10 - iter 722/3617 - loss 0.10323462 - time (sec): 12.22 - samples/sec: 6292.25 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 00:43:49,972 epoch 10 - iter 1083/3617 - loss 0.11136052 - time (sec): 17.93 - samples/sec: 6278.56 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 00:43:55,711 epoch 10 - iter 1444/3617 - loss 0.10802696 - time (sec): 23.67 - samples/sec: 6385.53 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 00:44:01,531 epoch 10 - iter 1805/3617 - loss 0.10816177 - time (sec): 29.49 - samples/sec: 6442.13 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 00:44:07,303 epoch 10 - iter 2166/3617 - loss 0.10603407 - time (sec): 35.26 - samples/sec: 6485.02 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 00:44:13,016 epoch 10 - iter 2527/3617 - loss 0.10623744 - time (sec): 40.97 - samples/sec: 6481.74 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 00:44:18,445 epoch 10 - iter 2888/3617 - loss 0.10604407 - time (sec): 46.40 - samples/sec: 6574.08 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 00:44:24,218 epoch 10 - iter 3249/3617 - loss 0.10773950 - time (sec): 52.18 - samples/sec: 6581.15 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 00:44:29,920 epoch 10 - iter 3610/3617 - loss 0.10913884 - time (sec): 57.88 - samples/sec: 6551.05 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 00:44:30,022 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:44:30,023 EPOCH 10 done: loss 0.1090 - lr: 0.000000
+2023-10-19 00:44:33,292 DEV : loss 0.1960730254650116 - f1-score (micro avg) 0.5019
+2023-10-19 00:44:33,321 saving best model
+2023-10-19 00:44:33,388 ----------------------------------------------------------------------------------------------------
+2023-10-19 00:44:33,389 Loading model from best epoch ...
+2023-10-19 00:44:33,469 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+2023-10-19 00:44:37,684
+Results:
+- F-score (micro) 0.5164
+- F-score (macro) 0.3449
+- Accuracy 0.36
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.5194    0.6785    0.5884       591
+        pers     0.3952    0.5126    0.4463       357
+         org     0.0000    0.0000    0.0000        79
+
+   micro avg     0.4729    0.5686    0.5164      1027
+   macro avg     0.3049    0.3970    0.3449      1027
+weighted avg     0.4363    0.5686    0.4938      1027
+
+2023-10-19 00:44:37,685 ----------------------------------------------------------------------------------------------------
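The aggregate scores in the final report follow directly from the per-class rows. The sketch below re-derives them using the standard F1 definitions; the values are copied from the log above, and the formulas are textbook definitions, not Flair internals.

```python
# Per-class (precision, recall, f1, support) rows copied from the test report.
per_class = {
    "loc":  (0.5194, 0.6785, 0.5884, 591),
    "pers": (0.3952, 0.5126, 0.4463, 357),
    "org":  (0.0000, 0.0000, 0.0000, 79),
}

# Micro F1: harmonic mean of the reported micro-average precision and recall.
micro_p, micro_r = 0.4729, 0.5686
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

print(round(micro_f1, 4))  # 0.5164, matching "F-score (micro)"
print(round(macro_f1, 4))  # 0.3449, matching "F-score (macro)"
```

Note how the `org` class (support 79, all scores zero) drags the macro average well below the micro average, which is dominated by the larger `loc` and `pers` classes.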