stefan-it committed on
Commit 52e991d · 1 Parent(s): 0cb1130

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:579d4039d87d4a309bd48c8f69a495d4779f4a01c96e7a947d19116c423c7686
+ size 19045922
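The checkpoint itself is stored through Git LFS, so the blob committed here is only a small pointer with `version`, `oid`, and `size` fields; the real ~19 MB weights are fetched from LFS storage on checkout (or by `huggingface_hub` on download). A minimal sketch, using plain Python and no Git LFS tooling, of reading those fields from a pointer file:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a git-lfs pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content committed for best-model.pt:
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:579d4039d87d4a309bd48c8f69a495d4779f4a01c96e7a947d19116c423c7686\n"
    "size 19045922\n"
)
info = parse_lfs_pointer(pointer)
print(info["size"])  # 19045922
```

The pointer identifies the weights only by their SHA-256 digest; nothing model-specific lives in the repo history itself.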
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 00:35:27 0.0000 0.6964 0.1841 0.2006 0.1510 0.1723 0.0973
+ 2 00:36:28 0.0000 0.1886 0.1665 0.3351 0.4416 0.3810 0.2446
+ 3 00:37:28 0.0000 0.1594 0.1696 0.3783 0.3661 0.3721 0.2356
+ 4 00:38:29 0.0000 0.1470 0.1681 0.3941 0.5618 0.4632 0.3127
+ 5 00:39:29 0.0000 0.1339 0.1747 0.4094 0.5400 0.4657 0.3157
+ 6 00:40:30 0.0000 0.1257 0.1802 0.4145 0.5767 0.4823 0.3298
+ 7 00:41:31 0.0000 0.1186 0.1852 0.4338 0.5664 0.4913 0.3363
+ 8 00:42:31 0.0000 0.1137 0.1906 0.4294 0.5847 0.4952 0.3395
+ 9 00:43:32 0.0000 0.1109 0.1927 0.4227 0.6167 0.5016 0.3468
+ 10 00:44:33 0.0000 0.1090 0.1961 0.4253 0.6121 0.5019 0.3470
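Flair writes one row per epoch to loss.tsv, and the best dev F1 here only arrives in the final epoch. A small sketch of scanning the dev-F1 column with the standard library (only the EPOCH and DEV_F1 columns are reproduced inline; the real file is tab-separated, as the .tsv extension suggests):

```python
import csv
import io

# EPOCH and DEV_F1 columns from loss.tsv above (other columns omitted).
LOSS_TSV = "\n".join([
    "EPOCH\tDEV_F1",
    "1\t0.1723",
    "2\t0.3810",
    "3\t0.3721",
    "4\t0.4632",
    "5\t0.4657",
    "6\t0.4823",
    "7\t0.4913",
    "8\t0.4952",
    "9\t0.5016",
    "10\t0.5019",
])

reader = csv.DictReader(io.StringIO(LOSS_TSV), delimiter="\t")
best = max(reader, key=lambda row: float(row["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # 10 0.5019
```

Dev F1 is still climbing (if only barely) at epoch 10, which matches the trainer saving a new best model after the last epoch.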
runs/events.out.tfevents.1697675670.46dc0c540dd0.3802.8 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2af6452c2a00f3d0ea625ab1136942ca9cba9810d723d82c645deb29ceeebfa0
+ size 2030580
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,246 @@
+ 2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,183 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 128)
+ (position_embeddings): Embedding(512, 128)
+ (token_type_embeddings): Embedding(2, 128)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-1): 2 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=128, out_features=128, bias=True)
+ (key): Linear(in_features=128, out_features=128, bias=True)
+ (value): Linear(in_features=128, out_features=128, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=128, out_features=128, bias=True)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=128, out_features=512, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=512, out_features=128, bias=True)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=128, out_features=128, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=128, out_features=13, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,183 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+ 2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,183 Train: 14465 sentences
+ 2023-10-19 00:34:30,183 (train_with_dev=False, train_with_test=False)
+ 2023-10-19 00:34:30,183 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,183 Training Params:
+ 2023-10-19 00:34:30,183 - learning_rate: "3e-05"
+ 2023-10-19 00:34:30,183 - mini_batch_size: "4"
+ 2023-10-19 00:34:30,183 - max_epochs: "10"
+ 2023-10-19 00:34:30,184 - shuffle: "True"
+ 2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,184 Plugins:
+ 2023-10-19 00:34:30,184 - TensorboardLogger
+ 2023-10-19 00:34:30,184 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,184 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-19 00:34:30,184 - metric: "('micro avg', 'f1-score')"
+ 2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,184 Computation:
+ 2023-10-19 00:34:30,184 - compute on device: cuda:0
+ 2023-10-19 00:34:30,184 - embedding storage: none
+ 2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,184 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
+ 2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,184 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:34:30,184 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-19 00:34:35,918 epoch 1 - iter 361/3617 - loss 2.91805740 - time (sec): 5.73 - samples/sec: 6815.20 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:34:41,682 epoch 1 - iter 722/3617 - loss 2.25303714 - time (sec): 11.50 - samples/sec: 6622.87 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:34:47,406 epoch 1 - iter 1083/3617 - loss 1.66388576 - time (sec): 17.22 - samples/sec: 6640.71 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:34:53,049 epoch 1 - iter 1444/3617 - loss 1.33761956 - time (sec): 22.86 - samples/sec: 6653.83 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:34:58,249 epoch 1 - iter 1805/3617 - loss 1.12626075 - time (sec): 28.06 - samples/sec: 6840.05 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:35:03,842 epoch 1 - iter 2166/3617 - loss 0.98677877 - time (sec): 33.66 - samples/sec: 6847.55 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:35:09,452 epoch 1 - iter 2527/3617 - loss 0.89109170 - time (sec): 39.27 - samples/sec: 6783.21 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:35:14,571 epoch 1 - iter 2888/3617 - loss 0.81168511 - time (sec): 44.39 - samples/sec: 6859.56 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:35:19,754 epoch 1 - iter 3249/3617 - loss 0.74884197 - time (sec): 49.57 - samples/sec: 6883.09 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:35:25,488 epoch 1 - iter 3610/3617 - loss 0.69704141 - time (sec): 55.30 - samples/sec: 6861.62 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 00:35:25,588 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:35:25,588 EPOCH 1 done: loss 0.6964 - lr: 0.000030
+ 2023-10-19 00:35:27,894 DEV : loss 0.18414448201656342 - f1-score (micro avg) 0.1723
+ 2023-10-19 00:35:27,923 saving best model
+ 2023-10-19 00:35:27,957 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:35:33,423 epoch 2 - iter 361/3617 - loss 0.20767271 - time (sec): 5.46 - samples/sec: 6898.30 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 00:35:39,165 epoch 2 - iter 722/3617 - loss 0.20857724 - time (sec): 11.21 - samples/sec: 6785.54 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 00:35:44,839 epoch 2 - iter 1083/3617 - loss 0.20034477 - time (sec): 16.88 - samples/sec: 6781.60 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 00:35:50,478 epoch 2 - iter 1444/3617 - loss 0.19662744 - time (sec): 22.52 - samples/sec: 6670.61 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 00:35:56,146 epoch 2 - iter 1805/3617 - loss 0.19587540 - time (sec): 28.19 - samples/sec: 6615.87 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 00:36:01,731 epoch 2 - iter 2166/3617 - loss 0.19490222 - time (sec): 33.77 - samples/sec: 6691.46 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 00:36:07,501 epoch 2 - iter 2527/3617 - loss 0.19331286 - time (sec): 39.54 - samples/sec: 6672.70 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 00:36:13,181 epoch 2 - iter 2888/3617 - loss 0.19208740 - time (sec): 45.22 - samples/sec: 6649.22 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:36:18,875 epoch 2 - iter 3249/3617 - loss 0.18952749 - time (sec): 50.92 - samples/sec: 6673.34 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:36:24,598 epoch 2 - iter 3610/3617 - loss 0.18852611 - time (sec): 56.64 - samples/sec: 6696.51 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 00:36:24,705 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:36:24,706 EPOCH 2 done: loss 0.1886 - lr: 0.000027
+ 2023-10-19 00:36:28,635 DEV : loss 0.1664983630180359 - f1-score (micro avg) 0.381
+ 2023-10-19 00:36:28,663 saving best model
+ 2023-10-19 00:36:28,696 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:36:34,434 epoch 3 - iter 361/3617 - loss 0.15091050 - time (sec): 5.74 - samples/sec: 6577.68 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 00:36:40,116 epoch 3 - iter 722/3617 - loss 0.15277438 - time (sec): 11.42 - samples/sec: 6633.98 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 00:36:45,517 epoch 3 - iter 1083/3617 - loss 0.15890046 - time (sec): 16.82 - samples/sec: 6773.25 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 00:36:51,398 epoch 3 - iter 1444/3617 - loss 0.16323482 - time (sec): 22.70 - samples/sec: 6688.62 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 00:36:57,123 epoch 3 - iter 1805/3617 - loss 0.15977729 - time (sec): 28.43 - samples/sec: 6707.84 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 00:37:02,800 epoch 3 - iter 2166/3617 - loss 0.16133560 - time (sec): 34.10 - samples/sec: 6683.91 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 00:37:08,622 epoch 3 - iter 2527/3617 - loss 0.16171229 - time (sec): 39.92 - samples/sec: 6680.19 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:37:14,112 epoch 3 - iter 2888/3617 - loss 0.16077563 - time (sec): 45.41 - samples/sec: 6699.91 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:37:19,821 epoch 3 - iter 3249/3617 - loss 0.15940779 - time (sec): 51.12 - samples/sec: 6685.32 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 00:37:25,531 epoch 3 - iter 3610/3617 - loss 0.15948787 - time (sec): 56.83 - samples/sec: 6671.03 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 00:37:25,641 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:37:25,641 EPOCH 3 done: loss 0.1594 - lr: 0.000023
+ 2023-10-19 00:37:28,812 DEV : loss 0.16962358355522156 - f1-score (micro avg) 0.3721
+ 2023-10-19 00:37:28,839 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:37:34,732 epoch 4 - iter 361/3617 - loss 0.14232155 - time (sec): 5.89 - samples/sec: 6302.73 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 00:37:40,531 epoch 4 - iter 722/3617 - loss 0.14496426 - time (sec): 11.69 - samples/sec: 6505.66 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 00:37:46,264 epoch 4 - iter 1083/3617 - loss 0.15214303 - time (sec): 17.42 - samples/sec: 6523.89 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 00:37:52,081 epoch 4 - iter 1444/3617 - loss 0.14974326 - time (sec): 23.24 - samples/sec: 6539.88 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 00:37:57,503 epoch 4 - iter 1805/3617 - loss 0.15035754 - time (sec): 28.66 - samples/sec: 6641.02 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 00:38:02,896 epoch 4 - iter 2166/3617 - loss 0.15039641 - time (sec): 34.06 - samples/sec: 6679.55 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:38:08,575 epoch 4 - iter 2527/3617 - loss 0.14933721 - time (sec): 39.73 - samples/sec: 6636.79 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:38:14,295 epoch 4 - iter 2888/3617 - loss 0.14723742 - time (sec): 45.46 - samples/sec: 6659.06 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 00:38:19,742 epoch 4 - iter 3249/3617 - loss 0.14686544 - time (sec): 50.90 - samples/sec: 6721.83 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 00:38:25,385 epoch 4 - iter 3610/3617 - loss 0.14710156 - time (sec): 56.55 - samples/sec: 6702.85 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 00:38:25,499 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:38:25,499 EPOCH 4 done: loss 0.1470 - lr: 0.000020
+ 2023-10-19 00:38:29,396 DEV : loss 0.16811420023441315 - f1-score (micro avg) 0.4632
+ 2023-10-19 00:38:29,424 saving best model
+ 2023-10-19 00:38:29,457 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:38:35,207 epoch 5 - iter 361/3617 - loss 0.14721800 - time (sec): 5.75 - samples/sec: 6158.21 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 00:38:41,041 epoch 5 - iter 722/3617 - loss 0.14279723 - time (sec): 11.58 - samples/sec: 6442.74 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 00:38:46,517 epoch 5 - iter 1083/3617 - loss 0.13300252 - time (sec): 17.06 - samples/sec: 6532.03 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 00:38:52,413 epoch 5 - iter 1444/3617 - loss 0.13201950 - time (sec): 22.95 - samples/sec: 6528.86 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 00:38:58,140 epoch 5 - iter 1805/3617 - loss 0.13224927 - time (sec): 28.68 - samples/sec: 6504.15 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:39:03,858 epoch 5 - iter 2166/3617 - loss 0.13286286 - time (sec): 34.40 - samples/sec: 6560.32 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:39:09,676 epoch 5 - iter 2527/3617 - loss 0.13266831 - time (sec): 40.22 - samples/sec: 6601.60 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 00:39:15,284 epoch 5 - iter 2888/3617 - loss 0.13352438 - time (sec): 45.83 - samples/sec: 6614.43 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 00:39:20,653 epoch 5 - iter 3249/3617 - loss 0.13325418 - time (sec): 51.20 - samples/sec: 6681.12 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 00:39:26,465 epoch 5 - iter 3610/3617 - loss 0.13392825 - time (sec): 57.01 - samples/sec: 6652.32 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 00:39:26,599 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:39:26,600 EPOCH 5 done: loss 0.1339 - lr: 0.000017
+ 2023-10-19 00:39:29,796 DEV : loss 0.17465609312057495 - f1-score (micro avg) 0.4657
+ 2023-10-19 00:39:29,825 saving best model
+ 2023-10-19 00:39:29,859 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:39:35,592 epoch 6 - iter 361/3617 - loss 0.12127706 - time (sec): 5.73 - samples/sec: 6721.20 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 00:39:41,260 epoch 6 - iter 722/3617 - loss 0.11778469 - time (sec): 11.40 - samples/sec: 6597.92 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 00:39:47,020 epoch 6 - iter 1083/3617 - loss 0.11656336 - time (sec): 17.16 - samples/sec: 6622.72 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 00:39:52,551 epoch 6 - iter 1444/3617 - loss 0.12094593 - time (sec): 22.69 - samples/sec: 6627.71 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:39:58,056 epoch 6 - iter 1805/3617 - loss 0.12103503 - time (sec): 28.20 - samples/sec: 6695.63 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:40:03,764 epoch 6 - iter 2166/3617 - loss 0.12382929 - time (sec): 33.90 - samples/sec: 6676.97 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 00:40:09,505 epoch 6 - iter 2527/3617 - loss 0.12571836 - time (sec): 39.65 - samples/sec: 6688.00 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 00:40:15,239 epoch 6 - iter 2888/3617 - loss 0.12660519 - time (sec): 45.38 - samples/sec: 6656.01 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 00:40:21,099 epoch 6 - iter 3249/3617 - loss 0.12660343 - time (sec): 51.24 - samples/sec: 6643.48 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 00:40:26,893 epoch 6 - iter 3610/3617 - loss 0.12570085 - time (sec): 57.03 - samples/sec: 6643.19 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 00:40:27,007 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:40:27,008 EPOCH 6 done: loss 0.1257 - lr: 0.000013
+ 2023-10-19 00:40:30,241 DEV : loss 0.18021412193775177 - f1-score (micro avg) 0.4823
+ 2023-10-19 00:40:30,269 saving best model
+ 2023-10-19 00:40:30,301 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:40:36,131 epoch 7 - iter 361/3617 - loss 0.12149979 - time (sec): 5.83 - samples/sec: 6622.41 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 00:40:41,749 epoch 7 - iter 722/3617 - loss 0.11531578 - time (sec): 11.45 - samples/sec: 6667.10 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 00:40:47,463 epoch 7 - iter 1083/3617 - loss 0.11569164 - time (sec): 17.16 - samples/sec: 6666.03 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:40:52,785 epoch 7 - iter 1444/3617 - loss 0.11579758 - time (sec): 22.48 - samples/sec: 6774.95 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:40:58,121 epoch 7 - iter 1805/3617 - loss 0.11667595 - time (sec): 27.82 - samples/sec: 6819.17 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 00:41:03,894 epoch 7 - iter 2166/3617 - loss 0.12102352 - time (sec): 33.59 - samples/sec: 6764.86 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 00:41:09,701 epoch 7 - iter 2527/3617 - loss 0.12119387 - time (sec): 39.40 - samples/sec: 6738.92 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 00:41:15,459 epoch 7 - iter 2888/3617 - loss 0.12026071 - time (sec): 45.16 - samples/sec: 6692.51 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 00:41:21,204 epoch 7 - iter 3249/3617 - loss 0.11944450 - time (sec): 50.90 - samples/sec: 6680.71 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 00:41:27,019 epoch 7 - iter 3610/3617 - loss 0.11841485 - time (sec): 56.72 - samples/sec: 6679.74 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 00:41:27,134 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:41:27,135 EPOCH 7 done: loss 0.1186 - lr: 0.000010
+ 2023-10-19 00:41:31,039 DEV : loss 0.1851833611726761 - f1-score (micro avg) 0.4913
+ 2023-10-19 00:41:31,067 saving best model
+ 2023-10-19 00:41:31,106 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:41:36,914 epoch 8 - iter 361/3617 - loss 0.11416187 - time (sec): 5.81 - samples/sec: 6707.95 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 00:41:42,734 epoch 8 - iter 722/3617 - loss 0.10800957 - time (sec): 11.63 - samples/sec: 6724.86 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:41:48,476 epoch 8 - iter 1083/3617 - loss 0.11095705 - time (sec): 17.37 - samples/sec: 6711.31 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:41:54,248 epoch 8 - iter 1444/3617 - loss 0.10817209 - time (sec): 23.14 - samples/sec: 6666.37 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 00:42:00,062 epoch 8 - iter 1805/3617 - loss 0.11279132 - time (sec): 28.96 - samples/sec: 6669.85 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 00:42:05,767 epoch 8 - iter 2166/3617 - loss 0.11285781 - time (sec): 34.66 - samples/sec: 6670.09 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 00:42:11,517 epoch 8 - iter 2527/3617 - loss 0.11301960 - time (sec): 40.41 - samples/sec: 6634.57 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 00:42:17,117 epoch 8 - iter 2888/3617 - loss 0.11249704 - time (sec): 46.01 - samples/sec: 6632.10 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 00:42:22,519 epoch 8 - iter 3249/3617 - loss 0.11359347 - time (sec): 51.41 - samples/sec: 6671.77 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 00:42:28,116 epoch 8 - iter 3610/3617 - loss 0.11389018 - time (sec): 57.01 - samples/sec: 6649.67 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 00:42:28,224 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:42:28,224 EPOCH 8 done: loss 0.1137 - lr: 0.000007
+ 2023-10-19 00:42:31,463 DEV : loss 0.19062528014183044 - f1-score (micro avg) 0.4952
+ 2023-10-19 00:42:31,491 saving best model
+ 2023-10-19 00:42:31,528 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:42:37,260 epoch 9 - iter 361/3617 - loss 0.10534011 - time (sec): 5.73 - samples/sec: 6685.34 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:42:42,983 epoch 9 - iter 722/3617 - loss 0.11077406 - time (sec): 11.45 - samples/sec: 6598.82 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:42:48,253 epoch 9 - iter 1083/3617 - loss 0.10676885 - time (sec): 16.72 - samples/sec: 6791.81 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 00:42:54,088 epoch 9 - iter 1444/3617 - loss 0.10622678 - time (sec): 22.56 - samples/sec: 6690.49 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 00:42:59,808 epoch 9 - iter 1805/3617 - loss 0.10776910 - time (sec): 28.28 - samples/sec: 6707.15 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 00:43:05,595 epoch 9 - iter 2166/3617 - loss 0.11026578 - time (sec): 34.07 - samples/sec: 6654.23 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 00:43:11,285 epoch 9 - iter 2527/3617 - loss 0.11034727 - time (sec): 39.76 - samples/sec: 6657.01 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 00:43:17,141 epoch 9 - iter 2888/3617 - loss 0.11070044 - time (sec): 45.61 - samples/sec: 6632.25 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 00:43:22,779 epoch 9 - iter 3249/3617 - loss 0.10960559 - time (sec): 51.25 - samples/sec: 6625.34 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 00:43:28,625 epoch 9 - iter 3610/3617 - loss 0.11083827 - time (sec): 57.10 - samples/sec: 6642.00 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:43:28,729 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:43:28,729 EPOCH 9 done: loss 0.1109 - lr: 0.000003
+ 2023-10-19 00:43:31,979 DEV : loss 0.19269128143787384 - f1-score (micro avg) 0.5016
+ 2023-10-19 00:43:32,008 saving best model
+ 2023-10-19 00:43:32,041 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:43:38,495 epoch 10 - iter 361/3617 - loss 0.10547873 - time (sec): 6.45 - samples/sec: 5966.48 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:43:44,265 epoch 10 - iter 722/3617 - loss 0.10323462 - time (sec): 12.22 - samples/sec: 6292.25 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 00:43:49,972 epoch 10 - iter 1083/3617 - loss 0.11136052 - time (sec): 17.93 - samples/sec: 6278.56 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 00:43:55,711 epoch 10 - iter 1444/3617 - loss 0.10802696 - time (sec): 23.67 - samples/sec: 6385.53 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 00:44:01,531 epoch 10 - iter 1805/3617 - loss 0.10816177 - time (sec): 29.49 - samples/sec: 6442.13 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 00:44:07,303 epoch 10 - iter 2166/3617 - loss 0.10603407 - time (sec): 35.26 - samples/sec: 6485.02 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 00:44:13,016 epoch 10 - iter 2527/3617 - loss 0.10623744 - time (sec): 40.97 - samples/sec: 6481.74 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 00:44:18,445 epoch 10 - iter 2888/3617 - loss 0.10604407 - time (sec): 46.40 - samples/sec: 6574.08 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 00:44:24,218 epoch 10 - iter 3249/3617 - loss 0.10773950 - time (sec): 52.18 - samples/sec: 6581.15 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 00:44:29,920 epoch 10 - iter 3610/3617 - loss 0.10913884 - time (sec): 57.88 - samples/sec: 6551.05 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 00:44:30,022 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:44:30,023 EPOCH 10 done: loss 0.1090 - lr: 0.000000
+ 2023-10-19 00:44:33,292 DEV : loss 0.1960730254650116 - f1-score (micro avg) 0.5019
+ 2023-10-19 00:44:33,321 saving best model
+ 2023-10-19 00:44:33,388 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 00:44:33,389 Loading model from best epoch ...
+ 2023-10-19 00:44:33,469 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+ 2023-10-19 00:44:37,684
+ Results:
+ - F-score (micro) 0.5164
+ - F-score (macro) 0.3449
+ - Accuracy 0.36
+
+ By class:
+ precision recall f1-score support
+
+ loc 0.5194 0.6785 0.5884 591
+ pers 0.3952 0.5126 0.4463 357
+ org 0.0000 0.0000 0.0000 79
+
+ micro avg 0.4729 0.5686 0.5164 1027
+ macro avg 0.3049 0.3970 0.3449 1027
+ weighted avg 0.4363 0.5686 0.4938 1027
+
+ 2023-10-19 00:44:37,685 ----------------------------------------------------------------------------------------------------
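As a sanity check, the summary scores in the final report follow directly from the per-class table: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores (org contributes 0.0):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro avg row from the report: precision 0.4729, recall 0.5686.
micro_f1 = round(f1(0.4729, 0.5686), 4)

# Macro avg: unweighted mean of per-class F1 (loc, pers, org).
macro_f1 = round((0.5884 + 0.4463 + 0.0000) / 3, 4)

print(micro_f1, macro_f1)  # 0.5164 0.3449
```

Both match the "F-score (micro) 0.5164" and "F-score (macro) 0.3449" lines above; the macro score is dragged down by the org class, which the model never predicts correctly on the test set.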