stefan-it committed on
Commit
75710eb
1 Parent(s): e66782d

Upload folder using huggingface_hub

Browse files
Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +246 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b6a590ecd5715a860468f376ff71a28394a05fa259ba7b8931f128c74d9b0e18
+ size 443335879
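The three lines above are a Git LFS pointer file, not the model weights themselves: the actual ~443 MB checkpoint is fetched from LFS storage by its `oid`. As a minimal sketch (the `parse_lfs_pointer` helper is hypothetical, not part of git-lfs or huggingface_hub), the pointer's key/value lines can be read like this:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file into a dict of its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b6a590ecd5715a860468f376ff71a28394a05fa259ba7b8931f128c74d9b0e18
size 443335879
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # file size in bytes (~443 MB)
```

Cloning the repository without git-lfs installed leaves only this small pointer on disk, which is a common cause of "model file is too small / corrupt" errors when loading the checkpoint.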
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 15:17:53 0.0000 0.5489 0.1425 0.6694 0.7411 0.7035 0.5754
+ 2 15:18:54 0.0000 0.1229 0.1122 0.7355 0.8058 0.7691 0.6559
+ 3 15:19:56 0.0000 0.0725 0.1254 0.7646 0.7887 0.7764 0.6697
+ 4 15:20:57 0.0000 0.0497 0.1647 0.7806 0.8436 0.8109 0.7055
+ 5 15:21:59 0.0000 0.0353 0.1672 0.7898 0.8373 0.8129 0.7029
+ 6 15:23:03 0.0000 0.0254 0.1982 0.8007 0.8305 0.8153 0.7228
+ 7 15:24:06 0.0000 0.0162 0.2072 0.8048 0.8288 0.8166 0.7170
+ 8 15:25:09 0.0000 0.0131 0.2221 0.8083 0.8402 0.8239 0.7309
+ 9 15:26:11 0.0000 0.0077 0.2157 0.8155 0.8454 0.8301 0.7369
+ 10 15:27:16 0.0000 0.0050 0.2240 0.8163 0.8373 0.8267 0.7328
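`loss.tsv` records one row per epoch under the header shown above. A minimal sketch of reading it and picking the best epoch by `DEV_F1` (a few rows are inlined here for illustration; the real file is assumed to be tab-separated, as its extension suggests):

```python
import csv
import io

# A subset of loss.tsv, inlined for illustration (tab-separated).
loss_tsv = (
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\tDEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY\n"
    "1\t15:17:53\t0.0000\t0.5489\t0.1425\t0.6694\t0.7411\t0.7035\t0.5754\n"
    "9\t15:26:11\t0.0000\t0.0077\t0.2157\t0.8155\t0.8454\t0.8301\t0.7369\n"
    "10\t15:27:16\t0.0000\t0.0050\t0.2240\t0.8163\t0.8373\t0.8267\t0.7328\n"
)

# In practice, replace io.StringIO(loss_tsv) with open("loss.tsv").
rows = list(csv.DictReader(io.StringIO(loss_tsv), delimiter="\t"))
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # epoch 9, dev F1 0.8301
```

With the full file, this selects epoch 9 (dev F1 0.8301), matching the "saving best model" entries in training.log: the best model is the last epoch that improved dev F1, not necessarily the final one.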
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,246 @@
+ 2023-10-13 15:16:58,271 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,272 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=21, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-13 15:16:58,272 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,272 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+ - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+ 2023-10-13 15:16:58,272 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,272 Train: 5901 sentences
+ 2023-10-13 15:16:58,272 (train_with_dev=False, train_with_test=False)
+ 2023-10-13 15:16:58,272 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,272 Training Params:
+ 2023-10-13 15:16:58,272 - learning_rate: "5e-05"
+ 2023-10-13 15:16:58,272 - mini_batch_size: "8"
+ 2023-10-13 15:16:58,272 - max_epochs: "10"
+ 2023-10-13 15:16:58,272 - shuffle: "True"
+ 2023-10-13 15:16:58,272 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,272 Plugins:
+ 2023-10-13 15:16:58,272 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-13 15:16:58,272 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,273 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-13 15:16:58,273 - metric: "('micro avg', 'f1-score')"
+ 2023-10-13 15:16:58,273 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,273 Computation:
+ 2023-10-13 15:16:58,273 - compute on device: cuda:0
+ 2023-10-13 15:16:58,273 - embedding storage: none
+ 2023-10-13 15:16:58,273 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,273 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-13 15:16:58,273 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:16:58,273 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:17:03,084 epoch 1 - iter 73/738 - loss 2.74602808 - time (sec): 4.81 - samples/sec: 3466.71 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 15:17:07,826 epoch 1 - iter 146/738 - loss 1.71361771 - time (sec): 9.55 - samples/sec: 3449.67 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 15:17:13,391 epoch 1 - iter 219/738 - loss 1.24565664 - time (sec): 15.12 - samples/sec: 3430.28 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 15:17:17,821 epoch 1 - iter 292/738 - loss 1.03626899 - time (sec): 19.55 - samples/sec: 3433.30 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 15:17:22,517 epoch 1 - iter 365/738 - loss 0.89538384 - time (sec): 24.24 - samples/sec: 3426.05 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 15:17:27,166 epoch 1 - iter 438/738 - loss 0.79251444 - time (sec): 28.89 - samples/sec: 3409.99 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 15:17:31,550 epoch 1 - iter 511/738 - loss 0.71725793 - time (sec): 33.28 - samples/sec: 3417.41 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-13 15:17:36,209 epoch 1 - iter 584/738 - loss 0.65519603 - time (sec): 37.94 - samples/sec: 3398.73 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-13 15:17:41,869 epoch 1 - iter 657/738 - loss 0.59427688 - time (sec): 43.60 - samples/sec: 3399.96 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-13 15:17:46,971 epoch 1 - iter 730/738 - loss 0.55319814 - time (sec): 48.70 - samples/sec: 3385.84 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-13 15:17:47,413 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:17:47,413 EPOCH 1 done: loss 0.5489 - lr: 0.000049
+ 2023-10-13 15:17:53,443 DEV : loss 0.14253994822502136 - f1-score (micro avg) 0.7035
+ 2023-10-13 15:17:53,476 saving best model
+ 2023-10-13 15:17:53,841 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:17:58,952 epoch 2 - iter 73/738 - loss 0.15502447 - time (sec): 5.11 - samples/sec: 3311.80 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-13 15:18:03,246 epoch 2 - iter 146/738 - loss 0.14085048 - time (sec): 9.40 - samples/sec: 3306.19 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-13 15:18:08,012 epoch 2 - iter 219/738 - loss 0.13869781 - time (sec): 14.17 - samples/sec: 3275.65 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-13 15:18:12,900 epoch 2 - iter 292/738 - loss 0.13704243 - time (sec): 19.06 - samples/sec: 3288.19 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-13 15:18:17,212 epoch 2 - iter 365/738 - loss 0.13342204 - time (sec): 23.37 - samples/sec: 3336.73 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-13 15:18:23,427 epoch 2 - iter 438/738 - loss 0.13259264 - time (sec): 29.58 - samples/sec: 3372.56 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-13 15:18:28,287 epoch 2 - iter 511/738 - loss 0.13044756 - time (sec): 34.44 - samples/sec: 3369.54 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-13 15:18:33,290 epoch 2 - iter 584/738 - loss 0.12937219 - time (sec): 39.45 - samples/sec: 3358.12 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-13 15:18:38,093 epoch 2 - iter 657/738 - loss 0.12659105 - time (sec): 44.25 - samples/sec: 3367.71 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-13 15:18:42,598 epoch 2 - iter 730/738 - loss 0.12317234 - time (sec): 48.76 - samples/sec: 3378.72 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-13 15:18:43,094 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:18:43,095 EPOCH 2 done: loss 0.1229 - lr: 0.000045
+ 2023-10-13 15:18:54,371 DEV : loss 0.11215972900390625 - f1-score (micro avg) 0.7691
+ 2023-10-13 15:18:54,406 saving best model
+ 2023-10-13 15:18:54,884 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:18:59,693 epoch 3 - iter 73/738 - loss 0.07421766 - time (sec): 4.81 - samples/sec: 3162.11 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-13 15:19:04,307 epoch 3 - iter 146/738 - loss 0.06446355 - time (sec): 9.42 - samples/sec: 3312.82 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-13 15:19:09,071 epoch 3 - iter 219/738 - loss 0.07063195 - time (sec): 14.18 - samples/sec: 3406.91 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-13 15:19:14,191 epoch 3 - iter 292/738 - loss 0.07392124 - time (sec): 19.30 - samples/sec: 3389.06 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-13 15:19:19,271 epoch 3 - iter 365/738 - loss 0.07638465 - time (sec): 24.38 - samples/sec: 3400.77 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-13 15:19:23,948 epoch 3 - iter 438/738 - loss 0.07585899 - time (sec): 29.06 - samples/sec: 3367.56 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-13 15:19:29,191 epoch 3 - iter 511/738 - loss 0.07494694 - time (sec): 34.30 - samples/sec: 3332.95 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-13 15:19:34,643 epoch 3 - iter 584/738 - loss 0.07368203 - time (sec): 39.76 - samples/sec: 3303.06 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-13 15:19:39,452 epoch 3 - iter 657/738 - loss 0.07370863 - time (sec): 44.57 - samples/sec: 3315.19 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-13 15:19:44,937 epoch 3 - iter 730/738 - loss 0.07293039 - time (sec): 50.05 - samples/sec: 3293.55 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-13 15:19:45,384 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:19:45,385 EPOCH 3 done: loss 0.0725 - lr: 0.000039
+ 2023-10-13 15:19:56,548 DEV : loss 0.12536022067070007 - f1-score (micro avg) 0.7764
+ 2023-10-13 15:19:56,576 saving best model
+ 2023-10-13 15:19:57,053 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:20:01,765 epoch 4 - iter 73/738 - loss 0.04276265 - time (sec): 4.70 - samples/sec: 3350.13 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-13 15:20:06,911 epoch 4 - iter 146/738 - loss 0.05179007 - time (sec): 9.85 - samples/sec: 3393.82 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-13 15:20:12,365 epoch 4 - iter 219/738 - loss 0.04965902 - time (sec): 15.30 - samples/sec: 3402.43 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-13 15:20:17,046 epoch 4 - iter 292/738 - loss 0.04941122 - time (sec): 19.98 - samples/sec: 3373.66 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-13 15:20:21,696 epoch 4 - iter 365/738 - loss 0.04819576 - time (sec): 24.63 - samples/sec: 3367.59 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-13 15:20:26,064 epoch 4 - iter 438/738 - loss 0.04826942 - time (sec): 29.00 - samples/sec: 3362.01 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-13 15:20:31,211 epoch 4 - iter 511/738 - loss 0.04765206 - time (sec): 34.15 - samples/sec: 3370.67 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-13 15:20:35,872 epoch 4 - iter 584/738 - loss 0.04844213 - time (sec): 38.81 - samples/sec: 3360.11 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-13 15:20:41,280 epoch 4 - iter 657/738 - loss 0.04959664 - time (sec): 44.22 - samples/sec: 3353.98 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-13 15:20:46,071 epoch 4 - iter 730/738 - loss 0.04971863 - time (sec): 49.01 - samples/sec: 3365.56 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-13 15:20:46,528 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:20:46,528 EPOCH 4 done: loss 0.0497 - lr: 0.000033
+ 2023-10-13 15:20:57,722 DEV : loss 0.1647169589996338 - f1-score (micro avg) 0.8109
+ 2023-10-13 15:20:57,752 saving best model
+ 2023-10-13 15:20:58,245 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:21:02,849 epoch 5 - iter 73/738 - loss 0.03178351 - time (sec): 4.59 - samples/sec: 3344.00 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-13 15:21:07,618 epoch 5 - iter 146/738 - loss 0.03011420 - time (sec): 9.36 - samples/sec: 3338.50 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-13 15:21:12,677 epoch 5 - iter 219/738 - loss 0.03759577 - time (sec): 14.42 - samples/sec: 3368.79 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-13 15:21:17,375 epoch 5 - iter 292/738 - loss 0.03522125 - time (sec): 19.12 - samples/sec: 3345.28 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-13 15:21:22,615 epoch 5 - iter 365/738 - loss 0.03386361 - time (sec): 24.36 - samples/sec: 3343.37 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-13 15:21:27,708 epoch 5 - iter 438/738 - loss 0.03481830 - time (sec): 29.45 - samples/sec: 3340.69 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 15:21:32,309 epoch 5 - iter 511/738 - loss 0.03493902 - time (sec): 34.05 - samples/sec: 3343.79 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-13 15:21:37,167 epoch 5 - iter 584/738 - loss 0.03490208 - time (sec): 38.91 - samples/sec: 3339.05 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-13 15:21:42,940 epoch 5 - iter 657/738 - loss 0.03505144 - time (sec): 44.68 - samples/sec: 3320.83 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 15:21:47,947 epoch 5 - iter 730/738 - loss 0.03538729 - time (sec): 49.69 - samples/sec: 3320.50 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-13 15:21:48,393 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:21:48,393 EPOCH 5 done: loss 0.0353 - lr: 0.000028
+ 2023-10-13 15:21:59,731 DEV : loss 0.16715505719184875 - f1-score (micro avg) 0.8129
+ 2023-10-13 15:21:59,762 saving best model
+ 2023-10-13 15:22:00,272 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:22:05,043 epoch 6 - iter 73/738 - loss 0.03487724 - time (sec): 4.76 - samples/sec: 3137.17 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 15:22:09,824 epoch 6 - iter 146/738 - loss 0.02962974 - time (sec): 9.54 - samples/sec: 3178.81 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-13 15:22:15,490 epoch 6 - iter 219/738 - loss 0.02757779 - time (sec): 15.21 - samples/sec: 3232.17 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 15:22:20,455 epoch 6 - iter 292/738 - loss 0.02874868 - time (sec): 20.17 - samples/sec: 3225.44 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-13 15:22:25,165 epoch 6 - iter 365/738 - loss 0.02763650 - time (sec): 24.88 - samples/sec: 3254.83 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 15:22:30,551 epoch 6 - iter 438/738 - loss 0.02686535 - time (sec): 30.27 - samples/sec: 3271.81 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-13 15:22:35,096 epoch 6 - iter 511/738 - loss 0.02587544 - time (sec): 34.82 - samples/sec: 3279.27 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-13 15:22:39,857 epoch 6 - iter 584/738 - loss 0.02553302 - time (sec): 39.58 - samples/sec: 3288.34 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 15:22:45,402 epoch 6 - iter 657/738 - loss 0.02568474 - time (sec): 45.12 - samples/sec: 3304.11 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-13 15:22:50,134 epoch 6 - iter 730/738 - loss 0.02547133 - time (sec): 49.85 - samples/sec: 3306.39 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 15:22:50,602 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:22:50,602 EPOCH 6 done: loss 0.0254 - lr: 0.000022
+ 2023-10-13 15:23:03,139 DEV : loss 0.19820576906204224 - f1-score (micro avg) 0.8153
+ 2023-10-13 15:23:03,177 saving best model
+ 2023-10-13 15:23:03,721 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:23:08,470 epoch 7 - iter 73/738 - loss 0.01207347 - time (sec): 4.75 - samples/sec: 3203.00 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-13 15:23:14,759 epoch 7 - iter 146/738 - loss 0.01791847 - time (sec): 11.04 - samples/sec: 3047.72 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 15:23:19,307 epoch 7 - iter 219/738 - loss 0.01769644 - time (sec): 15.58 - samples/sec: 3125.38 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-13 15:23:24,645 epoch 7 - iter 292/738 - loss 0.01842221 - time (sec): 20.92 - samples/sec: 3096.35 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 15:23:29,789 epoch 7 - iter 365/738 - loss 0.01783206 - time (sec): 26.07 - samples/sec: 3137.84 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-13 15:23:34,922 epoch 7 - iter 438/738 - loss 0.01767386 - time (sec): 31.20 - samples/sec: 3200.09 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-13 15:23:40,128 epoch 7 - iter 511/738 - loss 0.01695655 - time (sec): 36.40 - samples/sec: 3224.08 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 15:23:45,371 epoch 7 - iter 584/738 - loss 0.01651792 - time (sec): 41.65 - samples/sec: 3222.34 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-13 15:23:49,934 epoch 7 - iter 657/738 - loss 0.01655195 - time (sec): 46.21 - samples/sec: 3223.49 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 15:23:54,626 epoch 7 - iter 730/738 - loss 0.01607427 - time (sec): 50.90 - samples/sec: 3232.49 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-13 15:23:55,106 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:23:55,106 EPOCH 7 done: loss 0.0162 - lr: 0.000017
+ 2023-10-13 15:24:06,529 DEV : loss 0.20719152688980103 - f1-score (micro avg) 0.8166
+ 2023-10-13 15:24:06,573 saving best model
+ 2023-10-13 15:24:07,354 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:24:12,377 epoch 8 - iter 73/738 - loss 0.00864185 - time (sec): 5.02 - samples/sec: 3218.84 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 15:24:17,396 epoch 8 - iter 146/738 - loss 0.01007836 - time (sec): 10.04 - samples/sec: 3180.73 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-13 15:24:22,749 epoch 8 - iter 219/738 - loss 0.01079097 - time (sec): 15.39 - samples/sec: 3223.12 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 15:24:27,812 epoch 8 - iter 292/738 - loss 0.01209528 - time (sec): 20.46 - samples/sec: 3195.75 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-13 15:24:32,510 epoch 8 - iter 365/738 - loss 0.01368334 - time (sec): 25.15 - samples/sec: 3219.63 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-13 15:24:37,678 epoch 8 - iter 438/738 - loss 0.01409645 - time (sec): 30.32 - samples/sec: 3206.98 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 15:24:42,309 epoch 8 - iter 511/738 - loss 0.01413601 - time (sec): 34.95 - samples/sec: 3220.08 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-13 15:24:47,889 epoch 8 - iter 584/738 - loss 0.01464820 - time (sec): 40.53 - samples/sec: 3230.23 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 15:24:52,581 epoch 8 - iter 657/738 - loss 0.01388140 - time (sec): 45.22 - samples/sec: 3247.02 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-13 15:24:57,881 epoch 8 - iter 730/738 - loss 0.01314649 - time (sec): 50.52 - samples/sec: 3263.37 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 15:24:58,335 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:24:58,335 EPOCH 8 done: loss 0.0131 - lr: 0.000011
+ 2023-10-13 15:25:09,552 DEV : loss 0.22207467257976532 - f1-score (micro avg) 0.8239
+ 2023-10-13 15:25:09,587 saving best model
+ 2023-10-13 15:25:10,122 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:25:15,072 epoch 9 - iter 73/738 - loss 0.00622634 - time (sec): 4.95 - samples/sec: 3500.95 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-13 15:25:19,870 epoch 9 - iter 146/738 - loss 0.00678073 - time (sec): 9.75 - samples/sec: 3451.45 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 15:25:24,740 epoch 9 - iter 219/738 - loss 0.00883955 - time (sec): 14.62 - samples/sec: 3378.38 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-13 15:25:29,740 epoch 9 - iter 292/738 - loss 0.00808768 - time (sec): 19.62 - samples/sec: 3336.72 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-13 15:25:34,678 epoch 9 - iter 365/738 - loss 0.00752856 - time (sec): 24.55 - samples/sec: 3321.98 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 15:25:39,270 epoch 9 - iter 438/738 - loss 0.00836252 - time (sec): 29.15 - samples/sec: 3326.86 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-13 15:25:44,160 epoch 9 - iter 511/738 - loss 0.00810242 - time (sec): 34.04 - samples/sec: 3354.34 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 15:25:49,564 epoch 9 - iter 584/738 - loss 0.00765179 - time (sec): 39.44 - samples/sec: 3330.39 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-13 15:25:54,489 epoch 9 - iter 657/738 - loss 0.00751168 - time (sec): 44.36 - samples/sec: 3334.19 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 15:25:59,344 epoch 9 - iter 730/738 - loss 0.00780373 - time (sec): 49.22 - samples/sec: 3338.80 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-13 15:25:59,980 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:25:59,980 EPOCH 9 done: loss 0.0077 - lr: 0.000006
+ 2023-10-13 15:26:11,819 DEV : loss 0.215680792927742 - f1-score (micro avg) 0.8301
+ 2023-10-13 15:26:11,859 saving best model
+ 2023-10-13 15:26:12,442 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:26:17,570 epoch 10 - iter 73/738 - loss 0.00455941 - time (sec): 5.12 - samples/sec: 3146.40 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-13 15:26:23,698 epoch 10 - iter 146/738 - loss 0.00464036 - time (sec): 11.25 - samples/sec: 3147.77 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 15:26:28,944 epoch 10 - iter 219/738 - loss 0.00476684 - time (sec): 16.50 - samples/sec: 3115.40 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-13 15:26:33,584 epoch 10 - iter 292/738 - loss 0.00428671 - time (sec): 21.14 - samples/sec: 3160.08 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 15:26:38,315 epoch 10 - iter 365/738 - loss 0.00478654 - time (sec): 25.87 - samples/sec: 3178.75 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-13 15:26:43,078 epoch 10 - iter 438/738 - loss 0.00517556 - time (sec): 30.63 - samples/sec: 3179.66 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 15:26:48,428 epoch 10 - iter 511/738 - loss 0.00501566 - time (sec): 35.98 - samples/sec: 3193.01 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-13 15:26:54,233 epoch 10 - iter 584/738 - loss 0.00534061 - time (sec): 41.79 - samples/sec: 3134.82 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 15:26:59,099 epoch 10 - iter 657/738 - loss 0.00508094 - time (sec): 46.65 - samples/sec: 3150.16 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-13 15:27:04,541 epoch 10 - iter 730/738 - loss 0.00505925 - time (sec): 52.10 - samples/sec: 3167.31 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-13 15:27:04,973 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:27:04,974 EPOCH 10 done: loss 0.0050 - lr: 0.000000
+ 2023-10-13 15:27:16,194 DEV : loss 0.22398900985717773 - f1-score (micro avg) 0.8267
+ 2023-10-13 15:27:16,617 ----------------------------------------------------------------------------------------------------
+ 2023-10-13 15:27:16,619 Loading model from best epoch ...
+ 2023-10-13 15:27:18,212 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
+ 2023-10-13 15:27:24,197
+ Results:
+ - F-score (micro) 0.7931
+ - F-score (macro) 0.6926
+ - Accuracy 0.6833
+
+ By class:
+ precision recall f1-score support
+
+ loc 0.8555 0.8765 0.8659 858
+ pers 0.7491 0.8007 0.7741 537
+ org 0.5294 0.6136 0.5684 132
+ time 0.4789 0.6296 0.5440 54
+ prod 0.7167 0.7049 0.7107 61
+
+ micro avg 0.7714 0.8161 0.7931 1642
+ macro avg 0.6659 0.7251 0.6926 1642
+ weighted avg 0.7770 0.8161 0.7956 1642
+
+ 2023-10-13 15:27:24,197 ----------------------------------------------------------------------------------------------------
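As a quick sanity check on the final test results above, the micro F1 (0.7931) is the harmonic mean of the reported micro precision (0.7714) and recall (0.8161), and the macro F1 (0.6926) is the unweighted mean of the five per-class F1 scores:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro-averaged precision/recall from the "micro avg" row above.
micro_f1 = f1(0.7714, 0.8161)

# Per-class F1 scores: loc, pers, org, time, prod.
per_class_f1 = [0.8659, 0.7741, 0.5684, 0.5440, 0.7107]
macro_f1 = sum(per_class_f1) / len(per_class_f1)

print(round(micro_f1, 4))  # 0.7931
print(round(macro_f1, 4))  # 0.6926
```

Note that micro averaging weights every entity equally (so the frequent `loc` and `pers` classes dominate), while macro averaging weights every class equally, which is why the macro score is pulled down by the small `org` and `time` classes.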