Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- final-model.pt +3 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697142994.de2e83fddbee.1952.8 +3 -0
- test.tsv +0 -0
- training.log +260 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c0b9251a4d331a8e798893d9158910661642a6c80a8fabd1ed1f8854124660e
+size 870793839
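The three added lines are a Git LFS pointer rather than the model weights themselves. As a minimal sketch, assuming the pointer text has already been read into a string, the fields can be split out as follows (`parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of Git LFS or `huggingface_hub`):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4c0b9251a4d331a8e798893d9158910661642a6c80a8fabd1ed1f8854124660e
size 870793839
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size_bytes"])
```

The `size_bytes` value is the size of the real file stored in LFS, so it can be compared against a downloaded checkpoint before loading it.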
dev.tsv
ADDED
The diff for this file is too large to render.
final-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e0953c2a0bb664a68b2731c61fc393400878d55bbed660a7504e0bf220a2796c
+size 870793956
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      20:45:16   0.0001         1.0276      0.1668    0.4545         0.5362      0.4920  0.3480
+2      20:53:59   0.0001         0.1406      0.0920    0.7224         0.7624      0.7419  0.6138
+3      21:02:49   0.0001         0.0803      0.0899    0.7393         0.7828      0.7604  0.6349
+4      21:11:34   0.0001         0.0550      0.1102    0.7219         0.7839      0.7516  0.6232
+5      21:20:34   0.0001         0.0399      0.1232    0.7317         0.7681      0.7494  0.6178
+6      21:29:34   0.0001         0.0317      0.1524    0.7304         0.7602      0.7450  0.6148
+7      21:38:44   0.0001         0.0241      0.1624    0.7338         0.7828      0.7575  0.6337
+8      21:47:10   0.0000         0.0188      0.1798    0.7254         0.7771      0.7504  0.6223
+9      21:55:37   0.0000         0.0162      0.1869    0.7338         0.7828      0.7575  0.6308
+10     22:04:01   0.0000         0.0126      0.1963    0.7370         0.7828      0.7592  0.6337
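The table shows training loss falling every epoch while dev loss bottoms out at epoch 3, which is also the epoch with the highest DEV_F1. A minimal sketch for picking that epoch programmatically is given below; the rows are inlined as whitespace-separated text for self-containment, which is an assumption of this example (the real `loss.tsv` would more likely be read with `csv.DictReader(f, delimiter="\t")`).

```python
# Three columns copied from loss.tsv above; the remaining columns are omitted for brevity.
LOSS_ROWS = """\
EPOCH TRAIN_LOSS DEV_LOSS DEV_F1
1 1.0276 0.1668 0.4920
2 0.1406 0.0920 0.7419
3 0.0803 0.0899 0.7604
4 0.0550 0.1102 0.7516
5 0.0399 0.1232 0.7494
6 0.0317 0.1524 0.7450
7 0.0241 0.1624 0.7575
8 0.0188 0.1798 0.7504
9 0.0162 0.1869 0.7575
10 0.0126 0.1963 0.7592
"""

lines = LOSS_ROWS.strip().splitlines()
header = lines[0].split()
rows = [dict(zip(header, line.split())) for line in lines[1:]]

# Epoch with the highest dev F1 and epoch with the lowest dev loss.
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
best_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS:", best_loss["EPOCH"], best_loss["DEV_LOSS"])
```

Both criteria select epoch 3 here, which matches the last "saving best model" entry in training.log.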
runs/events.out.tfevents.1697142994.de2e83fddbee.1952.8
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:460f2c6ac758d35c24793245d0de9cc7d1c79cf1cd52373fd36a08df6e178bf9
+size 556612
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,260 @@
+2023-10-12 20:36:34,417 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,419 Model: "SequenceTagger(
+  (embeddings): ByT5Embeddings(
+    (model): T5EncoderModel(
+      (shared): Embedding(384, 1472)
+      (encoder): T5Stack(
+        (embed_tokens): Embedding(384, 1472)
+        (block): ModuleList(
+          (0): T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                  (relative_attention_bias): Embedding(32, 6)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+          (1-11): 11 x T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+        )
+        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=1472, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-12 20:36:34,420 ----------------------------------------------------------------------------------------------------
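The layer shapes printed in the model summary are enough for a rough parameter count. The sketch below is a back-of-the-envelope estimate, not output from the training run. It assumes that `shared` and `embed_tokens` are one tied weight matrix, that each `FusedRMSNorm` holds a single weight vector of size 1472, and that the checkpoint stores weights as 4-byte floats.

```python
# Shapes read off the SequenceTagger summary above.
D_MODEL, D_ATTN, D_FF = 1472, 384, 3584
N_BLOCKS, VOCAB, N_TAGS = 12, 384, 13  # block 0 plus "11 x T5Block"

attention = 4 * D_MODEL * D_ATTN      # q, k, v, o projections (no bias)
feed_forward = 3 * D_MODEL * D_FF     # wi_0, wi_1, wo (no bias)
norms = 2 * D_MODEL                   # two RMS norms per block (assumed weight-only)
block = attention + feed_forward + norms

encoder = (
    VOCAB * D_MODEL                   # embedding, assumed shared with embed_tokens
    + N_BLOCKS * block
    + 32 * 6                          # relative_attention_bias, present in block 0 only
    + D_MODEL                         # final_layer_norm
)
head = D_MODEL * N_TAGS + N_TAGS      # linear tag classifier with bias
total = encoder + head

print(f"{total:,} parameters, about {total * 4 / 1e6:.0f} MB as float32")
```

Under these assumptions the estimate lands within a fraction of a percent of the 870,793,839-byte `best-model.pt` recorded in the LFS pointer, which suggests the checkpoint is stored in float32.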
+2023-10-12 20:36:34,420 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+2023-10-12 20:36:34,420 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,420 Train: 7936 sentences
+2023-10-12 20:36:34,420 (train_with_dev=False, train_with_test=False)
+2023-10-12 20:36:34,421 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,421 Training Params:
+2023-10-12 20:36:34,421 - learning_rate: "0.00015"
+2023-10-12 20:36:34,421 - mini_batch_size: "8"
+2023-10-12 20:36:34,421 - max_epochs: "10"
+2023-10-12 20:36:34,421 - shuffle: "True"
+2023-10-12 20:36:34,421 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,421 Plugins:
+2023-10-12 20:36:34,421 - TensorboardLogger
+2023-10-12 20:36:34,421 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-12 20:36:34,422 ----------------------------------------------------------------------------------------------------
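The parameters above (peak learning rate 0.00015, batch size 8, 10 epochs, warmup fraction 0.1) imply 992 steps per epoch and a warmup that spans the first epoch. The following is a minimal sketch of a linear warmup followed by linear decay. The function `linear_warmup_lr` is a hypothetical stand-in written for this illustration and is not Flair's `LinearScheduler` implementation, whose exact step accounting may differ.

```python
def linear_warmup_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_fraction: float) -> float:
    """Linear ramp from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


STEPS_PER_EPOCH = 7936 // 8            # train sentences / mini_batch_size = 992
TOTAL_STEPS = STEPS_PER_EPOCH * 10     # max_epochs = 10

def lr_at(epoch: int, iteration: int) -> str:
    """Learning rate at a given (1-based) epoch and iteration, to six decimals."""
    step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    return f"{linear_warmup_lr(step, TOTAL_STEPS, 0.00015, 0.1):.6f}"

for epoch, iteration in [(1, 99), (1, 990), (2, 99), (4, 990), (10, 990)]:
    print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration)}")
```

At six decimals these values agree with the `lr:` fields in the per-iteration log entries for the same epochs and iterations.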
+2023-10-12 20:36:34,422 Final evaluation on model from best epoch (best-model.pt)
+2023-10-12 20:36:34,422 - metric: "('micro avg', 'f1-score')"
+2023-10-12 20:36:34,422 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,422 Computation:
+2023-10-12 20:36:34,422 - compute on device: cuda:0
+2023-10-12 20:36:34,422 - embedding storage: none
+2023-10-12 20:36:34,422 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,422 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
+2023-10-12 20:36:34,422 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,423 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:36:34,423 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-12 20:37:23,526 epoch 1 - iter 99/992 - loss 2.53940455 - time (sec): 49.10 - samples/sec: 327.17 - lr: 0.000015 - momentum: 0.000000
+2023-10-12 20:38:09,552 epoch 1 - iter 198/992 - loss 2.45209354 - time (sec): 95.13 - samples/sec: 334.01 - lr: 0.000030 - momentum: 0.000000
+2023-10-12 20:38:56,579 epoch 1 - iter 297/992 - loss 2.23771454 - time (sec): 142.15 - samples/sec: 342.90 - lr: 0.000045 - momentum: 0.000000
+2023-10-12 20:39:47,556 epoch 1 - iter 396/992 - loss 2.00240804 - time (sec): 193.13 - samples/sec: 337.89 - lr: 0.000060 - momentum: 0.000000
+2023-10-12 20:40:37,028 epoch 1 - iter 495/992 - loss 1.75078774 - time (sec): 242.60 - samples/sec: 338.44 - lr: 0.000075 - momentum: 0.000000
+2023-10-12 20:41:27,744 epoch 1 - iter 594/992 - loss 1.53136792 - time (sec): 293.32 - samples/sec: 335.30 - lr: 0.000090 - momentum: 0.000000
+2023-10-12 20:42:16,877 epoch 1 - iter 693/992 - loss 1.36980903 - time (sec): 342.45 - samples/sec: 332.96 - lr: 0.000105 - momentum: 0.000000
+2023-10-12 20:43:05,181 epoch 1 - iter 792/992 - loss 1.23246121 - time (sec): 390.76 - samples/sec: 334.01 - lr: 0.000120 - momentum: 0.000000
+2023-10-12 20:43:58,462 epoch 1 - iter 891/992 - loss 1.11904567 - time (sec): 444.04 - samples/sec: 332.10 - lr: 0.000135 - momentum: 0.000000
+2023-10-12 20:44:49,457 epoch 1 - iter 990/992 - loss 1.02887479 - time (sec): 495.03 - samples/sec: 330.68 - lr: 0.000150 - momentum: 0.000000
+2023-10-12 20:44:50,413 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:44:50,413 EPOCH 1 done: loss 1.0276 - lr: 0.000150
+2023-10-12 20:45:15,991 DEV : loss 0.16679786145687103 - f1-score (micro avg) 0.492
+2023-10-12 20:45:16,031 saving best model
+2023-10-12 20:45:16,981 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:46:06,496 epoch 2 - iter 99/992 - loss 0.17077119 - time (sec): 49.51 - samples/sec: 340.13 - lr: 0.000148 - momentum: 0.000000
+2023-10-12 20:46:56,296 epoch 2 - iter 198/992 - loss 0.16840803 - time (sec): 99.31 - samples/sec: 332.86 - lr: 0.000147 - momentum: 0.000000
+2023-10-12 20:47:44,810 epoch 2 - iter 297/992 - loss 0.16705195 - time (sec): 147.83 - samples/sec: 333.86 - lr: 0.000145 - momentum: 0.000000
+2023-10-12 20:48:32,058 epoch 2 - iter 396/992 - loss 0.15897276 - time (sec): 195.08 - samples/sec: 339.12 - lr: 0.000143 - momentum: 0.000000
+2023-10-12 20:49:21,939 epoch 2 - iter 495/992 - loss 0.15604786 - time (sec): 244.96 - samples/sec: 335.31 - lr: 0.000142 - momentum: 0.000000
+2023-10-12 20:50:12,969 epoch 2 - iter 594/992 - loss 0.15226274 - time (sec): 295.99 - samples/sec: 334.72 - lr: 0.000140 - momentum: 0.000000
+2023-10-12 20:51:01,723 epoch 2 - iter 693/992 - loss 0.14825625 - time (sec): 344.74 - samples/sec: 336.07 - lr: 0.000138 - momentum: 0.000000
+2023-10-12 20:51:55,514 epoch 2 - iter 792/992 - loss 0.14565453 - time (sec): 398.53 - samples/sec: 330.04 - lr: 0.000137 - momentum: 0.000000
+2023-10-12 20:52:44,252 epoch 2 - iter 891/992 - loss 0.14261297 - time (sec): 447.27 - samples/sec: 329.90 - lr: 0.000135 - momentum: 0.000000
+2023-10-12 20:53:32,986 epoch 2 - iter 990/992 - loss 0.14067570 - time (sec): 496.00 - samples/sec: 330.13 - lr: 0.000133 - momentum: 0.000000
+2023-10-12 20:53:33,941 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:53:33,942 EPOCH 2 done: loss 0.1406 - lr: 0.000133
+2023-10-12 20:53:59,736 DEV : loss 0.09202314913272858 - f1-score (micro avg) 0.7419
+2023-10-12 20:53:59,777 saving best model
+2023-10-12 20:54:02,426 ----------------------------------------------------------------------------------------------------
+2023-10-12 20:54:56,733 epoch 3 - iter 99/992 - loss 0.08481471 - time (sec): 54.29 - samples/sec: 316.06 - lr: 0.000132 - momentum: 0.000000
+2023-10-12 20:55:48,391 epoch 3 - iter 198/992 - loss 0.08839525 - time (sec): 105.95 - samples/sec: 313.11 - lr: 0.000130 - momentum: 0.000000
+2023-10-12 20:56:36,336 epoch 3 - iter 297/992 - loss 0.08581007 - time (sec): 153.89 - samples/sec: 317.80 - lr: 0.000128 - momentum: 0.000000
+2023-10-12 20:57:25,973 epoch 3 - iter 396/992 - loss 0.08230993 - time (sec): 203.53 - samples/sec: 317.82 - lr: 0.000127 - momentum: 0.000000
+2023-10-12 20:58:15,723 epoch 3 - iter 495/992 - loss 0.08191812 - time (sec): 253.28 - samples/sec: 321.05 - lr: 0.000125 - momentum: 0.000000
+2023-10-12 20:59:05,929 epoch 3 - iter 594/992 - loss 0.08257402 - time (sec): 303.49 - samples/sec: 321.37 - lr: 0.000123 - momentum: 0.000000
+2023-10-12 20:59:53,635 epoch 3 - iter 693/992 - loss 0.08166994 - time (sec): 351.19 - samples/sec: 322.98 - lr: 0.000122 - momentum: 0.000000
+2023-10-12 21:00:42,974 epoch 3 - iter 792/992 - loss 0.08101549 - time (sec): 400.53 - samples/sec: 324.25 - lr: 0.000120 - momentum: 0.000000
+2023-10-12 21:01:32,280 epoch 3 - iter 891/992 - loss 0.08026505 - time (sec): 449.84 - samples/sec: 325.56 - lr: 0.000118 - momentum: 0.000000
+2023-10-12 21:02:23,827 epoch 3 - iter 990/992 - loss 0.08033861 - time (sec): 501.38 - samples/sec: 326.64 - lr: 0.000117 - momentum: 0.000000
+2023-10-12 21:02:24,696 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:02:24,696 EPOCH 3 done: loss 0.0803 - lr: 0.000117
+2023-10-12 21:02:49,665 DEV : loss 0.08990765362977982 - f1-score (micro avg) 0.7604
+2023-10-12 21:02:49,705 saving best model
+2023-10-12 21:02:52,306 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:03:41,885 epoch 4 - iter 99/992 - loss 0.05709566 - time (sec): 49.57 - samples/sec: 351.31 - lr: 0.000115 - momentum: 0.000000
+2023-10-12 21:04:31,315 epoch 4 - iter 198/992 - loss 0.05384534 - time (sec): 99.00 - samples/sec: 335.44 - lr: 0.000113 - momentum: 0.000000
+2023-10-12 21:05:20,247 epoch 4 - iter 297/992 - loss 0.05406144 - time (sec): 147.94 - samples/sec: 334.00 - lr: 0.000112 - momentum: 0.000000
+2023-10-12 21:06:07,442 epoch 4 - iter 396/992 - loss 0.05302751 - time (sec): 195.13 - samples/sec: 336.83 - lr: 0.000110 - momentum: 0.000000
+2023-10-12 21:06:53,930 epoch 4 - iter 495/992 - loss 0.05258435 - time (sec): 241.62 - samples/sec: 339.91 - lr: 0.000108 - momentum: 0.000000
+2023-10-12 21:07:42,776 epoch 4 - iter 594/992 - loss 0.05291144 - time (sec): 290.47 - samples/sec: 337.89 - lr: 0.000107 - momentum: 0.000000
+2023-10-12 21:08:31,390 epoch 4 - iter 693/992 - loss 0.05440967 - time (sec): 339.08 - samples/sec: 335.36 - lr: 0.000105 - momentum: 0.000000
+2023-10-12 21:09:21,204 epoch 4 - iter 792/992 - loss 0.05467823 - time (sec): 388.89 - samples/sec: 335.88 - lr: 0.000103 - momentum: 0.000000
+2023-10-12 21:10:13,890 epoch 4 - iter 891/992 - loss 0.05499260 - time (sec): 441.58 - samples/sec: 332.35 - lr: 0.000102 - momentum: 0.000000
+2023-10-12 21:11:07,827 epoch 4 - iter 990/992 - loss 0.05480641 - time (sec): 495.52 - samples/sec: 330.33 - lr: 0.000100 - momentum: 0.000000
+2023-10-12 21:11:08,918 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:11:08,919 EPOCH 4 done: loss 0.0550 - lr: 0.000100
+2023-10-12 21:11:34,092 DEV : loss 0.11015438288450241 - f1-score (micro avg) 0.7516
+2023-10-12 21:11:34,132 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:12:29,220 epoch 5 - iter 99/992 - loss 0.03490950 - time (sec): 55.09 - samples/sec: 281.83 - lr: 0.000098 - momentum: 0.000000
+2023-10-12 21:13:19,557 epoch 5 - iter 198/992 - loss 0.03110268 - time (sec): 105.42 - samples/sec: 302.69 - lr: 0.000097 - momentum: 0.000000
+2023-10-12 21:14:08,872 epoch 5 - iter 297/992 - loss 0.03567964 - time (sec): 154.74 - samples/sec: 313.09 - lr: 0.000095 - momentum: 0.000000
+2023-10-12 21:15:00,906 epoch 5 - iter 396/992 - loss 0.03894388 - time (sec): 206.77 - samples/sec: 307.97 - lr: 0.000093 - momentum: 0.000000
+2023-10-12 21:15:50,989 epoch 5 - iter 495/992 - loss 0.03871853 - time (sec): 256.85 - samples/sec: 308.70 - lr: 0.000092 - momentum: 0.000000
+2023-10-12 21:16:42,127 epoch 5 - iter 594/992 - loss 0.03866059 - time (sec): 307.99 - samples/sec: 311.40 - lr: 0.000090 - momentum: 0.000000
+2023-10-12 21:17:33,881 epoch 5 - iter 693/992 - loss 0.03769547 - time (sec): 359.75 - samples/sec: 316.19 - lr: 0.000088 - momentum: 0.000000
+2023-10-12 21:18:22,487 epoch 5 - iter 792/992 - loss 0.03775340 - time (sec): 408.35 - samples/sec: 319.19 - lr: 0.000087 - momentum: 0.000000
+2023-10-12 21:19:12,969 epoch 5 - iter 891/992 - loss 0.03942820 - time (sec): 458.83 - samples/sec: 319.35 - lr: 0.000085 - momentum: 0.000000
+2023-10-12 21:20:05,611 epoch 5 - iter 990/992 - loss 0.03993968 - time (sec): 511.48 - samples/sec: 319.98 - lr: 0.000083 - momentum: 0.000000
+2023-10-12 21:20:06,684 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:20:06,685 EPOCH 5 done: loss 0.0399 - lr: 0.000083
+2023-10-12 21:20:34,573 DEV : loss 0.12319868057966232 - f1-score (micro avg) 0.7494
+2023-10-12 21:20:34,627 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:21:25,537 epoch 6 - iter 99/992 - loss 0.03811252 - time (sec): 50.91 - samples/sec: 319.23 - lr: 0.000082 - momentum: 0.000000
+2023-10-12 21:22:16,408 epoch 6 - iter 198/992 - loss 0.03228055 - time (sec): 101.78 - samples/sec: 316.51 - lr: 0.000080 - momentum: 0.000000
+2023-10-12 21:23:07,761 epoch 6 - iter 297/992 - loss 0.03228149 - time (sec): 153.13 - samples/sec: 318.71 - lr: 0.000078 - momentum: 0.000000
+2023-10-12 21:24:00,236 epoch 6 - iter 396/992 - loss 0.03210612 - time (sec): 205.61 - samples/sec: 318.41 - lr: 0.000077 - momentum: 0.000000
+2023-10-12 21:24:53,252 epoch 6 - iter 495/992 - loss 0.03193389 - time (sec): 258.62 - samples/sec: 317.54 - lr: 0.000075 - momentum: 0.000000
+2023-10-12 21:25:42,174 epoch 6 - iter 594/992 - loss 0.03222843 - time (sec): 307.54 - samples/sec: 318.60 - lr: 0.000073 - momentum: 0.000000
+2023-10-12 21:26:32,031 epoch 6 - iter 693/992 - loss 0.03237733 - time (sec): 357.40 - samples/sec: 322.22 - lr: 0.000072 - momentum: 0.000000
+2023-10-12 21:27:23,817 epoch 6 - iter 792/992 - loss 0.03084620 - time (sec): 409.19 - samples/sec: 321.91 - lr: 0.000070 - momentum: 0.000000
+2023-10-12 21:28:16,696 epoch 6 - iter 891/992 - loss 0.03172155 - time (sec): 462.07 - samples/sec: 321.02 - lr: 0.000068 - momentum: 0.000000
+2023-10-12 21:29:06,526 epoch 6 - iter 990/992 - loss 0.03169299 - time (sec): 511.90 - samples/sec: 319.93 - lr: 0.000067 - momentum: 0.000000
+2023-10-12 21:29:07,472 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:29:07,473 EPOCH 6 done: loss 0.0317 - lr: 0.000067
+2023-10-12 21:29:34,854 DEV : loss 0.1523526906967163 - f1-score (micro avg) 0.745
+2023-10-12 21:29:34,898 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:30:29,230 epoch 7 - iter 99/992 - loss 0.01836233 - time (sec): 54.33 - samples/sec: 297.39 - lr: 0.000065 - momentum: 0.000000
+2023-10-12 21:31:25,213 epoch 7 - iter 198/992 - loss 0.01888426 - time (sec): 110.31 - samples/sec: 298.53 - lr: 0.000063 - momentum: 0.000000
+2023-10-12 21:32:16,337 epoch 7 - iter 297/992 - loss 0.02103087 - time (sec): 161.44 - samples/sec: 303.30 - lr: 0.000062 - momentum: 0.000000
+2023-10-12 21:33:10,363 epoch 7 - iter 396/992 - loss 0.02296196 - time (sec): 215.46 - samples/sec: 303.71 - lr: 0.000060 - momentum: 0.000000
+2023-10-12 21:34:04,597 epoch 7 - iter 495/992 - loss 0.02381151 - time (sec): 269.70 - samples/sec: 302.28 - lr: 0.000058 - momentum: 0.000000
+2023-10-12 21:34:59,557 epoch 7 - iter 594/992 - loss 0.02454321 - time (sec): 324.66 - samples/sec: 302.98 - lr: 0.000057 - momentum: 0.000000
+2023-10-12 21:35:53,242 epoch 7 - iter 693/992 - loss 0.02362879 - time (sec): 378.34 - samples/sec: 303.19 - lr: 0.000055 - momentum: 0.000000
+2023-10-12 21:36:41,463 epoch 7 - iter 792/992 - loss 0.02358871 - time (sec): 426.56 - samples/sec: 307.73 - lr: 0.000053 - momentum: 0.000000
+2023-10-12 21:37:29,049 epoch 7 - iter 891/992 - loss 0.02327723 - time (sec): 474.15 - samples/sec: 310.66 - lr: 0.000052 - momentum: 0.000000
+2023-10-12 21:38:17,393 epoch 7 - iter 990/992 - loss 0.02414669 - time (sec): 522.49 - samples/sec: 313.30 - lr: 0.000050 - momentum: 0.000000
+2023-10-12 21:38:18,316 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:38:18,316 EPOCH 7 done: loss 0.0241 - lr: 0.000050
+2023-10-12 21:38:44,375 DEV : loss 0.16238392889499664 - f1-score (micro avg) 0.7575
+2023-10-12 21:38:44,416 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:39:32,311 epoch 8 - iter 99/992 - loss 0.02098412 - time (sec): 47.89 - samples/sec: 345.39 - lr: 0.000048 - momentum: 0.000000
+2023-10-12 21:40:21,125 epoch 8 - iter 198/992 - loss 0.02001400 - time (sec): 96.71 - samples/sec: 330.15 - lr: 0.000047 - momentum: 0.000000
+2023-10-12 21:41:11,175 epoch 8 - iter 297/992 - loss 0.01778574 - time (sec): 146.76 - samples/sec: 332.28 - lr: 0.000045 - momentum: 0.000000
+2023-10-12 21:42:00,933 epoch 8 - iter 396/992 - loss 0.01851136 - time (sec): 196.51 - samples/sec: 333.89 - lr: 0.000043 - momentum: 0.000000
+2023-10-12 21:42:47,636 epoch 8 - iter 495/992 - loss 0.01977844 - time (sec): 243.22 - samples/sec: 337.22 - lr: 0.000042 - momentum: 0.000000
+2023-10-12 21:43:35,853 epoch 8 - iter 594/992 - loss 0.01974787 - time (sec): 291.43 - samples/sec: 336.66 - lr: 0.000040 - momentum: 0.000000
+2023-10-12 21:44:21,954 epoch 8 - iter 693/992 - loss 0.01908591 - time (sec): 337.54 - samples/sec: 338.15 - lr: 0.000038 - momentum: 0.000000
+2023-10-12 21:45:09,680 epoch 8 - iter 792/992 - loss 0.01817565 - time (sec): 385.26 - samples/sec: 340.05 - lr: 0.000037 - momentum: 0.000000
+2023-10-12 21:45:56,435 epoch 8 - iter 891/992 - loss 0.01823549 - time (sec): 432.02 - samples/sec: 340.11 - lr: 0.000035 - momentum: 0.000000
+2023-10-12 21:46:44,381 epoch 8 - iter 990/992 - loss 0.01881597 - time (sec): 479.96 - samples/sec: 340.91 - lr: 0.000033 - momentum: 0.000000
+2023-10-12 21:46:45,354 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:46:45,354 EPOCH 8 done: loss 0.0188 - lr: 0.000033
+2023-10-12 21:47:10,756 DEV : loss 0.1798100620508194 - f1-score (micro avg) 0.7504
+2023-10-12 21:47:10,796 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:47:57,915 epoch 9 - iter 99/992 - loss 0.01330830 - time (sec): 47.12 - samples/sec: 328.12 - lr: 0.000032 - momentum: 0.000000
+2023-10-12 21:48:46,407 epoch 9 - iter 198/992 - loss 0.01224787 - time (sec): 95.61 - samples/sec: 322.93 - lr: 0.000030 - momentum: 0.000000
+2023-10-12 21:49:35,505 epoch 9 - iter 297/992 - loss 0.01382402 - time (sec): 144.71 - samples/sec: 326.64 - lr: 0.000028 - momentum: 0.000000
+2023-10-12 21:50:25,041 epoch 9 - iter 396/992 - loss 0.01437517 - time (sec): 194.24 - samples/sec: 331.02 - lr: 0.000027 - momentum: 0.000000
+2023-10-12 21:51:12,360 epoch 9 - iter 495/992 - loss 0.01368674 - time (sec): 241.56 - samples/sec: 334.81 - lr: 0.000025 - momentum: 0.000000
+2023-10-12 21:51:59,940 epoch 9 - iter 594/992 - loss 0.01453578 - time (sec): 289.14 - samples/sec: 340.98 - lr: 0.000023 - momentum: 0.000000
+2023-10-12 21:52:46,742 epoch 9 - iter 693/992 - loss 0.01553987 - time (sec): 335.94 - samples/sec: 343.65 - lr: 0.000022 - momentum: 0.000000
+2023-10-12 21:53:34,899 epoch 9 - iter 792/992 - loss 0.01569527 - time (sec): 384.10 - samples/sec: 343.96 - lr: 0.000020 - momentum: 0.000000
+2023-10-12 21:54:22,482 epoch 9 - iter 891/992 - loss 0.01579337 - time (sec): 431.68 - samples/sec: 344.20 - lr: 0.000018 - momentum: 0.000000
+2023-10-12 21:55:10,034 epoch 9 - iter 990/992 - loss 0.01620441 - time (sec): 479.24 - samples/sec: 341.45 - lr: 0.000017 - momentum: 0.000000
+2023-10-12 21:55:10,982 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:55:10,982 EPOCH 9 done: loss 0.0162 - lr: 0.000017
+2023-10-12 21:55:37,009 DEV : loss 0.18687152862548828 - f1-score (micro avg) 0.7575
+2023-10-12 21:55:37,055 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:56:24,933 epoch 10 - iter 99/992 - loss 0.00770149 - time (sec): 47.88 - samples/sec: 344.62 - lr: 0.000015 - momentum: 0.000000
+2023-10-12 21:57:12,835 epoch 10 - iter 198/992 - loss 0.00989841 - time (sec): 95.78 - samples/sec: 346.95 - lr: 0.000013 - momentum: 0.000000
+2023-10-12 21:58:00,731 epoch 10 - iter 297/992 - loss 0.01166225 - time (sec): 143.67 - samples/sec: 348.74 - lr: 0.000012 - momentum: 0.000000
+2023-10-12 21:58:46,948 epoch 10 - iter 396/992 - loss 0.01166889 - time (sec): 189.89 - samples/sec: 348.12 - lr: 0.000010 - momentum: 0.000000
+2023-10-12 21:59:35,210 epoch 10 - iter 495/992 - loss 0.01103474 - time (sec): 238.15 - samples/sec: 346.06 - lr: 0.000008 - momentum: 0.000000
+2023-10-12 22:00:23,313 epoch 10 - iter 594/992 - loss 0.01167473 - time (sec): 286.26 - samples/sec: 343.43 - lr: 0.000007 - momentum: 0.000000
+2023-10-12 22:01:12,909 epoch 10 - iter 693/992 - loss 0.01237909 - time (sec): 335.85 - samples/sec: 342.47 - lr: 0.000005 - momentum: 0.000000
+2023-10-12 22:01:59,272 epoch 10 - iter 792/992 - loss 0.01251453 - time (sec): 382.21 - samples/sec: 342.30 - lr: 0.000004 - momentum: 0.000000
+2023-10-12 22:02:46,892 epoch 10 - iter 891/992 - loss 0.01256245 - time (sec): 429.83 - samples/sec: 343.86 - lr: 0.000002 - momentum: 0.000000
+2023-10-12 22:03:34,464 epoch 10 - iter 990/992 - loss 0.01261074 - time (sec): 477.41 - samples/sec: 342.85 - lr: 0.000000 - momentum: 0.000000
+2023-10-12 22:03:35,396 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:03:35,397 EPOCH 10 done: loss 0.0126 - lr: 0.000000
+2023-10-12 22:04:01,331 DEV : loss 0.19634360074996948 - f1-score (micro avg) 0.7592
+2023-10-12 22:04:02,299 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:04:02,301 Loading model from best epoch ...
+2023-10-12 22:04:05,962 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-12 22:04:30,881
+Results:
+- F-score (micro) 0.7525
+- F-score (macro) 0.6889
+- Accuracy 0.6284
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.7875    0.8092    0.7982       655
+         PER     0.7255    0.8296    0.7741       223
+         ORG     0.4595    0.5354    0.4945       127
+
+   micro avg     0.7277    0.7791    0.7525      1005
+   macro avg     0.6575    0.7247    0.6889      1005
+weighted avg     0.7323    0.7791    0.7545      1005
+
+2023-10-12 22:04:30,881 ----------------------------------------------------------------------------------------------------
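The three averaged F1 rows in the final report can be recomputed from the per-class rows, which is a quick way to see why macro F1 (0.6889) sits well below micro F1 (0.7525): the weak ORG class counts as a full third of the macro average but contributes only 127 of the 1005 gold spans. The sketch below uses the rounded values printed in the log, so it reproduces the reported figures only to four decimals. The variable names are chosen for this illustration.

```python
# (f1-score, support) per class, copied from the "By class" table above.
per_class = {
    "LOC": (0.7982, 655),
    "PER": (0.7741, 223),
    "ORG": (0.4945, 127),
}

total_support = sum(support for _, support in per_class.values())

# Macro average: every class counts equally.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: classes weighted by their number of gold spans.
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total_support

# Micro F1: harmonic mean of the pooled (micro) precision and recall.
micro_p, micro_r = 0.7277, 0.7791
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

Final model selection in this run uses the micro-averaged F1, as stated in the "Final evaluation" entry near the top of the log.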