2024-08-06 03:39:40,361 INFO [trainer.py:870] (1/8) Training started
2024-08-06 03:39:40,362 INFO [trainer.py:889] (1/8) Device: cuda:1
2024-08-06 03:39:40,362 INFO [trainer.py:890] (1/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': 'main', 'icefall-git-sha1': '7d2e5f4-dirty', 'icefall-git-date': 'Tue Aug 6 02:59:12 2024', 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6865771', 'IP address': '0.104.195.107'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 1000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 03:39:40,362 INFO [trainer.py:892] (1/8) About to create model
2024-08-06 03:39:41,123 INFO [trainer.py:899] (1/8) Number of model parameters: 367386628
2024-08-06 03:39:41,939 INFO [trainer.py:914] (1/8) Using DDP
2024-08-06 03:39:44,000 INFO [datamodule.py:427] (1/8) About to get train cuts
2024-08-06 03:39:44,002 INFO [datamodule.py:434] (1/8) About to get dev cuts
2024-08-06 03:39:44,003 INFO [datamodule.py:292] (1/8) Disable SpecAugment
2024-08-06 03:39:44,003 INFO [datamodule.py:294] (1/8) About to create train dataset
2024-08-06 03:39:44,003 INFO [datamodule.py:323] (1/8) Using DynamicBucketingSampler
2024-08-06 03:39:44,618 INFO [datamodule.py:344] (1/8) About to create train dataloader
2024-08-06 03:39:44,618 INFO [datamodule.py:367] (1/8) About to create dev dataset
2024-08-06 03:39:44,948 INFO [datamodule.py:388] (1/8) About to create dev dataloader
2024-08-06 03:40:39,569 INFO [trainer.py:765] (1/8) Epoch 1, batch 100, train_loss[loss=4.211, ArTop10Accuracy=0.4941, over 14685.00 frames. ], tot_loss[loss=4.797, ArTop10Accuracy=0.3935, over 4777.19 frames. ], batch size: 61, lr: 2.25e-02
2024-08-06 03:41:16,921 INFO [trainer.py:765] (1/8) Epoch 1, batch 200, train_loss[loss=3.868, ArTop10Accuracy=0.5479, over 13852.00 frames. ], tot_loss[loss=4.306, ArTop10Accuracy=0.4751, over 7786.72 frames. ], batch size: 34, lr: 3.00e-02
2024-08-06 03:41:57,950 INFO [trainer.py:765] (1/8) Epoch 1, batch 300, train_loss[loss=3.971, ArTop10Accuracy=0.5225, over 14357.00 frames. ], tot_loss[loss=4.088, ArTop10Accuracy=0.5102, over 9419.77 frames. ], batch size: 44, lr: 3.00e-02
2024-08-06 03:42:33,079 INFO [trainer.py:765] (1/8) Epoch 1, batch 400, train_loss[loss=3.741, ArTop10Accuracy=0.568, over 10105.00 frames. ], tot_loss[loss=3.933, ArTop10Accuracy=0.5359, over 10340.30 frames. ], batch size: 14, lr: 3.00e-02
2024-08-06 03:43:11,270 INFO [trainer.py:765] (1/8) Epoch 1, batch 500, train_loss[loss=3.502, ArTop10Accuracy=0.5975, over 12171.00 frames. ], tot_loss[loss=3.821, ArTop10Accuracy=0.5547, over 10900.33 frames. ], batch size: 22, lr: 2.99e-02
2024-08-06 03:43:46,592 INFO [trainer.py:765] (1/8) Epoch 1, batch 600, train_loss[loss=3.568, ArTop10Accuracy=0.5968, over 11385.00 frames. ], tot_loss[loss=3.742, ArTop10Accuracy=0.5685, over 11432.22 frames. ], batch size: 18, lr: 2.99e-02
2024-08-06 03:44:27,898 INFO [trainer.py:765] (1/8) Epoch 1, batch 700, train_loss[loss=3.5, ArTop10Accuracy=0.6223, over 10264.00 frames. ], tot_loss[loss=3.682, ArTop10Accuracy=0.5792, over 11565.69 frames. ], batch size: 12, lr: 2.99e-02
2024-08-06 03:45:01,514 INFO [trainer.py:765] (1/8) Epoch 1, batch 800, train_loss[loss=3.455, ArTop10Accuracy=0.6257, over 10082.00 frames. ], tot_loss[loss=3.635, ArTop10Accuracy=0.5877, over 11680.27 frames. ], batch size: 12, lr: 2.98e-02
2024-08-06 03:45:32,557 INFO [trainer.py:765] (1/8) Epoch 1, batch 900, train_loss[loss=3.517, ArTop10Accuracy=0.6102, over 12948.00 frames. ], tot_loss[loss=3.581, ArTop10Accuracy=0.5978, over 11721.27 frames. ], batch size: 27, lr: 2.98e-02
2024-08-06 03:46:03,649 INFO [trainer.py:765] (1/8) Epoch 1, batch 1000, train_loss[loss=3.477, ArTop10Accuracy=0.619, over 12944.00 frames. ], tot_loss[loss=3.553, ArTop10Accuracy=0.603, over 11943.35 frames. ], batch size: 27, lr: 2.97e-02
2024-08-06 03:46:07,989 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 8.169e+01 1.565e+02 2.239e+02 3.485e+02 9.105e+03, threshold=4.478e+02, percent-clipped=0.0
2024-08-06 03:46:38,611 INFO [trainer.py:765] (1/8) Epoch 1, batch 1100, train_loss[loss=3.417, ArTop10Accuracy=0.6279, over 13577.00 frames. ], tot_loss[loss=3.53, ArTop10Accuracy=0.607, over 11979.20 frames. ], batch size: 34, lr: 2.96e-02
2024-08-06 03:47:08,745 INFO [trainer.py:765] (1/8) Epoch 1, batch 1200, train_loss[loss=3.569, ArTop10Accuracy=0.6013, over 11880.00 frames. ], tot_loss[loss=3.502, ArTop10Accuracy=0.6125, over 11919.95 frames. ], batch size: 97, lr: 2.96e-02
2024-08-06 03:47:33,462 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 03:48:38,676 INFO [trainer.py:765] (1/8) Epoch 2, batch 100, train_loss[loss=3.514, ArTop10Accuracy=0.6095, over 14636.00 frames. ], tot_loss[loss=3.453, ArTop10Accuracy=0.6225, over 4803.63 frames. ], batch size: 61, lr: 2.90e-02
2024-08-06 03:49:14,596 INFO [trainer.py:765] (1/8) Epoch 2, batch 200, train_loss[loss=3.284, ArTop10Accuracy=0.6556, over 13788.00 frames. ], tot_loss[loss=3.433, ArTop10Accuracy=0.6259, over 7786.74 frames. ], batch size: 34, lr: 2.89e-02
2024-08-06 03:49:56,519 INFO [trainer.py:765] (1/8) Epoch 2, batch 300, train_loss[loss=3.493, ArTop10Accuracy=0.6161, over 14222.00 frames. ], tot_loss[loss=3.42, ArTop10Accuracy=0.6286, over 9427.07 frames. ], batch size: 44, lr: 2.89e-02
2024-08-06 03:50:31,999 INFO [trainer.py:765] (1/8) Epoch 2, batch 400, train_loss[loss=3.347, ArTop10Accuracy=0.6448, over 10418.00 frames. ], tot_loss[loss=3.409, ArTop10Accuracy=0.631, over 10350.14 frames. ], batch size: 14, lr: 2.88e-02
2024-08-06 03:51:17,109 INFO [trainer.py:765] (1/8) Epoch 2, batch 500, train_loss[loss=3.394, ArTop10Accuracy=0.6314, over 12365.00 frames. ], tot_loss[loss=3.396, ArTop10Accuracy=0.6334, over 10903.72 frames. ], batch size: 22, lr: 2.87e-02
2024-08-06 03:51:53,203 INFO [trainer.py:765] (1/8) Epoch 2, batch 600, train_loss[loss=3.394, ArTop10Accuracy=0.6319, over 11559.00 frames. ], tot_loss[loss=3.387, ArTop10Accuracy=0.6348, over 11422.82 frames. ], batch size: 18, lr: 2.86e-02
2024-08-06 03:52:38,993 INFO [trainer.py:765] (1/8) Epoch 2, batch 700, train_loss[loss=3.332, ArTop10Accuracy=0.6517, over 10076.00 frames. ], tot_loss[loss=3.388, ArTop10Accuracy=0.6347, over 11562.03 frames. ], batch size: 12, lr: 2.85e-02
2024-08-06 03:52:47,090 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 03:52:56,023 INFO [trainer.py:811] (1/8) Epoch 2, validation: loss=3.327, ArTop10Accuracy=0.6492, over 1829298.00 frames. 
2024-08-06 03:52:56,024 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 28727MB
2024-08-06 03:52:56,541 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 8.181e+01 1.431e+02 1.849e+02 2.730e+02 2.344e+03, threshold=3.697e+02, percent-clipped=7.2
2024-08-06 03:53:21,881 INFO [trainer.py:765] (1/8) Epoch 2, batch 800, train_loss[loss=3.398, ArTop10Accuracy=0.6388, over 9367.00 frames. ], tot_loss[loss=3.383, ArTop10Accuracy=0.6358, over 11655.20 frames. ], batch size: 11, lr: 2.84e-02
2024-08-06 03:53:53,299 INFO [trainer.py:765] (1/8) Epoch 2, batch 900, train_loss[loss=3.391, ArTop10Accuracy=0.6392, over 12796.00 frames. ], tot_loss[loss=3.366, ArTop10Accuracy=0.639, over 11709.70 frames. ], batch size: 27, lr: 2.83e-02
2024-08-06 03:54:24,808 INFO [trainer.py:765] (1/8) Epoch 2, batch 1000, train_loss[loss=3.409, ArTop10Accuracy=0.6317, over 13004.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.64, over 11923.32 frames. ], batch size: 27, lr: 2.82e-02
2024-08-06 03:54:56,006 INFO [trainer.py:765] (1/8) Epoch 2, batch 1100, train_loss[loss=3.398, ArTop10Accuracy=0.6314, over 13842.00 frames. ], tot_loss[loss=3.36, ArTop10Accuracy=0.6402, over 11993.47 frames. ], batch size: 34, lr: 2.81e-02
2024-08-06 03:55:26,228 INFO [trainer.py:765] (1/8) Epoch 2, batch 1200, train_loss[loss=3.483, ArTop10Accuracy=0.6179, over 12353.00 frames. ], tot_loss[loss=3.348, ArTop10Accuracy=0.6423, over 11942.25 frames. ], batch size: 97, lr: 2.80e-02
2024-08-06 03:55:51,272 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 03:57:04,101 INFO [trainer.py:765] (1/8) Epoch 3, batch 100, train_loss[loss=3.333, ArTop10Accuracy=0.6494, over 14623.00 frames. ], tot_loss[loss=3.311, ArTop10Accuracy=0.651, over 4773.76 frames. ], batch size: 61, lr: 2.67e-02
2024-08-06 03:57:50,979 INFO [trainer.py:765] (1/8) Epoch 3, batch 200, train_loss[loss=3.328, ArTop10Accuracy=0.65, over 13756.00 frames. ], tot_loss[loss=3.288, ArTop10Accuracy=0.6547, over 7779.65 frames. ], batch size: 34, lr: 2.66e-02
2024-08-06 03:58:26,073 INFO [trainer.py:765] (1/8) Epoch 3, batch 300, train_loss[loss=3.245, ArTop10Accuracy=0.6634, over 14083.00 frames. ], tot_loss[loss=3.274, ArTop10Accuracy=0.6572, over 9407.73 frames. ], batch size: 44, lr: 2.64e-02
2024-08-06 03:59:11,252 INFO [trainer.py:765] (1/8) Epoch 3, batch 400, train_loss[loss=3.083, ArTop10Accuracy=0.6934, over 10398.00 frames. ], tot_loss[loss=3.265, ArTop10Accuracy=0.6589, over 10326.49 frames. ], batch size: 14, lr: 2.63e-02
2024-08-06 03:59:29,675 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 8.720e+01 1.461e+02 1.775e+02 2.344e+02 9.150e+02, threshold=3.550e+02, percent-clipped=5.2
2024-08-06 03:59:49,302 INFO [trainer.py:765] (1/8) Epoch 3, batch 500, train_loss[loss=3.104, ArTop10Accuracy=0.6816, over 12161.00 frames. ], tot_loss[loss=3.253, ArTop10Accuracy=0.6614, over 10898.01 frames. ], batch size: 22, lr: 2.62e-02
2024-08-06 04:00:35,094 INFO [trainer.py:765] (1/8) Epoch 3, batch 600, train_loss[loss=3.198, ArTop10Accuracy=0.6762, over 11433.00 frames. ], tot_loss[loss=3.237, ArTop10Accuracy=0.6643, over 11425.47 frames. ], batch size: 18, lr: 2.61e-02
2024-08-06 04:01:22,057 INFO [trainer.py:765] (1/8) Epoch 3, batch 700, train_loss[loss=3.226, ArTop10Accuracy=0.6768, over 10103.00 frames. ], tot_loss[loss=3.232, ArTop10Accuracy=0.6655, over 11560.08 frames. ], batch size: 12, lr: 2.60e-02
2024-08-06 04:01:56,268 INFO [trainer.py:765] (1/8) Epoch 3, batch 800, train_loss[loss=2.973, ArTop10Accuracy=0.7149, over 10194.00 frames. ], tot_loss[loss=3.225, ArTop10Accuracy=0.6668, over 11687.74 frames. ], batch size: 12, lr: 2.59e-02
2024-08-06 04:02:27,739 INFO [trainer.py:765] (1/8) Epoch 3, batch 900, train_loss[loss=3.063, ArTop10Accuracy=0.6936, over 12768.00 frames. ], tot_loss[loss=3.206, ArTop10Accuracy=0.6706, over 11752.15 frames. ], batch size: 27, lr: 2.57e-02
2024-08-06 04:02:59,282 INFO [trainer.py:765] (1/8) Epoch 3, batch 1000, train_loss[loss=3.204, ArTop10Accuracy=0.6693, over 12887.00 frames. ], tot_loss[loss=3.196, ArTop10Accuracy=0.6724, over 11927.65 frames. ], batch size: 27, lr: 2.56e-02
2024-08-06 04:03:30,941 INFO [trainer.py:765] (1/8) Epoch 3, batch 1100, train_loss[loss=3.202, ArTop10Accuracy=0.6681, over 13940.00 frames. ], tot_loss[loss=3.191, ArTop10Accuracy=0.6736, over 12000.62 frames. ], batch size: 34, lr: 2.55e-02
2024-08-06 04:04:01,311 INFO [trainer.py:765] (1/8) Epoch 3, batch 1200, train_loss[loss=3.286, ArTop10Accuracy=0.6576, over 12261.00 frames. ], tot_loss[loss=3.183, ArTop10Accuracy=0.6754, over 11938.78 frames. ], batch size: 98, lr: 2.54e-02
2024-08-06 04:04:26,857 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:05:43,369 INFO [trainer.py:765] (1/8) Epoch 4, batch 100, train_loss[loss=3.22, ArTop10Accuracy=0.6675, over 14405.00 frames. ], tot_loss[loss=3.141, ArTop10Accuracy=0.6839, over 4771.98 frames. ], batch size: 61, lr: 2.38e-02
2024-08-06 04:06:07,077 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 04:06:16,404 INFO [trainer.py:811] (1/8) Epoch 4, validation: loss=3.063, ArTop10Accuracy=0.7031, over 1829298.00 frames. 
2024-08-06 04:06:16,404 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29537MB
2024-08-06 04:06:16,746 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.091e+02 1.493e+02 1.709e+02 2.068e+02 7.969e+02, threshold=3.418e+02, percent-clipped=2.9
2024-08-06 04:06:31,825 INFO [trainer.py:765] (1/8) Epoch 4, batch 200, train_loss[loss=3.197, ArTop10Accuracy=0.682, over 13803.00 frames. ], tot_loss[loss=3.124, ArTop10Accuracy=0.6873, over 7792.86 frames. ], batch size: 34, lr: 2.37e-02
2024-08-06 04:07:18,544 INFO [trainer.py:765] (1/8) Epoch 4, batch 300, train_loss[loss=3.211, ArTop10Accuracy=0.6781, over 14502.00 frames. ], tot_loss[loss=3.117, ArTop10Accuracy=0.6887, over 9418.04 frames. ], batch size: 45, lr: 2.36e-02
2024-08-06 04:08:01,910 INFO [trainer.py:765] (1/8) Epoch 4, batch 400, train_loss[loss=3.034, ArTop10Accuracy=0.7056, over 11117.00 frames. ], tot_loss[loss=3.116, ArTop10Accuracy=0.6889, over 10328.79 frames. ], batch size: 15, lr: 2.34e-02
2024-08-06 04:08:45,344 INFO [trainer.py:765] (1/8) Epoch 4, batch 500, train_loss[loss=3.084, ArTop10Accuracy=0.6913, over 12164.00 frames. ], tot_loss[loss=3.109, ArTop10Accuracy=0.6898, over 10898.32 frames. ], batch size: 22, lr: 2.33e-02
2024-08-06 04:09:37,072 INFO [trainer.py:765] (1/8) Epoch 4, batch 600, train_loss[loss=3.048, ArTop10Accuracy=0.6966, over 11971.00 frames. ], tot_loss[loss=3.111, ArTop10Accuracy=0.6894, over 11420.59 frames. ], batch size: 19, lr: 2.32e-02
2024-08-06 04:10:13,501 INFO [trainer.py:765] (1/8) Epoch 4, batch 700, train_loss[loss=3.067, ArTop10Accuracy=0.7017, over 10131.00 frames. ], tot_loss[loss=3.112, ArTop10Accuracy=0.6892, over 11573.58 frames. ], batch size: 12, lr: 2.31e-02
2024-08-06 04:10:51,959 INFO [trainer.py:765] (1/8) Epoch 4, batch 800, train_loss[loss=2.975, ArTop10Accuracy=0.7175, over 10435.00 frames. ], tot_loss[loss=3.111, ArTop10Accuracy=0.6893, over 11665.25 frames. ], batch size: 12, lr: 2.30e-02
2024-08-06 04:11:23,330 INFO [trainer.py:765] (1/8) Epoch 4, batch 900, train_loss[loss=3.09, ArTop10Accuracy=0.6959, over 12982.00 frames. ], tot_loss[loss=3.101, ArTop10Accuracy=0.6913, over 11731.81 frames. ], batch size: 27, lr: 2.29e-02
2024-08-06 04:11:54,826 INFO [trainer.py:765] (1/8) Epoch 4, batch 1000, train_loss[loss=3.044, ArTop10Accuracy=0.7061, over 12973.00 frames. ], tot_loss[loss=3.099, ArTop10Accuracy=0.692, over 11933.40 frames. ], batch size: 27, lr: 2.28e-02
2024-08-06 04:12:25,960 INFO [trainer.py:765] (1/8) Epoch 4, batch 1100, train_loss[loss=3.108, ArTop10Accuracy=0.6931, over 13630.00 frames. ], tot_loss[loss=3.106, ArTop10Accuracy=0.6905, over 12001.05 frames. ], batch size: 34, lr: 2.26e-02
2024-08-06 04:12:48,545 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.106e+02 1.440e+02 1.608e+02 1.893e+02 7.925e+02, threshold=3.216e+02, percent-clipped=2.0
2024-08-06 04:12:58,827 INFO [trainer.py:765] (1/8) Epoch 4, batch 1200, train_loss[loss=3.158, ArTop10Accuracy=0.678, over 12305.00 frames. ], tot_loss[loss=3.108, ArTop10Accuracy=0.6903, over 11931.65 frames. ], batch size: 98, lr: 2.25e-02
2024-08-06 04:13:24,356 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:14:38,685 INFO [trainer.py:765] (1/8) Epoch 5, batch 100, train_loss[loss=3.184, ArTop10Accuracy=0.6818, over 14605.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7008, over 4778.12 frames. ], batch size: 61, lr: 2.10e-02
2024-08-06 04:15:26,826 INFO [trainer.py:765] (1/8) Epoch 5, batch 200, train_loss[loss=3.144, ArTop10Accuracy=0.6805, over 13768.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7013, over 7779.65 frames. ], batch size: 34, lr: 2.09e-02
2024-08-06 04:16:08,011 INFO [trainer.py:765] (1/8) Epoch 5, batch 300, train_loss[loss=2.999, ArTop10Accuracy=0.7113, over 14653.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7016, over 9413.83 frames. ], batch size: 45, lr: 2.08e-02
2024-08-06 04:16:53,134 INFO [trainer.py:765] (1/8) Epoch 5, batch 400, train_loss[loss=3.053, ArTop10Accuracy=0.7039, over 10298.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7016, over 10330.35 frames. ], batch size: 14, lr: 2.07e-02
2024-08-06 04:17:36,638 INFO [trainer.py:765] (1/8) Epoch 5, batch 500, train_loss[loss=3.051, ArTop10Accuracy=0.6982, over 12439.00 frames. ], tot_loss[loss=3.049, ArTop10Accuracy=0.702, over 10905.78 frames. ], batch size: 22, lr: 2.06e-02
2024-08-06 04:18:22,114 INFO [trainer.py:765] (1/8) Epoch 5, batch 600, train_loss[loss=3.094, ArTop10Accuracy=0.6999, over 11559.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7011, over 11426.15 frames. ], batch size: 18, lr: 2.05e-02
2024-08-06 04:19:17,033 INFO [trainer.py:765] (1/8) Epoch 5, batch 700, train_loss[loss=2.871, ArTop10Accuracy=0.7291, over 10005.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7004, over 11568.53 frames. ], batch size: 12, lr: 2.04e-02
2024-08-06 04:19:51,066 INFO [trainer.py:765] (1/8) Epoch 5, batch 800, train_loss[loss=3, ArTop10Accuracy=0.7126, over 10694.00 frames. ], tot_loss[loss=3.062, ArTop10Accuracy=0.6992, over 11693.41 frames. ], batch size: 13, lr: 2.03e-02
2024-08-06 04:20:18,214 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 04:20:27,476 INFO [trainer.py:811] (1/8) Epoch 5, validation: loss=2.998, ArTop10Accuracy=0.7157, over 1829298.00 frames. 
2024-08-06 04:20:27,476 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29914MB
2024-08-06 04:20:27,781 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.057e+02 1.385e+02 1.542e+02 1.759e+02 7.741e+02, threshold=3.083e+02, percent-clipped=0.7
2024-08-06 04:20:31,766 INFO [trainer.py:765] (1/8) Epoch 5, batch 900, train_loss[loss=3.024, ArTop10Accuracy=0.7109, over 13120.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7012, over 11731.61 frames. ], batch size: 27, lr: 2.02e-02
2024-08-06 04:21:03,306 INFO [trainer.py:765] (1/8) Epoch 5, batch 1000, train_loss[loss=3.029, ArTop10Accuracy=0.7119, over 12783.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7009, over 11936.20 frames. ], batch size: 27, lr: 2.01e-02
2024-08-06 04:21:34,451 INFO [trainer.py:765] (1/8) Epoch 5, batch 1100, train_loss[loss=2.999, ArTop10Accuracy=0.7113, over 13838.00 frames. ], tot_loss[loss=3.062, ArTop10Accuracy=0.6995, over 11994.41 frames. ], batch size: 34, lr: 2.00e-02
2024-08-06 04:22:04,752 INFO [trainer.py:765] (1/8) Epoch 5, batch 1200, train_loss[loss=3.164, ArTop10Accuracy=0.6829, over 11790.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7009, over 11929.62 frames. ], batch size: 98, lr: 1.99e-02
2024-08-06 04:22:30,194 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:23:46,282 INFO [trainer.py:765] (1/8) Epoch 6, batch 100, train_loss[loss=3.049, ArTop10Accuracy=0.7005, over 14328.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7089, over 4777.09 frames. ], batch size: 61, lr: 1.85e-02
2024-08-06 04:24:35,255 INFO [trainer.py:765] (1/8) Epoch 6, batch 200, train_loss[loss=2.914, ArTop10Accuracy=0.7266, over 13723.00 frames. ], tot_loss[loss=3.015, ArTop10Accuracy=0.7104, over 7776.05 frames. ], batch size: 34, lr: 1.84e-02
2024-08-06 04:25:16,677 INFO [trainer.py:765] (1/8) Epoch 6, batch 300, train_loss[loss=3.049, ArTop10Accuracy=0.7045, over 14424.00 frames. ], tot_loss[loss=3.009, ArTop10Accuracy=0.711, over 9393.06 frames. ], batch size: 44, lr: 1.83e-02
2024-08-06 04:26:08,924 INFO [trainer.py:765] (1/8) Epoch 6, batch 400, train_loss[loss=2.863, ArTop10Accuracy=0.7268, over 10029.00 frames. ], tot_loss[loss=3.005, ArTop10Accuracy=0.711, over 10322.51 frames. ], batch size: 14, lr: 1.83e-02
2024-08-06 04:26:51,486 INFO [trainer.py:765] (1/8) Epoch 6, batch 500, train_loss[loss=2.995, ArTop10Accuracy=0.7079, over 12478.00 frames. ], tot_loss[loss=3.002, ArTop10Accuracy=0.7113, over 10891.41 frames. ], batch size: 22, lr: 1.82e-02
2024-08-06 04:27:39,298 INFO [trainer.py:765] (1/8) Epoch 6, batch 600, train_loss[loss=2.917, ArTop10Accuracy=0.7149, over 11750.00 frames. ], tot_loss[loss=3.007, ArTop10Accuracy=0.7099, over 11447.28 frames. ], batch size: 18, lr: 1.81e-02
2024-08-06 04:27:46,369 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.054e+02 1.343e+02 1.474e+02 1.660e+02 8.574e+02, threshold=2.947e+02, percent-clipped=0.6
2024-08-06 04:28:33,240 INFO [trainer.py:765] (1/8) Epoch 6, batch 700, train_loss[loss=2.854, ArTop10Accuracy=0.7402, over 10171.00 frames. ], tot_loss[loss=3.014, ArTop10Accuracy=0.7086, over 11581.22 frames. ], batch size: 12, lr: 1.80e-02
2024-08-06 04:29:11,216 INFO [trainer.py:765] (1/8) Epoch 6, batch 800, train_loss[loss=3.12, ArTop10Accuracy=0.6847, over 10061.00 frames. ], tot_loss[loss=3.018, ArTop10Accuracy=0.7082, over 11684.35 frames. ], batch size: 12, lr: 1.79e-02
2024-08-06 04:29:42,751 INFO [trainer.py:765] (1/8) Epoch 6, batch 900, train_loss[loss=3.003, ArTop10Accuracy=0.7076, over 13161.00 frames. ], tot_loss[loss=3.015, ArTop10Accuracy=0.7085, over 11729.41 frames. ], batch size: 27, lr: 1.78e-02
2024-08-06 04:30:14,306 INFO [trainer.py:765] (1/8) Epoch 6, batch 1000, train_loss[loss=3.037, ArTop10Accuracy=0.6993, over 12855.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7076, over 11934.75 frames. ], batch size: 27, lr: 1.77e-02
2024-08-06 04:30:45,383 INFO [trainer.py:765] (1/8) Epoch 6, batch 1100, train_loss[loss=2.996, ArTop10Accuracy=0.7164, over 14108.00 frames. ], tot_loss[loss=3.032, ArTop10Accuracy=0.7057, over 12000.46 frames. ], batch size: 34, lr: 1.77e-02
2024-08-06 04:31:15,673 INFO [trainer.py:765] (1/8) Epoch 6, batch 1200, train_loss[loss=3.169, ArTop10Accuracy=0.6784, over 11880.00 frames. ], tot_loss[loss=3.029, ArTop10Accuracy=0.706, over 11943.58 frames. ], batch size: 99, lr: 1.76e-02
2024-08-06 04:31:40,439 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:32:52,405 INFO [trainer.py:765] (1/8) Epoch 7, batch 100, train_loss[loss=3.047, ArTop10Accuracy=0.702, over 14422.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7148, over 4779.19 frames. ], batch size: 61, lr: 1.64e-02
2024-08-06 04:33:38,223 INFO [trainer.py:765] (1/8) Epoch 7, batch 200, train_loss[loss=2.943, ArTop10Accuracy=0.7228, over 13888.00 frames. ], tot_loss[loss=2.985, ArTop10Accuracy=0.7155, over 7795.94 frames. ], batch size: 34, lr: 1.64e-02
2024-08-06 04:34:22,609 INFO [trainer.py:765] (1/8) Epoch 7, batch 300, train_loss[loss=2.972, ArTop10Accuracy=0.7223, over 14520.00 frames. ], tot_loss[loss=2.982, ArTop10Accuracy=0.716, over 9435.04 frames. ], batch size: 44, lr: 1.63e-02
2024-08-06 04:34:36,847 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 04:34:45,809 INFO [trainer.py:811] (1/8) Epoch 7, validation: loss=2.963, ArTop10Accuracy=0.7233, over 1829298.00 frames. 
2024-08-06 04:34:45,809 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29914MB
2024-08-06 04:34:46,124 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.009e+02 1.306e+02 1.435e+02 1.599e+02 8.689e+02, threshold=2.871e+02, percent-clipped=0.9
2024-08-06 04:35:17,146 INFO [trainer.py:765] (1/8) Epoch 7, batch 400, train_loss[loss=2.893, ArTop10Accuracy=0.739, over 10205.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7148, over 10348.81 frames. ], batch size: 14, lr: 1.62e-02
2024-08-06 04:36:01,710 INFO [trainer.py:765] (1/8) Epoch 7, batch 500, train_loss[loss=2.904, ArTop10Accuracy=0.7278, over 12303.00 frames. ], tot_loss[loss=2.981, ArTop10Accuracy=0.7158, over 10908.06 frames. ], batch size: 22, lr: 1.61e-02
2024-08-06 04:36:48,811 INFO [trainer.py:765] (1/8) Epoch 7, batch 600, train_loss[loss=2.902, ArTop10Accuracy=0.7295, over 11582.00 frames. ], tot_loss[loss=2.982, ArTop10Accuracy=0.7153, over 11445.31 frames. ], batch size: 18, lr: 1.61e-02
2024-08-06 04:37:34,800 INFO [trainer.py:765] (1/8) Epoch 7, batch 700, train_loss[loss=2.89, ArTop10Accuracy=0.7312, over 9354.00 frames. ], tot_loss[loss=2.993, ArTop10Accuracy=0.7132, over 11556.36 frames. ], batch size: 11, lr: 1.60e-02
2024-08-06 04:38:13,613 INFO [trainer.py:765] (1/8) Epoch 7, batch 800, train_loss[loss=2.881, ArTop10Accuracy=0.7282, over 9319.00 frames. ], tot_loss[loss=2.996, ArTop10Accuracy=0.7125, over 11684.94 frames. ], batch size: 11, lr: 1.59e-02
2024-08-06 04:38:45,110 INFO [trainer.py:765] (1/8) Epoch 7, batch 900, train_loss[loss=2.983, ArTop10Accuracy=0.7156, over 13025.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7146, over 11730.77 frames. ], batch size: 27, lr: 1.59e-02
2024-08-06 04:39:16,574 INFO [trainer.py:765] (1/8) Epoch 7, batch 1000, train_loss[loss=3.091, ArTop10Accuracy=0.6947, over 12891.00 frames. ], tot_loss[loss=2.993, ArTop10Accuracy=0.7132, over 11931.30 frames. ], batch size: 27, lr: 1.58e-02
2024-08-06 04:39:47,571 INFO [trainer.py:765] (1/8) Epoch 7, batch 1100, train_loss[loss=3.078, ArTop10Accuracy=0.6948, over 13585.00 frames. ], tot_loss[loss=2.999, ArTop10Accuracy=0.7124, over 11996.35 frames. ], batch size: 34, lr: 1.57e-02
2024-08-06 04:40:17,989 INFO [trainer.py:765] (1/8) Epoch 7, batch 1200, train_loss[loss=3.127, ArTop10Accuracy=0.6918, over 12528.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.7121, over 11948.80 frames. ], batch size: 99, lr: 1.57e-02
2024-08-06 04:40:43,222 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:41:37,492 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 9.816e+01 1.295e+02 1.411e+02 1.574e+02 4.953e+02, threshold=2.821e+02, percent-clipped=1.1
2024-08-06 04:41:58,371 INFO [trainer.py:765] (1/8) Epoch 8, batch 100, train_loss[loss=3.097, ArTop10Accuracy=0.6954, over 14820.00 frames. ], tot_loss[loss=2.969, ArTop10Accuracy=0.719, over 4786.78 frames. ], batch size: 61, lr: 1.47e-02
2024-08-06 04:42:44,986 INFO [trainer.py:765] (1/8) Epoch 8, batch 200, train_loss[loss=3.127, ArTop10Accuracy=0.6864, over 13833.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.719, over 7780.75 frames. ], batch size: 34, lr: 1.46e-02
2024-08-06 04:43:28,045 INFO [trainer.py:765] (1/8) Epoch 8, batch 300, train_loss[loss=2.93, ArTop10Accuracy=0.7206, over 14569.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7196, over 9393.57 frames. ], batch size: 44, lr: 1.46e-02
2024-08-06 04:44:14,462 INFO [trainer.py:765] (1/8) Epoch 8, batch 400, train_loss[loss=2.853, ArTop10Accuracy=0.7256, over 10176.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7198, over 10311.66 frames. ], batch size: 14, lr: 1.45e-02
2024-08-06 04:45:00,692 INFO [trainer.py:765] (1/8) Epoch 8, batch 500, train_loss[loss=2.91, ArTop10Accuracy=0.723, over 12279.00 frames. ], tot_loss[loss=2.957, ArTop10Accuracy=0.7206, over 10909.44 frames. ], batch size: 22, lr: 1.45e-02
2024-08-06 04:45:45,394 INFO [trainer.py:765] (1/8) Epoch 8, batch 600, train_loss[loss=2.965, ArTop10Accuracy=0.7214, over 11599.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7196, over 11431.46 frames. ], batch size: 18, lr: 1.44e-02
2024-08-06 04:46:34,038 INFO [trainer.py:765] (1/8) Epoch 8, batch 700, train_loss[loss=2.924, ArTop10Accuracy=0.7204, over 9422.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.7186, over 11568.95 frames. ], batch size: 11, lr: 1.43e-02
2024-08-06 04:47:10,208 INFO [trainer.py:765] (1/8) Epoch 8, batch 800, train_loss[loss=2.762, ArTop10Accuracy=0.7587, over 9953.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7178, over 11703.51 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 04:47:41,606 INFO [trainer.py:765] (1/8) Epoch 8, batch 900, train_loss[loss=2.975, ArTop10Accuracy=0.7147, over 12770.00 frames. ], tot_loss[loss=2.968, ArTop10Accuracy=0.7183, over 11749.70 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 04:48:13,032 INFO [trainer.py:765] (1/8) Epoch 8, batch 1000, train_loss[loss=2.895, ArTop10Accuracy=0.7247, over 13061.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.7172, over 11954.97 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 04:48:28,827 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 04:48:37,663 INFO [trainer.py:811] (1/8) Epoch 8, validation: loss=2.946, ArTop10Accuracy=0.7266, over 1829298.00 frames. 
2024-08-06 04:48:37,664 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29914MB
2024-08-06 04:48:37,951 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.289e+02 1.393e+02 1.532e+02 3.557e+02, threshold=2.786e+02, percent-clipped=0.2
2024-08-06 04:48:52,931 INFO [trainer.py:765] (1/8) Epoch 8, batch 1100, train_loss[loss=3.083, ArTop10Accuracy=0.7016, over 13744.00 frames. ], tot_loss[loss=2.977, ArTop10Accuracy=0.7163, over 11994.55 frames. ], batch size: 34, lr: 1.41e-02
2024-08-06 04:49:23,202 INFO [trainer.py:765] (1/8) Epoch 8, batch 1200, train_loss[loss=3.142, ArTop10Accuracy=0.6871, over 11796.00 frames. ], tot_loss[loss=2.98, ArTop10Accuracy=0.7156, over 11937.07 frames. ], batch size: 98, lr: 1.40e-02
2024-08-06 04:49:49,198 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:51:01,547 INFO [trainer.py:765] (1/8) Epoch 9, batch 100, train_loss[loss=3.043, ArTop10Accuracy=0.7075, over 14747.00 frames. ], tot_loss[loss=2.952, ArTop10Accuracy=0.7225, over 4779.52 frames. ], batch size: 61, lr: 1.32e-02
2024-08-06 04:51:45,414 INFO [trainer.py:765] (1/8) Epoch 9, batch 200, train_loss[loss=2.9, ArTop10Accuracy=0.7339, over 13749.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7247, over 7784.70 frames. ], batch size: 34, lr: 1.32e-02
2024-08-06 04:52:29,082 INFO [trainer.py:765] (1/8) Epoch 9, batch 300, train_loss[loss=2.947, ArTop10Accuracy=0.7237, over 14498.00 frames. ], tot_loss[loss=2.934, ArTop10Accuracy=0.7258, over 9413.47 frames. ], batch size: 44, lr: 1.31e-02
2024-08-06 04:53:16,431 INFO [trainer.py:765] (1/8) Epoch 9, batch 400, train_loss[loss=2.898, ArTop10Accuracy=0.7289, over 10396.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.7248, over 10343.48 frames. ], batch size: 14, lr: 1.31e-02
2024-08-06 04:53:58,144 INFO [trainer.py:765] (1/8) Epoch 9, batch 500, train_loss[loss=3.009, ArTop10Accuracy=0.7046, over 12482.00 frames. ], tot_loss[loss=2.936, ArTop10Accuracy=0.7248, over 10908.31 frames. ], batch size: 22, lr: 1.30e-02
2024-08-06 04:54:51,077 INFO [trainer.py:765] (1/8) Epoch 9, batch 600, train_loss[loss=2.88, ArTop10Accuracy=0.7351, over 11555.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7239, over 11443.87 frames. ], batch size: 18, lr: 1.30e-02
2024-08-06 04:55:34,399 INFO [trainer.py:765] (1/8) Epoch 9, batch 700, train_loss[loss=2.952, ArTop10Accuracy=0.7199, over 9217.00 frames. ], tot_loss[loss=2.948, ArTop10Accuracy=0.7221, over 11577.60 frames. ], batch size: 11, lr: 1.29e-02
2024-08-06 04:56:04,575 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.257e+02 1.367e+02 1.507e+02 8.820e+02, threshold=2.735e+02, percent-clipped=0.5
2024-08-06 04:56:13,597 INFO [trainer.py:765] (1/8) Epoch 9, batch 800, train_loss[loss=2.97, ArTop10Accuracy=0.7133, over 9305.00 frames. ], tot_loss[loss=2.947, ArTop10Accuracy=0.7222, over 11673.87 frames. ], batch size: 11, lr: 1.29e-02
2024-08-06 04:56:44,975 INFO [trainer.py:765] (1/8) Epoch 9, batch 900, train_loss[loss=2.84, ArTop10Accuracy=0.7408, over 13126.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.7233, over 11726.70 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 04:57:16,491 INFO [trainer.py:765] (1/8) Epoch 9, batch 1000, train_loss[loss=2.927, ArTop10Accuracy=0.7269, over 12842.00 frames. ], tot_loss[loss=2.95, ArTop10Accuracy=0.7215, over 11924.86 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 04:57:47,656 INFO [trainer.py:765] (1/8) Epoch 9, batch 1100, train_loss[loss=2.928, ArTop10Accuracy=0.7262, over 13847.00 frames. ], tot_loss[loss=2.96, ArTop10Accuracy=0.7201, over 11978.23 frames. ], batch size: 34, lr: 1.27e-02
2024-08-06 04:58:18,093 INFO [trainer.py:765] (1/8) Epoch 9, batch 1200, train_loss[loss=3.087, ArTop10Accuracy=0.7016, over 13113.00 frames. ], tot_loss[loss=2.958, ArTop10Accuracy=0.7204, over 11958.80 frames. ], batch size: 97, lr: 1.27e-02
2024-08-06 04:58:43,245 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 04:59:52,748 INFO [trainer.py:765] (1/8) Epoch 10, batch 100, train_loss[loss=3.024, ArTop10Accuracy=0.7094, over 14484.00 frames. ], tot_loss[loss=2.922, ArTop10Accuracy=0.7288, over 4805.25 frames. ], batch size: 61, lr: 1.20e-02
2024-08-06 05:00:43,730 INFO [trainer.py:765] (1/8) Epoch 10, batch 200, train_loss[loss=2.825, ArTop10Accuracy=0.7417, over 13692.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7285, over 7805.84 frames. ], batch size: 34, lr: 1.20e-02
2024-08-06 05:01:20,591 INFO [trainer.py:765] (1/8) Epoch 10, batch 300, train_loss[loss=2.978, ArTop10Accuracy=0.7133, over 14424.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7282, over 9437.15 frames. ], batch size: 44, lr: 1.19e-02
2024-08-06 05:02:10,047 INFO [trainer.py:765] (1/8) Epoch 10, batch 400, train_loss[loss=2.874, ArTop10Accuracy=0.7338, over 10806.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7281, over 10347.74 frames. ], batch size: 15, lr: 1.19e-02
2024-08-06 05:02:46,487 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 05:02:55,378 INFO [trainer.py:811] (1/8) Epoch 10, validation: loss=2.927, ArTop10Accuracy=0.7304, over 1829298.00 frames. 
2024-08-06 05:02:55,379 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 05:02:55,728 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.023e+02 1.269e+02 1.367e+02 1.518e+02 4.405e+02, threshold=2.733e+02, percent-clipped=0.4
2024-08-06 05:02:58,361 INFO [trainer.py:765] (1/8) Epoch 10, batch 500, train_loss[loss=2.849, ArTop10Accuracy=0.7416, over 12035.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7277, over 10890.25 frames. ], batch size: 22, lr: 1.19e-02
2024-08-06 05:03:48,229 INFO [trainer.py:765] (1/8) Epoch 10, batch 600, train_loss[loss=2.751, ArTop10Accuracy=0.7489, over 11510.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7265, over 11425.54 frames. ], batch size: 18, lr: 1.18e-02
2024-08-06 05:04:36,715 INFO [trainer.py:765] (1/8) Epoch 10, batch 700, train_loss[loss=2.867, ArTop10Accuracy=0.7277, over 10102.00 frames. ], tot_loss[loss=2.932, ArTop10Accuracy=0.7254, over 11561.23 frames. ], batch size: 12, lr: 1.18e-02
2024-08-06 05:05:10,725 INFO [trainer.py:765] (1/8) Epoch 10, batch 800, train_loss[loss=2.75, ArTop10Accuracy=0.7562, over 10174.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.7242, over 11665.97 frames. ], batch size: 12, lr: 1.17e-02
2024-08-06 05:05:42,245 INFO [trainer.py:765] (1/8) Epoch 10, batch 900, train_loss[loss=2.795, ArTop10Accuracy=0.7538, over 13100.00 frames. ], tot_loss[loss=2.93, ArTop10Accuracy=0.7254, over 11717.16 frames. ], batch size: 27, lr: 1.17e-02
2024-08-06 05:06:13,844 INFO [trainer.py:765] (1/8) Epoch 10, batch 1000, train_loss[loss=2.856, ArTop10Accuracy=0.7444, over 13035.00 frames. ], tot_loss[loss=2.933, ArTop10Accuracy=0.7251, over 11937.94 frames. ], batch size: 27, lr: 1.16e-02
2024-08-06 05:06:45,056 INFO [trainer.py:765] (1/8) Epoch 10, batch 1100, train_loss[loss=3.035, ArTop10Accuracy=0.6976, over 13713.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.724, over 12000.95 frames. ], batch size: 34, lr: 1.16e-02
2024-08-06 05:07:15,484 INFO [trainer.py:765] (1/8) Epoch 10, batch 1200, train_loss[loss=3.091, ArTop10Accuracy=0.688, over 11661.00 frames. ], tot_loss[loss=2.944, ArTop10Accuracy=0.723, over 11942.60 frames. ], batch size: 98, lr: 1.16e-02
2024-08-06 05:07:40,804 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 05:08:52,966 INFO [trainer.py:765] (1/8) Epoch 11, batch 100, train_loss[loss=2.952, ArTop10Accuracy=0.7234, over 14658.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7313, over 4795.02 frames. ], batch size: 61, lr: 1.10e-02
2024-08-06 05:09:41,277 INFO [trainer.py:765] (1/8) Epoch 11, batch 200, train_loss[loss=2.837, ArTop10Accuracy=0.7443, over 13437.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7308, over 7799.60 frames. ], batch size: 34, lr: 1.10e-02
2024-08-06 05:09:51,176 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.001e+02 1.278e+02 1.371e+02 1.502e+02 3.785e+02, threshold=2.743e+02, percent-clipped=0.3
2024-08-06 05:10:24,720 INFO [trainer.py:765] (1/8) Epoch 11, batch 300, train_loss[loss=2.882, ArTop10Accuracy=0.7337, over 14214.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7312, over 9428.44 frames. ], batch size: 44, lr: 1.09e-02
2024-08-06 05:11:11,784 INFO [trainer.py:765] (1/8) Epoch 11, batch 400, train_loss[loss=2.776, ArTop10Accuracy=0.7525, over 10812.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7311, over 10349.47 frames. ], batch size: 15, lr: 1.09e-02
2024-08-06 05:11:52,691 INFO [trainer.py:765] (1/8) Epoch 11, batch 500, train_loss[loss=2.832, ArTop10Accuracy=0.7447, over 12256.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.7312, over 10905.08 frames. ], batch size: 22, lr: 1.09e-02
2024-08-06 05:12:40,287 INFO [trainer.py:765] (1/8) Epoch 11, batch 600, train_loss[loss=2.758, ArTop10Accuracy=0.7534, over 11557.00 frames. ], tot_loss[loss=2.91, ArTop10Accuracy=0.7301, over 11431.45 frames. ], batch size: 18, lr: 1.08e-02
2024-08-06 05:13:25,708 INFO [trainer.py:765] (1/8) Epoch 11, batch 700, train_loss[loss=2.842, ArTop10Accuracy=0.7449, over 10276.00 frames. ], tot_loss[loss=2.917, ArTop10Accuracy=0.7284, over 11603.02 frames. ], batch size: 12, lr: 1.08e-02
2024-08-06 05:14:04,205 INFO [trainer.py:765] (1/8) Epoch 11, batch 800, train_loss[loss=2.886, ArTop10Accuracy=0.7415, over 10103.00 frames. ], tot_loss[loss=2.92, ArTop10Accuracy=0.7281, over 11708.31 frames. ], batch size: 12, lr: 1.07e-02
2024-08-06 05:14:35,667 INFO [trainer.py:765] (1/8) Epoch 11, batch 900, train_loss[loss=2.983, ArTop10Accuracy=0.7167, over 12986.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7294, over 11763.77 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 05:15:07,263 INFO [trainer.py:765] (1/8) Epoch 11, batch 1000, train_loss[loss=2.839, ArTop10Accuracy=0.7413, over 12999.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7289, over 11951.15 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 05:15:38,260 INFO [trainer.py:765] (1/8) Epoch 11, batch 1100, train_loss[loss=3, ArTop10Accuracy=0.7166, over 13743.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7277, over 11998.49 frames. ], batch size: 34, lr: 1.06e-02
2024-08-06 05:16:08,498 INFO [trainer.py:765] (1/8) Epoch 11, batch 1200, train_loss[loss=3.05, ArTop10Accuracy=0.7029, over 11881.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7275, over 11939.46 frames. ], batch size: 99, lr: 1.06e-02
2024-08-06 05:16:12,697 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 05:16:21,623 INFO [trainer.py:811] (1/8) Epoch 11, validation: loss=2.923, ArTop10Accuracy=0.7318, over 1829298.00 frames. 
2024-08-06 05:16:21,623 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 05:16:21,949 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.076e+02 1.268e+02 1.368e+02 1.481e+02 4.790e+02, threshold=2.736e+02, percent-clipped=0.6
2024-08-06 05:16:42,524 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 05:18:03,004 INFO [trainer.py:765] (1/8) Epoch 12, batch 100, train_loss[loss=2.948, ArTop10Accuracy=0.7227, over 14766.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7349, over 4803.10 frames. ], batch size: 61, lr: 1.01e-02
2024-08-06 05:18:46,003 INFO [trainer.py:765] (1/8) Epoch 12, batch 200, train_loss[loss=2.996, ArTop10Accuracy=0.7162, over 13847.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.7347, over 7795.71 frames. ], batch size: 34, lr: 1.01e-02
2024-08-06 05:19:31,946 INFO [trainer.py:765] (1/8) Epoch 12, batch 300, train_loss[loss=2.957, ArTop10Accuracy=0.7282, over 14435.00 frames. ], tot_loss[loss=2.889, ArTop10Accuracy=0.7351, over 9411.90 frames. ], batch size: 44, lr: 1.01e-02
2024-08-06 05:20:12,430 INFO [trainer.py:765] (1/8) Epoch 12, batch 400, train_loss[loss=2.886, ArTop10Accuracy=0.7416, over 10391.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7348, over 10326.32 frames. ], batch size: 14, lr: 1.00e-02
2024-08-06 05:21:00,639 INFO [trainer.py:765] (1/8) Epoch 12, batch 500, train_loss[loss=2.952, ArTop10Accuracy=0.7296, over 12311.00 frames. ], tot_loss[loss=2.886, ArTop10Accuracy=0.7353, over 10893.08 frames. ], batch size: 22, lr: 9.99e-03
2024-08-06 05:21:43,915 INFO [trainer.py:765] (1/8) Epoch 12, batch 600, train_loss[loss=2.877, ArTop10Accuracy=0.7314, over 11609.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.734, over 11424.95 frames. ], batch size: 18, lr: 9.96e-03
2024-08-06 05:22:32,205 INFO [trainer.py:765] (1/8) Epoch 12, batch 700, train_loss[loss=2.803, ArTop10Accuracy=0.7459, over 9975.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7333, over 11573.77 frames. ], batch size: 12, lr: 9.93e-03
2024-08-06 05:23:08,911 INFO [trainer.py:765] (1/8) Epoch 12, batch 800, train_loss[loss=2.761, ArTop10Accuracy=0.7474, over 10005.00 frames. ], tot_loss[loss=2.9, ArTop10Accuracy=0.732, over 11684.31 frames. ], batch size: 12, lr: 9.90e-03
2024-08-06 05:23:40,459 INFO [trainer.py:765] (1/8) Epoch 12, batch 900, train_loss[loss=2.892, ArTop10Accuracy=0.7327, over 13289.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7331, over 11737.04 frames. ], batch size: 27, lr: 9.87e-03
2024-08-06 05:23:54,575 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.067e+02 1.273e+02 1.376e+02 1.503e+02 4.050e+02, threshold=2.752e+02, percent-clipped=0.4
2024-08-06 05:24:14,344 INFO [trainer.py:765] (1/8) Epoch 12, batch 1000, train_loss[loss=2.82, ArTop10Accuracy=0.7526, over 12919.00 frames. ], tot_loss[loss=2.904, ArTop10Accuracy=0.7313, over 11937.90 frames. ], batch size: 27, lr: 9.84e-03
2024-08-06 05:24:45,501 INFO [trainer.py:765] (1/8) Epoch 12, batch 1100, train_loss[loss=2.954, ArTop10Accuracy=0.7211, over 13661.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7302, over 11999.56 frames. ], batch size: 34, lr: 9.81e-03
2024-08-06 05:25:15,881 INFO [trainer.py:765] (1/8) Epoch 12, batch 1200, train_loss[loss=3.106, ArTop10Accuracy=0.6906, over 11901.00 frames. ], tot_loss[loss=2.913, ArTop10Accuracy=0.7293, over 11945.42 frames. ], batch size: 97, lr: 9.78e-03
2024-08-06 05:25:41,043 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 05:26:46,787 INFO [trainer.py:765] (1/8) Epoch 13, batch 100, train_loss[loss=2.906, ArTop10Accuracy=0.731, over 14602.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7385, over 4787.21 frames. ], batch size: 61, lr: 9.36e-03
2024-08-06 05:27:32,553 INFO [trainer.py:765] (1/8) Epoch 13, batch 200, train_loss[loss=2.885, ArTop10Accuracy=0.7358, over 13786.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7384, over 7785.28 frames. ], batch size: 34, lr: 9.34e-03
2024-08-06 05:28:16,036 INFO [trainer.py:765] (1/8) Epoch 13, batch 300, train_loss[loss=2.867, ArTop10Accuracy=0.7418, over 14393.00 frames. ], tot_loss[loss=2.869, ArTop10Accuracy=0.7391, over 9414.48 frames. ], batch size: 44, lr: 9.31e-03
2024-08-06 05:29:00,149 INFO [trainer.py:765] (1/8) Epoch 13, batch 400, train_loss[loss=2.824, ArTop10Accuracy=0.7474, over 10475.00 frames. ], tot_loss[loss=2.868, ArTop10Accuracy=0.7393, over 10313.15 frames. ], batch size: 14, lr: 9.28e-03
2024-08-06 05:29:43,967 INFO [trainer.py:765] (1/8) Epoch 13, batch 500, train_loss[loss=2.729, ArTop10Accuracy=0.7527, over 12343.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7396, over 10873.10 frames. ], batch size: 22, lr: 9.26e-03
2024-08-06 05:30:24,248 INFO [trainer.py:765] (1/8) Epoch 13, batch 600, train_loss[loss=2.75, ArTop10Accuracy=0.7631, over 11456.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7379, over 11414.87 frames. ], batch size: 18, lr: 9.23e-03
2024-08-06 05:30:58,110 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 05:31:07,054 INFO [trainer.py:811] (1/8) Epoch 13, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 
2024-08-06 05:31:07,054 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 05:31:07,351 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.049e+02 1.283e+02 1.389e+02 1.496e+02 2.729e+02, threshold=2.779e+02, percent-clipped=0.0
2024-08-06 05:31:24,042 INFO [trainer.py:765] (1/8) Epoch 13, batch 700, train_loss[loss=2.841, ArTop10Accuracy=0.7457, over 10367.00 frames. ], tot_loss[loss=2.882, ArTop10Accuracy=0.7353, over 11552.22 frames. ], batch size: 12, lr: 9.20e-03
2024-08-06 05:32:00,146 INFO [trainer.py:765] (1/8) Epoch 13, batch 800, train_loss[loss=2.727, ArTop10Accuracy=0.7702, over 10091.00 frames. ], tot_loss[loss=2.885, ArTop10Accuracy=0.7351, over 11685.22 frames. ], batch size: 12, lr: 9.18e-03
2024-08-06 05:32:31,520 INFO [trainer.py:765] (1/8) Epoch 13, batch 900, train_loss[loss=2.896, ArTop10Accuracy=0.7348, over 12939.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7363, over 11743.46 frames. ], batch size: 27, lr: 9.15e-03
2024-08-06 05:33:03,042 INFO [trainer.py:765] (1/8) Epoch 13, batch 1000, train_loss[loss=2.882, ArTop10Accuracy=0.7395, over 13052.00 frames. ], tot_loss[loss=2.888, ArTop10Accuracy=0.7343, over 11933.66 frames. ], batch size: 27, lr: 9.13e-03
2024-08-06 05:33:34,231 INFO [trainer.py:765] (1/8) Epoch 13, batch 1100, train_loss[loss=2.868, ArTop10Accuracy=0.7404, over 13751.00 frames. ], tot_loss[loss=2.895, ArTop10Accuracy=0.7331, over 12013.64 frames. ], batch size: 34, lr: 9.10e-03
2024-08-06 05:34:04,518 INFO [trainer.py:765] (1/8) Epoch 13, batch 1200, train_loss[loss=3.042, ArTop10Accuracy=0.7104, over 12299.00 frames. ], tot_loss[loss=2.897, ArTop10Accuracy=0.7327, over 11945.18 frames. ], batch size: 98, lr: 9.07e-03
2024-08-06 05:34:30,235 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 05:35:39,198 INFO [trainer.py:765] (1/8) Epoch 14, batch 100, train_loss[loss=2.937, ArTop10Accuracy=0.7268, over 14650.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7388, over 4773.88 frames. ], batch size: 61, lr: 8.71e-03
2024-08-06 05:36:23,063 INFO [trainer.py:765] (1/8) Epoch 14, batch 200, train_loss[loss=2.889, ArTop10Accuracy=0.7419, over 13718.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7399, over 7789.92 frames. ], batch size: 34, lr: 8.68e-03
2024-08-06 05:37:09,309 INFO [trainer.py:765] (1/8) Epoch 14, batch 300, train_loss[loss=2.92, ArTop10Accuracy=0.7316, over 14309.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.741, over 9421.38 frames. ], batch size: 44, lr: 8.66e-03
2024-08-06 05:37:46,029 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.097e+02 1.304e+02 1.410e+02 1.531e+02 2.912e+02, threshold=2.820e+02, percent-clipped=0.2
2024-08-06 05:37:55,139 INFO [trainer.py:765] (1/8) Epoch 14, batch 400, train_loss[loss=2.894, ArTop10Accuracy=0.7352, over 10361.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7409, over 10335.77 frames. ], batch size: 14, lr: 8.64e-03
2024-08-06 05:38:42,025 INFO [trainer.py:765] (1/8) Epoch 14, batch 500, train_loss[loss=2.838, ArTop10Accuracy=0.7473, over 12248.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7416, over 10916.19 frames. ], batch size: 22, lr: 8.61e-03
2024-08-06 05:39:22,375 INFO [trainer.py:765] (1/8) Epoch 14, batch 600, train_loss[loss=2.725, ArTop10Accuracy=0.7621, over 11564.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7405, over 11425.60 frames. ], batch size: 18, lr: 8.59e-03
2024-08-06 05:40:15,143 INFO [trainer.py:765] (1/8) Epoch 14, batch 700, train_loss[loss=2.855, ArTop10Accuracy=0.7373, over 10051.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7376, over 11567.50 frames. ], batch size: 12, lr: 8.57e-03
2024-08-06 05:40:49,136 INFO [trainer.py:765] (1/8) Epoch 14, batch 800, train_loss[loss=2.86, ArTop10Accuracy=0.7426, over 10108.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7369, over 11685.36 frames. ], batch size: 12, lr: 8.55e-03
2024-08-06 05:41:20,467 INFO [trainer.py:765] (1/8) Epoch 14, batch 900, train_loss[loss=2.889, ArTop10Accuracy=0.7261, over 13003.00 frames. ], tot_loss[loss=2.87, ArTop10Accuracy=0.7382, over 11732.11 frames. ], batch size: 27, lr: 8.52e-03
2024-08-06 05:41:51,996 INFO [trainer.py:765] (1/8) Epoch 14, batch 1000, train_loss[loss=2.853, ArTop10Accuracy=0.739, over 12932.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7365, over 11926.03 frames. ], batch size: 27, lr: 8.50e-03
2024-08-06 05:42:23,220 INFO [trainer.py:765] (1/8) Epoch 14, batch 1100, train_loss[loss=2.838, ArTop10Accuracy=0.7475, over 13897.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7366, over 11991.45 frames. ], batch size: 34, lr: 8.48e-03
2024-08-06 05:42:53,549 INFO [trainer.py:765] (1/8) Epoch 14, batch 1200, train_loss[loss=3.052, ArTop10Accuracy=0.7097, over 12008.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7363, over 11941.69 frames. ], batch size: 98, lr: 8.46e-03
2024-08-06 05:43:18,869 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 05:44:28,571 INFO [trainer.py:765] (1/8) Epoch 15, batch 100, train_loss[loss=2.988, ArTop10Accuracy=0.7149, over 14408.00 frames. ], tot_loss[loss=2.851, ArTop10Accuracy=0.7418, over 4787.97 frames. ], batch size: 61, lr: 8.14e-03
2024-08-06 05:44:29,213 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 05:44:38,024 INFO [trainer.py:811] (1/8) Epoch 15, validation: loss=2.913, ArTop10Accuracy=0.7339, over 1829298.00 frames. 
2024-08-06 05:44:38,024 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 05:44:38,413 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.307e+02 1.417e+02 1.528e+02 2.981e+02, threshold=2.833e+02, percent-clipped=0.1
2024-08-06 05:45:20,185 INFO [trainer.py:765] (1/8) Epoch 15, batch 200, train_loss[loss=2.832, ArTop10Accuracy=0.7372, over 13619.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7438, over 7792.95 frames. ], batch size: 34, lr: 8.11e-03
2024-08-06 05:46:04,647 INFO [trainer.py:765] (1/8) Epoch 15, batch 300, train_loss[loss=2.883, ArTop10Accuracy=0.7322, over 14337.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7434, over 9432.06 frames. ], batch size: 44, lr: 8.09e-03
2024-08-06 05:46:51,902 INFO [trainer.py:765] (1/8) Epoch 15, batch 400, train_loss[loss=2.75, ArTop10Accuracy=0.7746, over 10929.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.743, over 10339.39 frames. ], batch size: 15, lr: 8.07e-03
2024-08-06 05:47:36,911 INFO [trainer.py:765] (1/8) Epoch 15, batch 500, train_loss[loss=2.897, ArTop10Accuracy=0.7295, over 12244.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7439, over 10898.15 frames. ], batch size: 22, lr: 8.05e-03
2024-08-06 05:48:24,723 INFO [trainer.py:765] (1/8) Epoch 15, batch 600, train_loss[loss=2.708, ArTop10Accuracy=0.7705, over 11604.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7425, over 11441.84 frames. ], batch size: 18, lr: 8.03e-03
2024-08-06 05:49:11,855 INFO [trainer.py:765] (1/8) Epoch 15, batch 700, train_loss[loss=2.931, ArTop10Accuracy=0.7207, over 9923.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7409, over 11584.21 frames. ], batch size: 12, lr: 8.01e-03
2024-08-06 05:49:45,778 INFO [trainer.py:765] (1/8) Epoch 15, batch 800, train_loss[loss=2.799, ArTop10Accuracy=0.751, over 9435.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7392, over 11680.50 frames. ], batch size: 11, lr: 7.99e-03
2024-08-06 05:50:17,210 INFO [trainer.py:765] (1/8) Epoch 15, batch 900, train_loss[loss=2.885, ArTop10Accuracy=0.7425, over 13101.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.7409, over 11741.27 frames. ], batch size: 27, lr: 7.97e-03
2024-08-06 05:50:48,829 INFO [trainer.py:765] (1/8) Epoch 15, batch 1000, train_loss[loss=2.879, ArTop10Accuracy=0.7285, over 12940.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7401, over 11939.29 frames. ], batch size: 27, lr: 7.95e-03
2024-08-06 05:51:20,069 INFO [trainer.py:765] (1/8) Epoch 15, batch 1100, train_loss[loss=2.92, ArTop10Accuracy=0.7312, over 13428.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7378, over 12005.19 frames. ], batch size: 34, lr: 7.93e-03
2024-08-06 05:51:23,515 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.123e+02 1.337e+02 1.431e+02 1.541e+02 2.784e+02, threshold=2.862e+02, percent-clipped=0.0
2024-08-06 05:51:53,082 INFO [trainer.py:765] (1/8) Epoch 15, batch 1200, train_loss[loss=2.979, ArTop10Accuracy=0.7201, over 12276.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7373, over 11940.75 frames. ], batch size: 97, lr: 7.91e-03
2024-08-06 05:52:18,078 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 05:53:29,262 INFO [trainer.py:765] (1/8) Epoch 16, batch 100, train_loss[loss=2.95, ArTop10Accuracy=0.7261, over 14794.00 frames. ], tot_loss[loss=2.839, ArTop10Accuracy=0.7458, over 4788.63 frames. ], batch size: 61, lr: 7.63e-03
2024-08-06 05:54:12,876 INFO [trainer.py:765] (1/8) Epoch 16, batch 200, train_loss[loss=2.867, ArTop10Accuracy=0.7372, over 13845.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7468, over 7795.41 frames. ], batch size: 34, lr: 7.61e-03
2024-08-06 05:54:59,736 INFO [trainer.py:765] (1/8) Epoch 16, batch 300, train_loss[loss=2.835, ArTop10Accuracy=0.7487, over 14196.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7466, over 9437.07 frames. ], batch size: 44, lr: 7.59e-03
2024-08-06 05:55:41,930 INFO [trainer.py:765] (1/8) Epoch 16, batch 400, train_loss[loss=2.825, ArTop10Accuracy=0.7417, over 10353.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7463, over 10343.85 frames. ], batch size: 14, lr: 7.58e-03
2024-08-06 05:56:27,679 INFO [trainer.py:765] (1/8) Epoch 16, batch 500, train_loss[loss=2.891, ArTop10Accuracy=0.7377, over 12075.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.7454, over 10902.38 frames. ], batch size: 22, lr: 7.56e-03
2024-08-06 05:57:12,439 INFO [trainer.py:765] (1/8) Epoch 16, batch 600, train_loss[loss=2.745, ArTop10Accuracy=0.7592, over 11620.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7439, over 11438.59 frames. ], batch size: 18, lr: 7.54e-03
2024-08-06 05:58:00,039 INFO [trainer.py:765] (1/8) Epoch 16, batch 700, train_loss[loss=2.872, ArTop10Accuracy=0.7347, over 10090.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7427, over 11567.21 frames. ], batch size: 12, lr: 7.52e-03
2024-08-06 05:58:34,023 INFO [trainer.py:765] (1/8) Epoch 16, batch 800, train_loss[loss=2.702, ArTop10Accuracy=0.7621, over 9941.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7416, over 11677.75 frames. ], batch size: 12, lr: 7.50e-03
2024-08-06 05:58:41,568 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 05:58:50,426 INFO [trainer.py:811] (1/8) Epoch 16, validation: loss=2.915, ArTop10Accuracy=0.7338, over 1829298.00 frames. 
2024-08-06 05:58:50,427 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 05:58:50,730 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.121e+02 1.335e+02 1.445e+02 1.570e+02 3.252e+02, threshold=2.890e+02, percent-clipped=0.1
2024-08-06 05:59:14,320 INFO [trainer.py:765] (1/8) Epoch 16, batch 900, train_loss[loss=2.841, ArTop10Accuracy=0.7391, over 12885.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7426, over 11730.07 frames. ], batch size: 27, lr: 7.49e-03
2024-08-06 05:59:45,915 INFO [trainer.py:765] (1/8) Epoch 16, batch 1000, train_loss[loss=2.772, ArTop10Accuracy=0.7538, over 12877.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7417, over 11925.18 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 06:00:17,091 INFO [trainer.py:765] (1/8) Epoch 16, batch 1100, train_loss[loss=2.963, ArTop10Accuracy=0.7273, over 13734.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7394, over 11985.18 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 06:00:47,464 INFO [trainer.py:765] (1/8) Epoch 16, batch 1200, train_loss[loss=2.979, ArTop10Accuracy=0.7205, over 11777.00 frames. ], tot_loss[loss=2.861, ArTop10Accuracy=0.7397, over 11930.28 frames. ], batch size: 97, lr: 7.43e-03
2024-08-06 06:01:12,268 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 06:02:27,261 INFO [trainer.py:765] (1/8) Epoch 17, batch 100, train_loss[loss=2.891, ArTop10Accuracy=0.7388, over 14342.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7468, over 4773.07 frames. ], batch size: 61, lr: 7.18e-03
2024-08-06 06:03:11,850 INFO [trainer.py:765] (1/8) Epoch 17, batch 200, train_loss[loss=2.898, ArTop10Accuracy=0.7334, over 13627.00 frames. ], tot_loss[loss=2.823, ArTop10Accuracy=0.7479, over 7781.00 frames. ], batch size: 34, lr: 7.17e-03
2024-08-06 06:03:57,502 INFO [trainer.py:765] (1/8) Epoch 17, batch 300, train_loss[loss=2.856, ArTop10Accuracy=0.7417, over 14336.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7477, over 9420.34 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 06:04:42,838 INFO [trainer.py:765] (1/8) Epoch 17, batch 400, train_loss[loss=2.741, ArTop10Accuracy=0.7622, over 10507.00 frames. ], tot_loss[loss=2.823, ArTop10Accuracy=0.7477, over 10322.09 frames. ], batch size: 14, lr: 7.13e-03
2024-08-06 06:05:29,004 INFO [trainer.py:765] (1/8) Epoch 17, batch 500, train_loss[loss=2.815, ArTop10Accuracy=0.7536, over 12257.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7488, over 10888.13 frames. ], batch size: 22, lr: 7.12e-03
2024-08-06 06:05:49,551 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.142e+02 1.359e+02 1.445e+02 1.551e+02 2.741e+02, threshold=2.891e+02, percent-clipped=0.0
2024-08-06 06:06:20,723 INFO [trainer.py:765] (1/8) Epoch 17, batch 600, train_loss[loss=2.835, ArTop10Accuracy=0.7372, over 11792.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.747, over 11409.04 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 06:07:04,695 INFO [trainer.py:765] (1/8) Epoch 17, batch 700, train_loss[loss=2.82, ArTop10Accuracy=0.7605, over 10070.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7457, over 11578.57 frames. ], batch size: 12, lr: 7.09e-03
2024-08-06 06:07:44,896 INFO [trainer.py:765] (1/8) Epoch 17, batch 800, train_loss[loss=2.747, ArTop10Accuracy=0.761, over 9348.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7441, over 11680.49 frames. ], batch size: 11, lr: 7.07e-03
2024-08-06 06:08:16,384 INFO [trainer.py:765] (1/8) Epoch 17, batch 900, train_loss[loss=2.798, ArTop10Accuracy=0.7556, over 12914.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7459, over 11743.50 frames. ], batch size: 27, lr: 7.05e-03
2024-08-06 06:08:47,995 INFO [trainer.py:765] (1/8) Epoch 17, batch 1000, train_loss[loss=2.76, ArTop10Accuracy=0.76, over 13013.00 frames. ], tot_loss[loss=2.836, ArTop10Accuracy=0.7452, over 11941.09 frames. ], batch size: 27, lr: 7.04e-03
2024-08-06 06:09:19,134 INFO [trainer.py:765] (1/8) Epoch 17, batch 1100, train_loss[loss=2.882, ArTop10Accuracy=0.7352, over 13801.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7428, over 12001.59 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 06:09:49,445 INFO [trainer.py:765] (1/8) Epoch 17, batch 1200, train_loss[loss=2.963, ArTop10Accuracy=0.7218, over 12260.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7427, over 11943.38 frames. ], batch size: 97, lr: 7.01e-03
2024-08-06 06:10:15,027 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 06:11:23,102 INFO [trainer.py:765] (1/8) Epoch 18, batch 100, train_loss[loss=2.965, ArTop10Accuracy=0.7235, over 14660.00 frames. ], tot_loss[loss=2.812, ArTop10Accuracy=0.7508, over 4790.75 frames. ], batch size: 61, lr: 6.78e-03
2024-08-06 06:12:16,260 INFO [trainer.py:765] (1/8) Epoch 18, batch 200, train_loss[loss=2.711, ArTop10Accuracy=0.7669, over 13790.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7517, over 7796.93 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 06:12:40,317 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 06:12:48,991 INFO [trainer.py:811] (1/8) Epoch 18, validation: loss=2.916, ArTop10Accuracy=0.7343, over 1829298.00 frames. 
2024-08-06 06:12:48,992 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 06:12:49,335 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.163e+02 1.377e+02 1.476e+02 1.588e+02 2.450e+02, threshold=2.952e+02, percent-clipped=0.0
2024-08-06 06:13:07,115 INFO [trainer.py:765] (1/8) Epoch 18, batch 300, train_loss[loss=2.858, ArTop10Accuracy=0.7411, over 14602.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7514, over 9444.02 frames. ], batch size: 44, lr: 6.75e-03
2024-08-06 06:13:54,097 INFO [trainer.py:765] (1/8) Epoch 18, batch 400, train_loss[loss=2.723, ArTop10Accuracy=0.7649, over 10493.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7511, over 10353.84 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 06:14:38,487 INFO [trainer.py:765] (1/8) Epoch 18, batch 500, train_loss[loss=2.816, ArTop10Accuracy=0.7537, over 12205.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7506, over 10904.51 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 06:15:23,627 INFO [trainer.py:765] (1/8) Epoch 18, batch 600, train_loss[loss=2.738, ArTop10Accuracy=0.7657, over 11528.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.7495, over 11416.40 frames. ], batch size: 18, lr: 6.71e-03
2024-08-06 06:16:17,342 INFO [trainer.py:765] (1/8) Epoch 18, batch 700, train_loss[loss=2.747, ArTop10Accuracy=0.7664, over 10033.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.747, over 11580.43 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 06:16:51,427 INFO [trainer.py:765] (1/8) Epoch 18, batch 800, train_loss[loss=2.85, ArTop10Accuracy=0.7552, over 9297.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7458, over 11673.28 frames. ], batch size: 11, lr: 6.68e-03
2024-08-06 06:17:22,912 INFO [trainer.py:765] (1/8) Epoch 18, batch 900, train_loss[loss=2.729, ArTop10Accuracy=0.7666, over 12886.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.7482, over 11720.71 frames. ], batch size: 27, lr: 6.67e-03
2024-08-06 06:17:54,528 INFO [trainer.py:765] (1/8) Epoch 18, batch 1000, train_loss[loss=2.824, ArTop10Accuracy=0.7476, over 12904.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7477, over 11932.81 frames. ], batch size: 27, lr: 6.65e-03
2024-08-06 06:18:25,662 INFO [trainer.py:765] (1/8) Epoch 18, batch 1100, train_loss[loss=2.767, ArTop10Accuracy=0.7569, over 13569.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.746, over 11989.03 frames. ], batch size: 34, lr: 6.64e-03
2024-08-06 06:18:55,971 INFO [trainer.py:765] (1/8) Epoch 18, batch 1200, train_loss[loss=2.946, ArTop10Accuracy=0.7211, over 11982.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7452, over 11972.69 frames. ], batch size: 97, lr: 6.63e-03
2024-08-06 06:19:19,163 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.178e+02 1.387e+02 1.492e+02 1.607e+02 2.982e+02, threshold=2.983e+02, percent-clipped=0.1
2024-08-06 06:19:23,696 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 06:20:29,728 INFO [trainer.py:765] (1/8) Epoch 19, batch 100, train_loss[loss=2.858, ArTop10Accuracy=0.7397, over 14846.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7512, over 4786.28 frames. ], batch size: 61, lr: 6.43e-03
2024-08-06 06:21:11,274 INFO [trainer.py:765] (1/8) Epoch 19, batch 200, train_loss[loss=2.735, ArTop10Accuracy=0.7636, over 13973.00 frames. ], tot_loss[loss=2.796, ArTop10Accuracy=0.7534, over 7791.32 frames. ], batch size: 35, lr: 6.41e-03
2024-08-06 06:21:56,078 INFO [trainer.py:765] (1/8) Epoch 19, batch 300, train_loss[loss=2.783, ArTop10Accuracy=0.7603, over 14131.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.752, over 9428.20 frames. ], batch size: 44, lr: 6.40e-03
2024-08-06 06:22:36,013 INFO [trainer.py:765] (1/8) Epoch 19, batch 400, train_loss[loss=2.814, ArTop10Accuracy=0.7476, over 10290.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7524, over 10336.07 frames. ], batch size: 14, lr: 6.39e-03
2024-08-06 06:23:18,997 INFO [trainer.py:765] (1/8) Epoch 19, batch 500, train_loss[loss=2.787, ArTop10Accuracy=0.7602, over 12204.00 frames. ], tot_loss[loss=2.796, ArTop10Accuracy=0.7527, over 10904.92 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 06:24:03,685 INFO [trainer.py:765] (1/8) Epoch 19, batch 600, train_loss[loss=2.633, ArTop10Accuracy=0.7812, over 11618.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7514, over 11425.50 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 06:24:46,185 INFO [trainer.py:765] (1/8) Epoch 19, batch 700, train_loss[loss=2.762, ArTop10Accuracy=0.756, over 9400.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7503, over 11575.14 frames. ], batch size: 11, lr: 6.35e-03
2024-08-06 06:25:22,355 INFO [trainer.py:765] (1/8) Epoch 19, batch 800, train_loss[loss=2.787, ArTop10Accuracy=0.7534, over 10182.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7486, over 11704.71 frames. ], batch size: 12, lr: 6.33e-03
2024-08-06 06:25:53,624 INFO [trainer.py:765] (1/8) Epoch 19, batch 900, train_loss[loss=2.798, ArTop10Accuracy=0.7465, over 12857.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.749, over 11749.94 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 06:26:21,772 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 06:26:30,765 INFO [trainer.py:811] (1/8) Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 
2024-08-06 06:26:30,766 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB
2024-08-06 06:26:31,053 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.198e+02 1.416e+02 1.525e+02 1.662e+02 2.849e+02, threshold=3.050e+02, percent-clipped=0.0
2024-08-06 06:26:34,030 INFO [trainer.py:765] (1/8) Epoch 19, batch 1000, train_loss[loss=2.831, ArTop10Accuracy=0.7502, over 12917.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7485, over 11946.58 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 06:27:05,190 INFO [trainer.py:765] (1/8) Epoch 19, batch 1100, train_loss[loss=2.865, ArTop10Accuracy=0.7391, over 13739.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7469, over 12006.14 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 06:27:35,454 INFO [trainer.py:765] (1/8) Epoch 19, batch 1200, train_loss[loss=2.875, ArTop10Accuracy=0.7384, over 12216.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7467, over 11964.15 frames. ], batch size: 98, lr: 6.28e-03
2024-08-06 06:28:00,649 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 06:29:08,985 INFO [trainer.py:765] (1/8) Epoch 20, batch 100, train_loss[loss=2.812, ArTop10Accuracy=0.7492, over 14594.00 frames. ], tot_loss[loss=2.794, ArTop10Accuracy=0.7537, over 4778.56 frames. ], batch size: 61, lr: 6.10e-03
2024-08-06 06:29:50,318 INFO [trainer.py:765] (1/8) Epoch 20, batch 200, train_loss[loss=2.711, ArTop10Accuracy=0.761, over 13932.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.7537, over 7787.63 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 06:30:37,106 INFO [trainer.py:765] (1/8) Epoch 20, batch 300, train_loss[loss=2.765, ArTop10Accuracy=0.7561, over 14305.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7542, over 9419.91 frames. ], batch size: 44, lr: 6.08e-03
2024-08-06 06:31:16,354 INFO [trainer.py:765] (1/8) Epoch 20, batch 400, train_loss[loss=2.805, ArTop10Accuracy=0.7542, over 10359.00 frames. ], tot_loss[loss=2.789, ArTop10Accuracy=0.7546, over 10335.06 frames. ], batch size: 14, lr: 6.07e-03
2024-08-06 06:32:03,759 INFO [trainer.py:765] (1/8) Epoch 20, batch 500, train_loss[loss=2.784, ArTop10Accuracy=0.7469, over 12345.00 frames. ], tot_loss[loss=2.786, ArTop10Accuracy=0.7548, over 10906.17 frames. ], batch size: 22, lr: 6.05e-03
2024-08-06 06:32:43,357 INFO [trainer.py:765] (1/8) Epoch 20, batch 600, train_loss[loss=2.663, ArTop10Accuracy=0.7778, over 11678.00 frames. ], tot_loss[loss=2.791, ArTop10Accuracy=0.7537, over 11425.12 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 06:33:36,752 INFO [trainer.py:765] (1/8) Epoch 20, batch 700, train_loss[loss=2.749, ArTop10Accuracy=0.7658, over 10069.00 frames. ], tot_loss[loss=2.8, ArTop10Accuracy=0.752, over 11566.64 frames. ], batch size: 12, lr: 6.03e-03
2024-08-06 06:33:43,829 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.196e+02 1.417e+02 1.526e+02 1.639e+02 3.791e+02, threshold=3.052e+02, percent-clipped=0.1
2024-08-06 06:34:13,304 INFO [trainer.py:765] (1/8) Epoch 20, batch 800, train_loss[loss=2.735, ArTop10Accuracy=0.765, over 9984.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7511, over 11687.19 frames. ], batch size: 12, lr: 6.02e-03
2024-08-06 06:34:44,580 INFO [trainer.py:765] (1/8) Epoch 20, batch 900, train_loss[loss=2.91, ArTop10Accuracy=0.7331, over 12961.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7511, over 11731.76 frames. ], batch size: 27, lr: 6.01e-03
2024-08-06 06:35:16,139 INFO [trainer.py:765] (1/8) Epoch 20, batch 1000, train_loss[loss=2.738, ArTop10Accuracy=0.7649, over 13143.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7502, over 11926.52 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 06:35:47,214 INFO [trainer.py:765] (1/8) Epoch 20, batch 1100, train_loss[loss=2.702, ArTop10Accuracy=0.7708, over 13670.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7488, over 11972.22 frames. ], batch size: 34, lr: 5.99e-03
2024-08-06 06:36:17,439 INFO [trainer.py:765] (1/8) Epoch 20, batch 1200, train_loss[loss=2.978, ArTop10Accuracy=0.7192, over 12645.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7482, over 11919.35 frames. ], batch size: 98, lr: 5.97e-03
2024-08-06 06:36:42,651 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 06:36:42,654 INFO [trainer.py:1069] (1/8) Done!
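
The log above ends after epoch 20. To turn the per-batch `tot_loss[...]` lines and the periodic `validation: loss=...` lines into plottable curves, a small regex-based parser over the line format shown above is enough. The sketch below is a hypothetical helper, not part of icefall; the log filename `train-log.txt` is an assumption, and the patterns only cover the two line shapes that appear in this log.

```python
# Hypothetical helper (not part of icefall): parse the trainer log format shown
# above into training and validation records, e.g. for plotting loss curves.
import re
from pathlib import Path

# Matches lines like:
#   ... Epoch 20, batch 1200, train_loss[...], tot_loss[loss=2.82,
#   ArTop10Accuracy=0.7482, over 11919.35 frames. ], batch size: 98, lr: 5.97e-03
TRAIN_RE = re.compile(
    r"Epoch (\d+), batch (\d+), .*?"
    r"tot_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+).*?"
    r"lr: ([\d.eE+-]+)"
)
# Matches lines like:
#   ... Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames.
VALID_RE = re.compile(
    r"Epoch (\d+), validation: loss=([\d.]+), ArTop10Accuracy=([\d.]+)"
)

def parse_log(path):
    """Return (train_records, valid_records) extracted from one rank's log file."""
    train, valid = [], []
    for line in Path(path).read_text().splitlines():
        m = TRAIN_RE.search(line)
        if m:
            epoch, batch, loss, acc, lr = m.groups()
            train.append({"epoch": int(epoch), "batch": int(batch),
                          "tot_loss": float(loss), "ArTop10Accuracy": float(acc),
                          "lr": float(lr)})
            continue
        m = VALID_RE.search(line)
        if m:
            epoch, loss, acc = m.groups()
            valid.append({"epoch": int(epoch), "loss": float(loss),
                          "ArTop10Accuracy": float(acc)})
    return train, valid

if __name__ == "__main__":
    # "train-log.txt" is an assumed filename for a saved copy of this log.
    train, valid = parse_log("train-log.txt")
    print(f"{len(train)} training points, {len(valid)} validation points")
    if valid:
        best = min(valid, key=lambda r: r["loss"])
        print(f"best validation loss {best['loss']} at epoch {best['epoch']}")
```

Run against this log, such a parser would yield one record per `trainer.py:765` line (smoothed `tot_loss`, accuracy, and the decaying Eden learning rate) plus one record per `trainer.py:811` validation line, which is usually all that is needed to chart convergence alongside the TensorBoard summaries.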