2024-08-06 08:06:14,318 INFO [trainer.py:870] (4/8) Training started
2024-08-06 08:06:14,319 INFO [trainer.py:889] (4/8) Device: cuda:4
2024-08-06 08:06:14,319 INFO [trainer.py:890] (4/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': None, 'icefall-git-sha1': None, 'icefall-git-date': None, 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6867463', 'IP address': '0.104.202.7'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 20000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 08:06:14,319 INFO [trainer.py:892] (4/8) About to create model
2024-08-06 08:06:15,010 INFO [trainer.py:899] (4/8) Number of model parameters: 367386628
2024-08-06 08:06:16,222 INFO [trainer.py:914] (4/8) Using DDP
2024-08-06 08:06:19,151 INFO [datamodule.py:427] (4/8) About to get train cuts
2024-08-06 08:06:19,153 INFO [datamodule.py:434] (4/8) About to get dev cuts
2024-08-06 08:06:19,155 INFO [datamodule.py:292] (4/8) Disable SpecAugment
2024-08-06 08:06:19,155 INFO [datamodule.py:294] (4/8) About to create train dataset
2024-08-06 08:06:19,155 INFO [datamodule.py:323] (4/8) Using DynamicBucketingSampler
2024-08-06 08:06:19,763 INFO [datamodule.py:344] (4/8) About to create train dataloader
2024-08-06 08:06:19,763 INFO [datamodule.py:367] (4/8) About to create dev dataset
2024-08-06 08:06:20,087 INFO [datamodule.py:388] (4/8) About to create dev dataloader
2024-08-06 08:08:02,120 INFO [trainer.py:765] (4/8) Epoch 1, batch 100, train_loss[loss=4.335, ArTop10Accuracy=0.4992, over 14349.00 frames. ], tot_loss[loss=5.058, ArTop10Accuracy=0.3727, over 4771.89 frames. ], batch size: 62, lr: 2.25e-02
2024-08-06 08:09:28,827 INFO [trainer.py:765] (4/8) Epoch 1, batch 200, train_loss[loss=4.111, ArTop10Accuracy=0.5308, over 13680.00 frames. ], tot_loss[loss=4.496, ArTop10Accuracy=0.4669, over 7746.35 frames. ], batch size: 34, lr: 3.00e-02
2024-08-06 08:10:52,430 INFO [trainer.py:765] (4/8) Epoch 1, batch 300, train_loss[loss=3.881, ArTop10Accuracy=0.5686, over 14085.00 frames. ], tot_loss[loss=4.218, ArTop10Accuracy=0.5129, over 9372.51 frames. ], batch size: 44, lr: 3.00e-02
2024-08-06 08:12:12,698 INFO [trainer.py:765] (4/8) Epoch 1, batch 400, train_loss[loss=3.738, ArTop10Accuracy=0.6, over 10314.00 frames. ], tot_loss[loss=4.027, ArTop10Accuracy=0.5454, over 10297.26 frames. ], batch size: 14, lr: 3.00e-02
2024-08-06 08:13:40,049 INFO [trainer.py:765] (4/8) Epoch 1, batch 500, train_loss[loss=3.696, ArTop10Accuracy=0.6047, over 12063.00 frames. ], tot_loss[loss=3.883, ArTop10Accuracy=0.5703, over 10855.34 frames. ], batch size: 22, lr: 2.99e-02
2024-08-06 08:15:00,242 INFO [trainer.py:765] (4/8) Epoch 1, batch 600, train_loss[loss=3.559, ArTop10Accuracy=0.6271, over 11481.00 frames. ], tot_loss[loss=3.773, ArTop10Accuracy=0.5898, over 11365.23 frames. ], batch size: 18, lr: 2.99e-02
2024-08-06 08:16:26,424 INFO [trainer.py:765] (4/8) Epoch 1, batch 700, train_loss[loss=3.57, ArTop10Accuracy=0.6244, over 10320.00 frames. ], tot_loss[loss=3.695, ArTop10Accuracy=0.6037, over 11513.99 frames. ], batch size: 12, lr: 2.99e-02
2024-08-06 08:17:43,017 INFO [trainer.py:765] (4/8) Epoch 1, batch 800, train_loss[loss=3.429, ArTop10Accuracy=0.6523, over 9978.00 frames. ], tot_loss[loss=3.627, ArTop10Accuracy=0.6163, over 11645.81 frames. ], batch size: 12, lr: 2.98e-02
2024-08-06 08:18:56,150 INFO [trainer.py:765] (4/8) Epoch 1, batch 900, train_loss[loss=3.458, ArTop10Accuracy=0.6464, over 12951.00 frames. ], tot_loss[loss=3.567, ArTop10Accuracy=0.6273, over 11687.10 frames. ], batch size: 27, lr: 2.98e-02
2024-08-06 08:20:12,862 INFO [trainer.py:765] (4/8) Epoch 1, batch 1000, train_loss[loss=3.476, ArTop10Accuracy=0.6408, over 13488.00 frames. ], tot_loss[loss=3.524, ArTop10Accuracy=0.635, over 11889.35 frames. ], batch size: 28, lr: 2.97e-02
2024-08-06 08:20:13,539 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 9.300e+01 1.871e+02 2.675e+02 4.030e+02 9.119e+03, threshold=5.351e+02, percent-clipped=0.0
2024-08-06 08:21:29,154 INFO [trainer.py:765] (4/8) Epoch 1, batch 1100, train_loss[loss=3.469, ArTop10Accuracy=0.6412, over 13692.00 frames. ], tot_loss[loss=3.487, ArTop10Accuracy=0.6416, over 11959.37 frames. ], batch size: 34, lr: 2.96e-02
2024-08-06 08:22:45,410 INFO [trainer.py:765] (4/8) Epoch 1, batch 1200, train_loss[loss=3.468, ArTop10Accuracy=0.6428, over 11691.00 frames. ], tot_loss[loss=3.456, ArTop10Accuracy=0.6475, over 11856.20 frames. ], batch size: 101, lr: 2.96e-02
2024-08-06 08:23:45,262 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 08:25:36,236 INFO [trainer.py:765] (4/8) Epoch 2, batch 100, train_loss[loss=3.453, ArTop10Accuracy=0.6483, over 14559.00 frames. ], tot_loss[loss=3.419, ArTop10Accuracy=0.6533, over 4753.85 frames. ], batch size: 62, lr: 2.90e-02
2024-08-06 08:26:58,955 INFO [trainer.py:765] (4/8) Epoch 2, batch 200, train_loss[loss=3.27, ArTop10Accuracy=0.6853, over 13752.00 frames. ], tot_loss[loss=3.384, ArTop10Accuracy=0.6604, over 7757.10 frames. ], batch size: 34, lr: 2.89e-02
2024-08-06 08:28:25,533 INFO [trainer.py:765] (4/8) Epoch 2, batch 300, train_loss[loss=3.402, ArTop10Accuracy=0.6578, over 14046.00 frames. ], tot_loss[loss=3.371, ArTop10Accuracy=0.6631, over 9382.03 frames. ], batch size: 44, lr: 2.89e-02
2024-08-06 08:29:48,636 INFO [trainer.py:765] (4/8) Epoch 2, batch 400, train_loss[loss=3.355, ArTop10Accuracy=0.6619, over 10944.00 frames. ], tot_loss[loss=3.358, ArTop10Accuracy=0.6657, over 10312.86 frames. ], batch size: 15, lr: 2.88e-02
2024-08-06 08:31:22,902 INFO [trainer.py:765] (4/8) Epoch 2, batch 500, train_loss[loss=3.212, ArTop10Accuracy=0.6956, over 12171.00 frames. ], tot_loss[loss=3.339, ArTop10Accuracy=0.6692, over 10869.97 frames. ], batch size: 22, lr: 2.87e-02
2024-08-06 08:32:45,688 INFO [trainer.py:765] (4/8) Epoch 2, batch 600, train_loss[loss=3.308, ArTop10Accuracy=0.6743, over 11418.00 frames. ], tot_loss[loss=3.329, ArTop10Accuracy=0.671, over 11384.57 frames. ], batch size: 18, lr: 2.86e-02
2024-08-06 08:34:13,582 INFO [trainer.py:765] (4/8) Epoch 2, batch 700, train_loss[loss=3.313, ArTop10Accuracy=0.6793, over 9951.00 frames. ], tot_loss[loss=3.325, ArTop10Accuracy=0.6719, over 11534.00 frames. ], batch size: 12, lr: 2.85e-02
2024-08-06 08:34:31,175 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 08:34:40,888 INFO [trainer.py:811] (4/8) Epoch 2, validation: loss=3.277, ArTop10Accuracy=0.6803, over 1827537.00 frames. 
2024-08-06 08:34:40,889 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 28695MB
2024-08-06 08:34:41,699 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 7.953e+01 1.592e+02 2.200e+02 3.344e+02 2.949e+03, threshold=4.400e+02, percent-clipped=8.6
2024-08-06 08:35:39,878 INFO [trainer.py:765] (4/8) Epoch 2, batch 800, train_loss[loss=3.2, ArTop10Accuracy=0.6972, over 9570.00 frames. ], tot_loss[loss=3.318, ArTop10Accuracy=0.6735, over 11652.56 frames. ], batch size: 11, lr: 2.84e-02
2024-08-06 08:36:56,371 INFO [trainer.py:765] (4/8) Epoch 2, batch 900, train_loss[loss=3.262, ArTop10Accuracy=0.6776, over 12861.00 frames. ], tot_loss[loss=3.305, ArTop10Accuracy=0.6758, over 11691.96 frames. ], batch size: 27, lr: 2.83e-02
2024-08-06 08:38:10,511 INFO [trainer.py:765] (4/8) Epoch 2, batch 1000, train_loss[loss=3.307, ArTop10Accuracy=0.6773, over 13053.00 frames. ], tot_loss[loss=3.299, ArTop10Accuracy=0.677, over 11892.14 frames. ], batch size: 27, lr: 2.82e-02
2024-08-06 08:39:25,059 INFO [trainer.py:765] (4/8) Epoch 2, batch 1100, train_loss[loss=3.159, ArTop10Accuracy=0.7048, over 13839.00 frames. ], tot_loss[loss=3.292, ArTop10Accuracy=0.6781, over 11930.52 frames. ], batch size: 34, lr: 2.81e-02
2024-08-06 08:40:38,219 INFO [trainer.py:765] (4/8) Epoch 2, batch 1200, train_loss[loss=3.333, ArTop10Accuracy=0.6674, over 13452.00 frames. ], tot_loss[loss=3.283, ArTop10Accuracy=0.6799, over 11836.01 frames. ], batch size: 101, lr: 2.80e-02
2024-08-06 08:41:38,601 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 08:43:36,649 INFO [trainer.py:765] (4/8) Epoch 3, batch 100, train_loss[loss=3.256, ArTop10Accuracy=0.6832, over 14394.00 frames. ], tot_loss[loss=3.244, ArTop10Accuracy=0.6866, over 4768.62 frames. ], batch size: 62, lr: 2.67e-02
2024-08-06 08:45:10,500 INFO [trainer.py:765] (4/8) Epoch 3, batch 200, train_loss[loss=3.201, ArTop10Accuracy=0.695, over 13674.00 frames. ], tot_loss[loss=3.221, ArTop10Accuracy=0.6908, over 7764.36 frames. ], batch size: 34, lr: 2.66e-02
2024-08-06 08:46:29,258 INFO [trainer.py:765] (4/8) Epoch 3, batch 300, train_loss[loss=3.233, ArTop10Accuracy=0.6863, over 14310.00 frames. ], tot_loss[loss=3.207, ArTop10Accuracy=0.6938, over 9365.66 frames. ], batch size: 44, lr: 2.64e-02
2024-08-06 08:48:04,219 INFO [trainer.py:765] (4/8) Epoch 3, batch 400, train_loss[loss=3.129, ArTop10Accuracy=0.7089, over 10473.00 frames. ], tot_loss[loss=3.192, ArTop10Accuracy=0.6969, over 10282.72 frames. ], batch size: 14, lr: 2.63e-02
2024-08-06 08:48:40,881 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 9.282e+01 1.561e+02 1.981e+02 2.686e+02 1.768e+03, threshold=3.962e+02, percent-clipped=7.6
2024-08-06 08:49:25,541 INFO [trainer.py:765] (4/8) Epoch 3, batch 500, train_loss[loss=3.17, ArTop10Accuracy=0.7073, over 12351.00 frames. ], tot_loss[loss=3.172, ArTop10Accuracy=0.7009, over 10852.39 frames. ], batch size: 22, lr: 2.62e-02
2024-08-06 08:51:00,477 INFO [trainer.py:765] (4/8) Epoch 3, batch 600, train_loss[loss=3.056, ArTop10Accuracy=0.7223, over 11325.00 frames. ], tot_loss[loss=3.16, ArTop10Accuracy=0.7028, over 11381.83 frames. ], batch size: 18, lr: 2.61e-02
2024-08-06 08:52:31,618 INFO [trainer.py:765] (4/8) Epoch 3, batch 700, train_loss[loss=3.058, ArTop10Accuracy=0.7241, over 10176.00 frames. ], tot_loss[loss=3.143, ArTop10Accuracy=0.7062, over 11521.06 frames. ], batch size: 12, lr: 2.60e-02
2024-08-06 08:53:57,389 INFO [trainer.py:765] (4/8) Epoch 3, batch 800, train_loss[loss=3.078, ArTop10Accuracy=0.7212, over 9276.00 frames. ], tot_loss[loss=3.137, ArTop10Accuracy=0.7072, over 11622.44 frames. ], batch size: 11, lr: 2.59e-02
2024-08-06 08:55:15,119 INFO [trainer.py:765] (4/8) Epoch 3, batch 900, train_loss[loss=3.061, ArTop10Accuracy=0.7179, over 13047.00 frames. ], tot_loss[loss=3.123, ArTop10Accuracy=0.7099, over 11666.34 frames. ], batch size: 27, lr: 2.57e-02
2024-08-06 08:56:31,557 INFO [trainer.py:765] (4/8) Epoch 3, batch 1000, train_loss[loss=3.183, ArTop10Accuracy=0.6972, over 12882.00 frames. ], tot_loss[loss=3.112, ArTop10Accuracy=0.7119, over 11867.25 frames. ], batch size: 27, lr: 2.56e-02
2024-08-06 08:57:46,506 INFO [trainer.py:765] (4/8) Epoch 3, batch 1100, train_loss[loss=2.998, ArTop10Accuracy=0.7333, over 13554.00 frames. ], tot_loss[loss=3.105, ArTop10Accuracy=0.7132, over 11926.75 frames. ], batch size: 34, lr: 2.55e-02
2024-08-06 08:59:01,399 INFO [trainer.py:765] (4/8) Epoch 3, batch 1200, train_loss[loss=3.151, ArTop10Accuracy=0.7024, over 13326.00 frames. ], tot_loss[loss=3.097, ArTop10Accuracy=0.7145, over 11854.55 frames. ], batch size: 101, lr: 2.54e-02
2024-08-06 09:00:01,980 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 09:01:50,742 INFO [trainer.py:765] (4/8) Epoch 4, batch 100, train_loss[loss=3.127, ArTop10Accuracy=0.7081, over 14670.00 frames. ], tot_loss[loss=3.065, ArTop10Accuracy=0.7201, over 4761.72 frames. ], batch size: 64, lr: 2.38e-02
2024-08-06 09:02:52,859 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 09:03:02,384 INFO [trainer.py:811] (4/8) Epoch 4, validation: loss=2.997, ArTop10Accuracy=0.7338, over 1827537.00 frames. 
2024-08-06 09:03:02,385 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 29513MB
2024-08-06 09:03:03,364 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.072e+02 1.499e+02 1.782e+02 2.273e+02 1.100e+03, threshold=3.565e+02, percent-clipped=4.7
2024-08-06 09:03:29,273 INFO [trainer.py:765] (4/8) Epoch 4, batch 200, train_loss[loss=3.069, ArTop10Accuracy=0.718, over 13527.00 frames. ], tot_loss[loss=3.041, ArTop10Accuracy=0.7249, over 7754.25 frames. ], batch size: 34, lr: 2.37e-02
2024-08-06 09:05:01,733 INFO [trainer.py:765] (4/8) Epoch 4, batch 300, train_loss[loss=3.133, ArTop10Accuracy=0.7066, over 14562.00 frames. ], tot_loss[loss=3.038, ArTop10Accuracy=0.7259, over 9353.08 frames. ], batch size: 45, lr: 2.36e-02
2024-08-06 09:06:28,151 INFO [trainer.py:765] (4/8) Epoch 4, batch 400, train_loss[loss=2.94, ArTop10Accuracy=0.7486, over 10116.00 frames. ], tot_loss[loss=3.034, ArTop10Accuracy=0.7265, over 10275.79 frames. ], batch size: 14, lr: 2.34e-02
2024-08-06 09:08:01,925 INFO [trainer.py:765] (4/8) Epoch 4, batch 500, train_loss[loss=3.045, ArTop10Accuracy=0.7286, over 12501.00 frames. ], tot_loss[loss=3.029, ArTop10Accuracy=0.7272, over 10828.04 frames. ], batch size: 23, lr: 2.33e-02
2024-08-06 09:09:28,540 INFO [trainer.py:765] (4/8) Epoch 4, batch 600, train_loss[loss=2.955, ArTop10Accuracy=0.7423, over 11589.00 frames. ], tot_loss[loss=3.024, ArTop10Accuracy=0.7284, over 11374.57 frames. ], batch size: 18, lr: 2.32e-02
2024-08-06 09:10:59,865 INFO [trainer.py:765] (4/8) Epoch 4, batch 700, train_loss[loss=3.009, ArTop10Accuracy=0.7394, over 10125.00 frames. ], tot_loss[loss=3.026, ArTop10Accuracy=0.7277, over 11515.59 frames. ], batch size: 12, lr: 2.31e-02
2024-08-06 09:12:17,513 INFO [trainer.py:765] (4/8) Epoch 4, batch 800, train_loss[loss=2.969, ArTop10Accuracy=0.7425, over 9366.00 frames. ], tot_loss[loss=3.022, ArTop10Accuracy=0.7288, over 11640.09 frames. ], batch size: 11, lr: 2.30e-02
2024-08-06 09:13:33,212 INFO [trainer.py:765] (4/8) Epoch 4, batch 900, train_loss[loss=2.992, ArTop10Accuracy=0.7306, over 12924.00 frames. ], tot_loss[loss=3.012, ArTop10Accuracy=0.7308, over 11686.06 frames. ], batch size: 27, lr: 2.29e-02
2024-08-06 09:14:47,520 INFO [trainer.py:765] (4/8) Epoch 4, batch 1000, train_loss[loss=2.935, ArTop10Accuracy=0.7421, over 12690.00 frames. ], tot_loss[loss=3.011, ArTop10Accuracy=0.7308, over 11873.84 frames. ], batch size: 27, lr: 2.28e-02
2024-08-06 09:16:02,982 INFO [trainer.py:765] (4/8) Epoch 4, batch 1100, train_loss[loss=2.965, ArTop10Accuracy=0.741, over 13602.00 frames. ], tot_loss[loss=3.011, ArTop10Accuracy=0.7308, over 11940.21 frames. ], batch size: 34, lr: 2.26e-02
2024-08-06 09:16:53,291 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.440e+02 1.636e+02 1.968e+02 7.702e+02, threshold=3.273e+02, percent-clipped=1.3
2024-08-06 09:17:18,344 INFO [trainer.py:765] (4/8) Epoch 4, batch 1200, train_loss[loss=3.053, ArTop10Accuracy=0.7237, over 12633.00 frames. ], tot_loss[loss=3.01, ArTop10Accuracy=0.7309, over 11854.12 frames. ], batch size: 101, lr: 2.25e-02
2024-08-06 09:18:17,349 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 09:20:17,170 INFO [trainer.py:765] (4/8) Epoch 5, batch 100, train_loss[loss=3.024, ArTop10Accuracy=0.725, over 14337.00 frames. ], tot_loss[loss=2.997, ArTop10Accuracy=0.7326, over 4765.65 frames. ], batch size: 62, lr: 2.10e-02
2024-08-06 09:21:52,295 INFO [trainer.py:765] (4/8) Epoch 5, batch 200, train_loss[loss=2.968, ArTop10Accuracy=0.7412, over 13647.00 frames. ], tot_loss[loss=2.972, ArTop10Accuracy=0.7375, over 7764.93 frames. ], batch size: 34, lr: 2.09e-02
2024-08-06 09:23:19,240 INFO [trainer.py:765] (4/8) Epoch 5, batch 300, train_loss[loss=3.021, ArTop10Accuracy=0.7283, over 14367.00 frames. ], tot_loss[loss=2.966, ArTop10Accuracy=0.7392, over 9381.01 frames. ], batch size: 45, lr: 2.08e-02
2024-08-06 09:24:53,536 INFO [trainer.py:765] (4/8) Epoch 5, batch 400, train_loss[loss=2.943, ArTop10Accuracy=0.7394, over 10296.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.74, over 10297.28 frames. ], batch size: 14, lr: 2.07e-02
2024-08-06 09:26:19,417 INFO [trainer.py:765] (4/8) Epoch 5, batch 500, train_loss[loss=2.9, ArTop10Accuracy=0.7498, over 12066.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7404, over 10856.08 frames. ], batch size: 22, lr: 2.06e-02
2024-08-06 09:27:49,536 INFO [trainer.py:765] (4/8) Epoch 5, batch 600, train_loss[loss=3.014, ArTop10Accuracy=0.7307, over 11511.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7401, over 11385.83 frames. ], batch size: 18, lr: 2.05e-02
2024-08-06 09:29:21,669 INFO [trainer.py:765] (4/8) Epoch 5, batch 700, train_loss[loss=2.976, ArTop10Accuracy=0.733, over 9171.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.7393, over 11512.86 frames. ], batch size: 11, lr: 2.04e-02
2024-08-06 09:30:44,692 INFO [trainer.py:765] (4/8) Epoch 5, batch 800, train_loss[loss=2.901, ArTop10Accuracy=0.7489, over 10170.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7385, over 11642.97 frames. ], batch size: 12, lr: 2.03e-02
2024-08-06 09:31:51,238 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 09:32:00,762 INFO [trainer.py:811] (4/8) Epoch 5, validation: loss=2.926, ArTop10Accuracy=0.7466, over 1827537.00 frames. 
2024-08-06 09:32:00,763 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 09:32:01,708 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.060e+02 1.349e+02 1.525e+02 1.806e+02 1.007e+03, threshold=3.049e+02, percent-clipped=2.3
2024-08-06 09:32:10,553 INFO [trainer.py:765] (4/8) Epoch 5, batch 900, train_loss[loss=2.998, ArTop10Accuracy=0.7307, over 12870.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7404, over 11683.19 frames. ], batch size: 27, lr: 2.02e-02
2024-08-06 09:33:27,322 INFO [trainer.py:765] (4/8) Epoch 5, batch 1000, train_loss[loss=2.89, ArTop10Accuracy=0.7553, over 12822.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7399, over 11878.40 frames. ], batch size: 27, lr: 2.01e-02
2024-08-06 09:34:42,299 INFO [trainer.py:765] (4/8) Epoch 5, batch 1100, train_loss[loss=2.96, ArTop10Accuracy=0.7399, over 13461.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7402, over 11949.48 frames. ], batch size: 34, lr: 2.00e-02
2024-08-06 09:35:56,331 INFO [trainer.py:765] (4/8) Epoch 5, batch 1200, train_loss[loss=3.116, ArTop10Accuracy=0.7083, over 11610.00 frames. ], tot_loss[loss=2.957, ArTop10Accuracy=0.7409, over 11851.67 frames. ], batch size: 101, lr: 1.99e-02
2024-08-06 09:36:55,326 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 09:38:52,664 INFO [trainer.py:765] (4/8) Epoch 6, batch 100, train_loss[loss=3.011, ArTop10Accuracy=0.7297, over 14652.00 frames. ], tot_loss[loss=2.953, ArTop10Accuracy=0.7414, over 4763.22 frames. ], batch size: 63, lr: 1.85e-02
2024-08-06 09:40:19,833 INFO [trainer.py:765] (4/8) Epoch 6, batch 200, train_loss[loss=2.924, ArTop10Accuracy=0.749, over 13821.00 frames. ], tot_loss[loss=2.935, ArTop10Accuracy=0.7448, over 7747.45 frames. ], batch size: 34, lr: 1.84e-02
2024-08-06 09:41:52,964 INFO [trainer.py:765] (4/8) Epoch 6, batch 300, train_loss[loss=2.895, ArTop10Accuracy=0.7482, over 14133.00 frames. ], tot_loss[loss=2.931, ArTop10Accuracy=0.7456, over 9381.03 frames. ], batch size: 44, lr: 1.83e-02
2024-08-06 09:43:17,827 INFO [trainer.py:765] (4/8) Epoch 6, batch 400, train_loss[loss=2.934, ArTop10Accuracy=0.7428, over 10410.00 frames. ], tot_loss[loss=2.927, ArTop10Accuracy=0.7465, over 10294.73 frames. ], batch size: 14, lr: 1.83e-02
2024-08-06 09:44:54,127 INFO [trainer.py:765] (4/8) Epoch 6, batch 500, train_loss[loss=2.914, ArTop10Accuracy=0.7514, over 12327.00 frames. ], tot_loss[loss=2.916, ArTop10Accuracy=0.7488, over 10858.88 frames. ], batch size: 22, lr: 1.82e-02
2024-08-06 09:46:22,872 INFO [trainer.py:765] (4/8) Epoch 6, batch 600, train_loss[loss=2.957, ArTop10Accuracy=0.7467, over 11445.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7477, over 11366.12 frames. ], batch size: 18, lr: 1.81e-02
2024-08-06 09:46:37,219 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.012e+02 1.339e+02 1.480e+02 1.701e+02 7.506e+02, threshold=2.959e+02, percent-clipped=1.1
2024-08-06 09:47:57,869 INFO [trainer.py:765] (4/8) Epoch 6, batch 700, train_loss[loss=2.881, ArTop10Accuracy=0.7528, over 10191.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7467, over 11534.85 frames. ], batch size: 12, lr: 1.80e-02
2024-08-06 09:49:15,954 INFO [trainer.py:765] (4/8) Epoch 6, batch 800, train_loss[loss=2.915, ArTop10Accuracy=0.7503, over 9345.00 frames. ], tot_loss[loss=2.927, ArTop10Accuracy=0.7465, over 11644.71 frames. ], batch size: 11, lr: 1.79e-02
2024-08-06 09:50:32,134 INFO [trainer.py:765] (4/8) Epoch 6, batch 900, train_loss[loss=2.904, ArTop10Accuracy=0.7514, over 13062.00 frames. ], tot_loss[loss=2.923, ArTop10Accuracy=0.7472, over 11680.49 frames. ], batch size: 27, lr: 1.78e-02
2024-08-06 09:51:47,298 INFO [trainer.py:765] (4/8) Epoch 6, batch 1000, train_loss[loss=2.913, ArTop10Accuracy=0.752, over 12957.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.747, over 11873.10 frames. ], batch size: 27, lr: 1.77e-02
2024-08-06 09:53:00,920 INFO [trainer.py:765] (4/8) Epoch 6, batch 1100, train_loss[loss=2.91, ArTop10Accuracy=0.7514, over 13686.00 frames. ], tot_loss[loss=2.931, ArTop10Accuracy=0.7458, over 11944.91 frames. ], batch size: 34, lr: 1.77e-02
2024-08-06 09:54:14,336 INFO [trainer.py:765] (4/8) Epoch 6, batch 1200, train_loss[loss=3.056, ArTop10Accuracy=0.7208, over 12609.00 frames. ], tot_loss[loss=2.93, ArTop10Accuracy=0.7461, over 11868.14 frames. ], batch size: 101, lr: 1.76e-02
2024-08-06 09:55:13,263 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 09:57:06,699 INFO [trainer.py:765] (4/8) Epoch 7, batch 100, train_loss[loss=2.98, ArTop10Accuracy=0.7353, over 14310.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.748, over 4748.26 frames. ], batch size: 62, lr: 1.64e-02
2024-08-06 09:58:39,426 INFO [trainer.py:765] (4/8) Epoch 7, batch 200, train_loss[loss=2.916, ArTop10Accuracy=0.7468, over 13752.00 frames. ], tot_loss[loss=2.906, ArTop10Accuracy=0.7504, over 7746.19 frames. ], batch size: 34, lr: 1.64e-02
2024-08-06 10:00:06,083 INFO [trainer.py:765] (4/8) Epoch 7, batch 300, train_loss[loss=2.979, ArTop10Accuracy=0.7354, over 13800.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7517, over 9374.86 frames. ], batch size: 44, lr: 1.63e-02
2024-08-06 10:00:40,509 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 10:00:50,245 INFO [trainer.py:811] (4/8) Epoch 7, validation: loss=2.88, ArTop10Accuracy=0.7554, over 1827537.00 frames. 
2024-08-06 10:00:50,246 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 10:00:50,977 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.002e+02 1.286e+02 1.429e+02 1.605e+02 1.020e+03, threshold=2.857e+02, percent-clipped=1.5
2024-08-06 10:01:49,118 INFO [trainer.py:765] (4/8) Epoch 7, batch 400, train_loss[loss=2.852, ArTop10Accuracy=0.765, over 10215.00 frames. ], tot_loss[loss=2.896, ArTop10Accuracy=0.7524, over 10288.13 frames. ], batch size: 14, lr: 1.62e-02
2024-08-06 10:03:21,457 INFO [trainer.py:765] (4/8) Epoch 7, batch 500, train_loss[loss=2.888, ArTop10Accuracy=0.7599, over 12327.00 frames. ], tot_loss[loss=2.891, ArTop10Accuracy=0.7534, over 10853.35 frames. ], batch size: 22, lr: 1.61e-02
2024-08-06 10:04:51,882 INFO [trainer.py:765] (4/8) Epoch 7, batch 600, train_loss[loss=2.909, ArTop10Accuracy=0.7526, over 11847.00 frames. ], tot_loss[loss=2.895, ArTop10Accuracy=0.7529, over 11343.07 frames. ], batch size: 19, lr: 1.61e-02
2024-08-06 10:06:25,112 INFO [trainer.py:765] (4/8) Epoch 7, batch 700, train_loss[loss=2.939, ArTop10Accuracy=0.7475, over 9381.00 frames. ], tot_loss[loss=2.896, ArTop10Accuracy=0.7526, over 11502.86 frames. ], batch size: 11, lr: 1.60e-02
2024-08-06 10:07:46,948 INFO [trainer.py:765] (4/8) Epoch 7, batch 800, train_loss[loss=2.904, ArTop10Accuracy=0.7507, over 10071.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7514, over 11636.93 frames. ], batch size: 12, lr: 1.59e-02
2024-08-06 10:09:02,824 INFO [trainer.py:765] (4/8) Epoch 7, batch 900, train_loss[loss=2.835, ArTop10Accuracy=0.7591, over 12993.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.753, over 11682.69 frames. ], batch size: 27, lr: 1.59e-02
2024-08-06 10:10:19,636 INFO [trainer.py:765] (4/8) Epoch 7, batch 1000, train_loss[loss=2.856, ArTop10Accuracy=0.7663, over 12762.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7523, over 11896.36 frames. ], batch size: 27, lr: 1.58e-02
2024-08-06 10:11:35,208 INFO [trainer.py:765] (4/8) Epoch 7, batch 1100, train_loss[loss=2.936, ArTop10Accuracy=0.7445, over 13755.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7512, over 11966.74 frames. ], batch size: 34, lr: 1.57e-02
2024-08-06 10:12:48,204 INFO [trainer.py:765] (4/8) Epoch 7, batch 1200, train_loss[loss=3.002, ArTop10Accuracy=0.7326, over 12930.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7519, over 11872.20 frames. ], batch size: 101, lr: 1.57e-02
2024-08-06 10:13:46,750 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 10:15:03,600 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.017e+02 1.283e+02 1.410e+02 1.601e+02 1.017e+03, threshold=2.820e+02, percent-clipped=0.9
2024-08-06 10:15:40,820 INFO [trainer.py:765] (4/8) Epoch 8, batch 100, train_loss[loss=3.008, ArTop10Accuracy=0.7324, over 14205.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.754, over 4763.79 frames. ], batch size: 62, lr: 1.47e-02
2024-08-06 10:17:12,861 INFO [trainer.py:765] (4/8) Epoch 8, batch 200, train_loss[loss=2.874, ArTop10Accuracy=0.7629, over 13785.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.757, over 7763.58 frames. ], batch size: 34, lr: 1.46e-02
2024-08-06 10:18:37,897 INFO [trainer.py:765] (4/8) Epoch 8, batch 300, train_loss[loss=2.891, ArTop10Accuracy=0.7523, over 14205.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7584, over 9375.13 frames. ], batch size: 44, lr: 1.46e-02
2024-08-06 10:20:06,341 INFO [trainer.py:765] (4/8) Epoch 8, batch 400, train_loss[loss=2.892, ArTop10Accuracy=0.751, over 10953.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7583, over 10289.35 frames. ], batch size: 15, lr: 1.45e-02
2024-08-06 10:21:32,410 INFO [trainer.py:765] (4/8) Epoch 8, batch 500, train_loss[loss=2.888, ArTop10Accuracy=0.7557, over 12642.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7591, over 10849.37 frames. ], batch size: 23, lr: 1.45e-02
2024-08-06 10:23:00,973 INFO [trainer.py:765] (4/8) Epoch 8, batch 600, train_loss[loss=2.915, ArTop10Accuracy=0.7512, over 11388.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7587, over 11353.14 frames. ], batch size: 18, lr: 1.44e-02
2024-08-06 10:24:37,787 INFO [trainer.py:765] (4/8) Epoch 8, batch 700, train_loss[loss=2.855, ArTop10Accuracy=0.7624, over 10257.00 frames. ], tot_loss[loss=2.866, ArTop10Accuracy=0.7579, over 11516.65 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 10:25:56,086 INFO [trainer.py:765] (4/8) Epoch 8, batch 800, train_loss[loss=2.831, ArTop10Accuracy=0.7627, over 10239.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7568, over 11657.44 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 10:27:12,244 INFO [trainer.py:765] (4/8) Epoch 8, batch 900, train_loss[loss=2.981, ArTop10Accuracy=0.7387, over 13305.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7582, over 11699.99 frames. ], batch size: 28, lr: 1.42e-02
2024-08-06 10:28:25,262 INFO [trainer.py:765] (4/8) Epoch 8, batch 1000, train_loss[loss=2.895, ArTop10Accuracy=0.7544, over 13005.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7573, over 11892.93 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 10:29:07,154 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 10:29:16,831 INFO [trainer.py:811] (4/8) Epoch 8, validation: loss=2.858, ArTop10Accuracy=0.7594, over 1827537.00 frames. 
2024-08-06 10:29:16,831 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 10:29:17,490 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.032e+02 1.275e+02 1.390e+02 1.547e+02 3.717e+02, threshold=2.781e+02, percent-clipped=0.7
2024-08-06 10:29:51,731 INFO [trainer.py:765] (4/8) Epoch 8, batch 1100, train_loss[loss=2.842, ArTop10Accuracy=0.762, over 13689.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7568, over 11939.37 frames. ], batch size: 34, lr: 1.41e-02
2024-08-06 10:31:05,947 INFO [trainer.py:765] (4/8) Epoch 8, batch 1200, train_loss[loss=2.955, ArTop10Accuracy=0.743, over 12402.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7565, over 11857.03 frames. ], batch size: 101, lr: 1.40e-02
2024-08-06 10:32:05,791 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 10:34:01,255 INFO [trainer.py:765] (4/8) Epoch 9, batch 100, train_loss[loss=2.904, ArTop10Accuracy=0.7568, over 14307.00 frames. ], tot_loss[loss=2.863, ArTop10Accuracy=0.7586, over 4737.05 frames. ], batch size: 62, lr: 1.32e-02
2024-08-06 10:35:31,771 INFO [trainer.py:765] (4/8) Epoch 9, batch 200, train_loss[loss=2.822, ArTop10Accuracy=0.7641, over 13818.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.76, over 7743.59 frames. ], batch size: 35, lr: 1.32e-02
2024-08-06 10:36:57,926 INFO [trainer.py:765] (4/8) Epoch 9, batch 300, train_loss[loss=2.909, ArTop10Accuracy=0.7482, over 13983.00 frames. ], tot_loss[loss=2.849, ArTop10Accuracy=0.7611, over 9372.82 frames. ], batch size: 44, lr: 1.31e-02
2024-08-06 10:38:32,696 INFO [trainer.py:765] (4/8) Epoch 9, batch 400, train_loss[loss=2.76, ArTop10Accuracy=0.7829, over 10272.00 frames. ], tot_loss[loss=2.849, ArTop10Accuracy=0.7612, over 10289.62 frames. ], batch size: 14, lr: 1.31e-02
2024-08-06 10:39:59,255 INFO [trainer.py:765] (4/8) Epoch 9, batch 500, train_loss[loss=2.807, ArTop10Accuracy=0.7688, over 12525.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7624, over 10855.66 frames. ], batch size: 23, lr: 1.30e-02
2024-08-06 10:41:29,689 INFO [trainer.py:765] (4/8) Epoch 9, batch 600, train_loss[loss=2.761, ArTop10Accuracy=0.7807, over 11481.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7619, over 11380.55 frames. ], batch size: 18, lr: 1.30e-02
2024-08-06 10:42:58,439 INFO [trainer.py:765] (4/8) Epoch 9, batch 700, train_loss[loss=2.828, ArTop10Accuracy=0.7617, over 10086.00 frames. ], tot_loss[loss=2.849, ArTop10Accuracy=0.7613, over 11524.13 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 10:44:02,952 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.039e+02 1.253e+02 1.352e+02 1.493e+02 7.010e+02, threshold=2.704e+02, percent-clipped=0.6
2024-08-06 10:44:19,668 INFO [trainer.py:765] (4/8) Epoch 9, batch 800, train_loss[loss=2.768, ArTop10Accuracy=0.7773, over 10254.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7609, over 11649.42 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 10:45:35,718 INFO [trainer.py:765] (4/8) Epoch 9, batch 900, train_loss[loss=2.88, ArTop10Accuracy=0.753, over 13434.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7618, over 11670.68 frames. ], batch size: 28, lr: 1.28e-02
2024-08-06 10:46:51,270 INFO [trainer.py:765] (4/8) Epoch 9, batch 1000, train_loss[loss=2.869, ArTop10Accuracy=0.7541, over 12966.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.761, over 11876.51 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 10:48:06,246 INFO [trainer.py:765] (4/8) Epoch 9, batch 1100, train_loss[loss=2.955, ArTop10Accuracy=0.7404, over 13590.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7598, over 11951.07 frames. ], batch size: 34, lr: 1.28e-02
2024-08-06 10:49:21,052 INFO [trainer.py:765] (4/8) Epoch 9, batch 1200, train_loss[loss=2.974, ArTop10Accuracy=0.7371, over 12891.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7605, over 11860.30 frames. ], batch size: 101, lr: 1.27e-02
2024-08-06 10:50:22,648 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 10:52:12,326 INFO [trainer.py:765] (4/8) Epoch 10, batch 100, train_loss[loss=2.903, ArTop10Accuracy=0.7494, over 14361.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7629, over 4760.61 frames. ], batch size: 62, lr: 1.20e-02
2024-08-06 10:53:44,585 INFO [trainer.py:765] (4/8) Epoch 10, batch 200, train_loss[loss=2.808, ArTop10Accuracy=0.7728, over 13860.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7645, over 7751.90 frames. ], batch size: 34, lr: 1.20e-02
2024-08-06 10:55:08,089 INFO [trainer.py:765] (4/8) Epoch 10, batch 300, train_loss[loss=2.899, ArTop10Accuracy=0.7537, over 14238.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.765, over 9382.80 frames. ], batch size: 44, lr: 1.19e-02
2024-08-06 10:56:41,176 INFO [trainer.py:765] (4/8) Epoch 10, batch 400, train_loss[loss=2.607, ArTop10Accuracy=0.8052, over 10920.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7657, over 10285.01 frames. ], batch size: 15, lr: 1.19e-02
2024-08-06 10:58:04,937 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 10:58:14,559 INFO [trainer.py:811] (4/8) Epoch 10, validation: loss=2.842, ArTop10Accuracy=0.7624, over 1827537.00 frames. 
2024-08-06 10:58:14,560 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 10:58:15,573 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.228e+02 1.320e+02 1.458e+02 6.096e+02, threshold=2.641e+02, percent-clipped=0.6
2024-08-06 10:58:15,577 INFO [trainer.py:765] (4/8) Epoch 10, batch 500, train_loss[loss=2.741, ArTop10Accuracy=0.7833, over 12168.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7663, over 10832.02 frames. ], batch size: 22, lr: 1.19e-02
2024-08-06 10:59:42,814 INFO [trainer.py:765] (4/8) Epoch 10, batch 600, train_loss[loss=2.833, ArTop10Accuracy=0.7675, over 11478.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7663, over 11348.40 frames. ], batch size: 18, lr: 1.18e-02
2024-08-06 11:01:18,107 INFO [trainer.py:765] (4/8) Epoch 10, batch 700, train_loss[loss=2.833, ArTop10Accuracy=0.77, over 10155.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7646, over 11499.12 frames. ], batch size: 12, lr: 1.18e-02
2024-08-06 11:02:36,917 INFO [trainer.py:765] (4/8) Epoch 10, batch 800, train_loss[loss=2.736, ArTop10Accuracy=0.7802, over 9588.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.764, over 11598.00 frames. ], batch size: 11, lr: 1.17e-02
2024-08-06 11:03:51,211 INFO [trainer.py:765] (4/8) Epoch 10, batch 900, train_loss[loss=2.81, ArTop10Accuracy=0.7674, over 12879.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.7651, over 11668.20 frames. ], batch size: 27, lr: 1.17e-02
2024-08-06 11:05:06,351 INFO [trainer.py:765] (4/8) Epoch 10, batch 1000, train_loss[loss=2.774, ArTop10Accuracy=0.7766, over 13230.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7643, over 11870.10 frames. ], batch size: 28, lr: 1.17e-02
2024-08-06 11:06:21,722 INFO [trainer.py:765] (4/8) Epoch 10, batch 1100, train_loss[loss=2.834, ArTop10Accuracy=0.7654, over 14001.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7635, over 11949.67 frames. ], batch size: 34, lr: 1.16e-02
2024-08-06 11:07:34,771 INFO [trainer.py:765] (4/8) Epoch 10, batch 1200, train_loss[loss=2.926, ArTop10Accuracy=0.7422, over 12183.00 frames. ], tot_loss[loss=2.839, ArTop10Accuracy=0.7631, over 11860.35 frames. ], batch size: 101, lr: 1.16e-02
2024-08-06 11:08:33,545 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 11:10:29,955 INFO [trainer.py:765] (4/8) Epoch 11, batch 100, train_loss[loss=2.894, ArTop10Accuracy=0.7514, over 14163.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7656, over 4760.24 frames. ], batch size: 62, lr: 1.10e-02
2024-08-06 11:12:04,675 INFO [trainer.py:765] (4/8) Epoch 11, batch 200, train_loss[loss=2.819, ArTop10Accuracy=0.7616, over 13581.00 frames. ], tot_loss[loss=2.816, ArTop10Accuracy=0.7671, over 7747.57 frames. ], batch size: 34, lr: 1.10e-02
2024-08-06 11:12:22,826 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 9.884e+01 1.240e+02 1.333e+02 1.457e+02 6.939e+02, threshold=2.667e+02, percent-clipped=0.1
2024-08-06 11:13:31,548 INFO [trainer.py:765] (4/8) Epoch 11, batch 300, train_loss[loss=2.827, ArTop10Accuracy=0.7685, over 14136.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7695, over 9352.68 frames. ], batch size: 44, lr: 1.09e-02
2024-08-06 11:15:03,269 INFO [trainer.py:765] (4/8) Epoch 11, batch 400, train_loss[loss=2.65, ArTop10Accuracy=0.7958, over 10311.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7695, over 10269.57 frames. ], batch size: 14, lr: 1.09e-02
2024-08-06 11:16:29,637 INFO [trainer.py:765] (4/8) Epoch 11, batch 500, train_loss[loss=2.803, ArTop10Accuracy=0.7683, over 12186.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7709, over 10871.11 frames. ], batch size: 22, lr: 1.09e-02
2024-08-06 11:18:00,517 INFO [trainer.py:765] (4/8) Epoch 11, batch 600, train_loss[loss=2.703, ArTop10Accuracy=0.7925, over 11367.00 frames. ], tot_loss[loss=2.802, ArTop10Accuracy=0.7702, over 11379.50 frames. ], batch size: 18, lr: 1.08e-02
2024-08-06 11:19:34,514 INFO [trainer.py:765] (4/8) Epoch 11, batch 700, train_loss[loss=2.706, ArTop10Accuracy=0.7954, over 10206.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7696, over 11531.85 frames. ], batch size: 12, lr: 1.08e-02
2024-08-06 11:20:55,484 INFO [trainer.py:765] (4/8) Epoch 11, batch 800, train_loss[loss=2.785, ArTop10Accuracy=0.7746, over 10131.00 frames. ], tot_loss[loss=2.812, ArTop10Accuracy=0.7683, over 11652.45 frames. ], batch size: 12, lr: 1.07e-02
2024-08-06 11:22:13,705 INFO [trainer.py:765] (4/8) Epoch 11, batch 900, train_loss[loss=2.852, ArTop10Accuracy=0.7595, over 12939.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7692, over 11699.34 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 11:23:31,799 INFO [trainer.py:765] (4/8) Epoch 11, batch 1000, train_loss[loss=2.784, ArTop10Accuracy=0.776, over 12765.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7685, over 11909.07 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 11:24:46,902 INFO [trainer.py:765] (4/8) Epoch 11, batch 1100, train_loss[loss=2.783, ArTop10Accuracy=0.7739, over 13785.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7666, over 11994.73 frames. ], batch size: 34, lr: 1.06e-02
2024-08-06 11:26:00,733 INFO [trainer.py:765] (4/8) Epoch 11, batch 1200, train_loss[loss=2.906, ArTop10Accuracy=0.7499, over 12528.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7665, over 11900.31 frames. ], batch size: 101, lr: 1.06e-02
2024-08-06 11:26:15,847 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 11:26:25,556 INFO [trainer.py:811] (4/8) Epoch 11, validation: loss=2.831, ArTop10Accuracy=0.7643, over 1827537.00 frames. 
2024-08-06 11:26:25,557 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 11:26:26,185 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.251e+02 1.335e+02 1.441e+02 2.942e+02, threshold=2.669e+02, percent-clipped=0.1
2024-08-06 11:27:09,520 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 11:29:03,450 INFO [trainer.py:765] (4/8) Epoch 12, batch 100, train_loss[loss=2.851, ArTop10Accuracy=0.7621, over 14574.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7693, over 4761.92 frames. ], batch size: 62, lr: 1.01e-02
2024-08-06 11:30:30,674 INFO [trainer.py:765] (4/8) Epoch 12, batch 200, train_loss[loss=2.84, ArTop10Accuracy=0.7634, over 13653.00 frames. ], tot_loss[loss=2.801, ArTop10Accuracy=0.7697, over 7757.17 frames. ], batch size: 34, lr: 1.01e-02
2024-08-06 11:31:57,655 INFO [trainer.py:765] (4/8) Epoch 12, batch 300, train_loss[loss=2.84, ArTop10Accuracy=0.7657, over 14268.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.7713, over 9378.27 frames. ], batch size: 44, lr: 1.01e-02
2024-08-06 11:33:30,739 INFO [trainer.py:765] (4/8) Epoch 12, batch 400, train_loss[loss=2.648, ArTop10Accuracy=0.7979, over 10299.00 frames. ], tot_loss[loss=2.793, ArTop10Accuracy=0.7716, over 10283.00 frames. ], batch size: 14, lr: 1.00e-02
2024-08-06 11:34:55,733 INFO [trainer.py:765] (4/8) Epoch 12, batch 500, train_loss[loss=2.764, ArTop10Accuracy=0.7742, over 12129.00 frames. ], tot_loss[loss=2.79, ArTop10Accuracy=0.7722, over 10856.75 frames. ], batch size: 22, lr: 1.00e-02
2024-08-06 11:36:29,363 INFO [trainer.py:765] (4/8) Epoch 12, batch 600, train_loss[loss=2.737, ArTop10Accuracy=0.7859, over 11379.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7718, over 11376.60 frames. ], batch size: 18, lr: 9.97e-03
2024-08-06 11:38:00,343 INFO [trainer.py:765] (4/8) Epoch 12, batch 700, train_loss[loss=2.838, ArTop10Accuracy=0.7632, over 10191.00 frames. ], tot_loss[loss=2.796, ArTop10Accuracy=0.771, over 11525.72 frames. ], batch size: 12, lr: 9.93e-03
2024-08-06 11:39:23,610 INFO [trainer.py:765] (4/8) Epoch 12, batch 800, train_loss[loss=2.668, ArTop10Accuracy=0.7935, over 10128.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7706, over 11638.44 frames. ], batch size: 12, lr: 9.90e-03
2024-08-06 11:40:39,889 INFO [trainer.py:765] (4/8) Epoch 12, batch 900, train_loss[loss=2.768, ArTop10Accuracy=0.7778, over 12996.00 frames. ], tot_loss[loss=2.794, ArTop10Accuracy=0.7715, over 11692.40 frames. ], batch size: 27, lr: 9.87e-03
2024-08-06 11:41:13,995 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.041e+02 1.248e+02 1.348e+02 1.459e+02 5.540e+02, threshold=2.695e+02, percent-clipped=0.3
2024-08-06 11:41:56,188 INFO [trainer.py:765] (4/8) Epoch 12, batch 1000, train_loss[loss=2.773, ArTop10Accuracy=0.7733, over 12681.00 frames. ], tot_loss[loss=2.797, ArTop10Accuracy=0.7706, over 11884.29 frames. ], batch size: 27, lr: 9.85e-03
2024-08-06 11:43:14,321 INFO [trainer.py:765] (4/8) Epoch 12, batch 1100, train_loss[loss=2.818, ArTop10Accuracy=0.7713, over 13422.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7694, over 11954.58 frames. ], batch size: 34, lr: 9.82e-03
2024-08-06 11:44:26,156 INFO [trainer.py:765] (4/8) Epoch 12, batch 1200, train_loss[loss=2.938, ArTop10Accuracy=0.7448, over 12807.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7695, over 11862.54 frames. ], batch size: 101, lr: 9.79e-03
2024-08-06 11:45:26,431 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 11:47:26,599 INFO [trainer.py:765] (4/8) Epoch 13, batch 100, train_loss[loss=2.825, ArTop10Accuracy=0.766, over 14238.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7713, over 4764.73 frames. ], batch size: 62, lr: 9.37e-03
2024-08-06 11:48:54,778 INFO [trainer.py:765] (4/8) Epoch 13, batch 200, train_loss[loss=2.834, ArTop10Accuracy=0.7646, over 13965.00 frames. ], tot_loss[loss=2.783, ArTop10Accuracy=0.773, over 7763.78 frames. ], batch size: 35, lr: 9.34e-03
2024-08-06 11:50:20,514 INFO [trainer.py:765] (4/8) Epoch 13, batch 300, train_loss[loss=2.807, ArTop10Accuracy=0.7683, over 14352.00 frames. ], tot_loss[loss=2.779, ArTop10Accuracy=0.7743, over 9392.08 frames. ], batch size: 44, lr: 9.31e-03
2024-08-06 11:51:48,764 INFO [trainer.py:765] (4/8) Epoch 13, batch 400, train_loss[loss=2.714, ArTop10Accuracy=0.785, over 10140.00 frames. ], tot_loss[loss=2.777, ArTop10Accuracy=0.7749, over 10285.29 frames. ], batch size: 14, lr: 9.28e-03
2024-08-06 11:53:13,406 INFO [trainer.py:765] (4/8) Epoch 13, batch 500, train_loss[loss=2.727, ArTop10Accuracy=0.7851, over 12174.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.7765, over 10845.60 frames. ], batch size: 22, lr: 9.26e-03
2024-08-06 11:54:52,222 INFO [trainer.py:765] (4/8) Epoch 13, batch 600, train_loss[loss=2.715, ArTop10Accuracy=0.7804, over 11475.00 frames. ], tot_loss[loss=2.776, ArTop10Accuracy=0.775, over 11351.00 frames. ], batch size: 18, lr: 9.23e-03
2024-08-06 11:55:47,080 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 11:55:56,835 INFO [trainer.py:811] (4/8) Epoch 13, validation: loss=2.824, ArTop10Accuracy=0.7662, over 1827537.00 frames. 
2024-08-06 11:55:56,835 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 11:55:57,711 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.064e+02 1.255e+02 1.343e+02 1.452e+02 4.888e+02, threshold=2.687e+02, percent-clipped=0.1
2024-08-06 11:56:28,464 INFO [trainer.py:765] (4/8) Epoch 13, batch 700, train_loss[loss=2.75, ArTop10Accuracy=0.7831, over 10083.00 frames. ], tot_loss[loss=2.777, ArTop10Accuracy=0.7747, over 11501.17 frames. ], batch size: 12, lr: 9.20e-03
2024-08-06 11:57:46,682 INFO [trainer.py:765] (4/8) Epoch 13, batch 800, train_loss[loss=2.675, ArTop10Accuracy=0.7886, over 10248.00 frames. ], tot_loss[loss=2.779, ArTop10Accuracy=0.7744, over 11619.44 frames. ], batch size: 12, lr: 9.18e-03
2024-08-06 11:59:03,286 INFO [trainer.py:765] (4/8) Epoch 13, batch 900, train_loss[loss=2.771, ArTop10Accuracy=0.7767, over 12915.00 frames. ], tot_loss[loss=2.775, ArTop10Accuracy=0.7752, over 11675.58 frames. ], batch size: 27, lr: 9.15e-03
2024-08-06 12:00:19,173 INFO [trainer.py:765] (4/8) Epoch 13, batch 1000, train_loss[loss=2.828, ArTop10Accuracy=0.7686, over 12804.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.7734, over 11872.36 frames. ], batch size: 27, lr: 9.13e-03
2024-08-06 12:01:34,880 INFO [trainer.py:765] (4/8) Epoch 13, batch 1100, train_loss[loss=2.804, ArTop10Accuracy=0.7651, over 13485.00 frames. ], tot_loss[loss=2.79, ArTop10Accuracy=0.7723, over 11951.26 frames. ], batch size: 34, lr: 9.10e-03
2024-08-06 12:02:48,662 INFO [trainer.py:765] (4/8) Epoch 13, batch 1200, train_loss[loss=2.902, ArTop10Accuracy=0.7484, over 12114.00 frames. ], tot_loss[loss=2.79, ArTop10Accuracy=0.7723, over 11865.52 frames. ], batch size: 101, lr: 9.08e-03
2024-08-06 12:03:48,339 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 12:05:45,333 INFO [trainer.py:765] (4/8) Epoch 14, batch 100, train_loss[loss=2.835, ArTop10Accuracy=0.762, over 14472.00 frames. ], tot_loss[loss=2.776, ArTop10Accuracy=0.7746, over 4782.62 frames. ], batch size: 62, lr: 8.71e-03
2024-08-06 12:07:16,602 INFO [trainer.py:765] (4/8) Epoch 14, batch 200, train_loss[loss=2.814, ArTop10Accuracy=0.7662, over 13683.00 frames. ], tot_loss[loss=2.773, ArTop10Accuracy=0.7753, over 7786.17 frames. ], batch size: 34, lr: 8.69e-03
2024-08-06 12:08:44,310 INFO [trainer.py:765] (4/8) Epoch 14, batch 300, train_loss[loss=2.756, ArTop10Accuracy=0.7761, over 14625.00 frames. ], tot_loss[loss=2.766, ArTop10Accuracy=0.7768, over 9408.24 frames. ], batch size: 45, lr: 8.66e-03
2024-08-06 12:10:01,130 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.072e+02 1.266e+02 1.374e+02 1.483e+02 6.480e+02, threshold=2.748e+02, percent-clipped=0.2
2024-08-06 12:10:10,225 INFO [trainer.py:765] (4/8) Epoch 14, batch 400, train_loss[loss=2.663, ArTop10Accuracy=0.7999, over 10509.00 frames. ], tot_loss[loss=2.765, ArTop10Accuracy=0.7769, over 10317.67 frames. ], batch size: 14, lr: 8.64e-03
2024-08-06 12:11:36,149 INFO [trainer.py:765] (4/8) Epoch 14, batch 500, train_loss[loss=2.836, ArTop10Accuracy=0.7666, over 12150.00 frames. ], tot_loss[loss=2.764, ArTop10Accuracy=0.7771, over 10867.61 frames. ], batch size: 22, lr: 8.62e-03
2024-08-06 12:13:05,992 INFO [trainer.py:765] (4/8) Epoch 14, batch 600, train_loss[loss=2.737, ArTop10Accuracy=0.7799, over 11397.00 frames. ], tot_loss[loss=2.766, ArTop10Accuracy=0.777, over 11388.73 frames. ], batch size: 18, lr: 8.59e-03
2024-08-06 12:14:38,553 INFO [trainer.py:765] (4/8) Epoch 14, batch 700, train_loss[loss=2.761, ArTop10Accuracy=0.7818, over 9318.00 frames. ], tot_loss[loss=2.771, ArTop10Accuracy=0.7759, over 11531.31 frames. ], batch size: 11, lr: 8.57e-03
2024-08-06 12:15:58,068 INFO [trainer.py:765] (4/8) Epoch 14, batch 800, train_loss[loss=2.574, ArTop10Accuracy=0.8119, over 10068.00 frames. ], tot_loss[loss=2.774, ArTop10Accuracy=0.7752, over 11637.37 frames. ], batch size: 12, lr: 8.55e-03
2024-08-06 12:17:12,864 INFO [trainer.py:765] (4/8) Epoch 14, batch 900, train_loss[loss=2.758, ArTop10Accuracy=0.7791, over 13287.00 frames. ], tot_loss[loss=2.767, ArTop10Accuracy=0.7766, over 11696.62 frames. ], batch size: 28, lr: 8.52e-03
2024-08-06 12:18:29,613 INFO [trainer.py:765] (4/8) Epoch 14, batch 1000, train_loss[loss=2.746, ArTop10Accuracy=0.7813, over 12909.00 frames. ], tot_loss[loss=2.771, ArTop10Accuracy=0.7758, over 11892.84 frames. ], batch size: 27, lr: 8.50e-03
2024-08-06 12:19:45,375 INFO [trainer.py:765] (4/8) Epoch 14, batch 1100, train_loss[loss=2.739, ArTop10Accuracy=0.7804, over 13647.00 frames. ], tot_loss[loss=2.775, ArTop10Accuracy=0.7752, over 11926.48 frames. ], batch size: 34, lr: 8.48e-03
2024-08-06 12:20:59,277 INFO [trainer.py:765] (4/8) Epoch 14, batch 1200, train_loss[loss=2.904, ArTop10Accuracy=0.7477, over 12768.00 frames. ], tot_loss[loss=2.774, ArTop10Accuracy=0.7754, over 11863.44 frames. ], batch size: 101, lr: 8.46e-03
2024-08-06 12:21:58,313 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 12:23:51,961 INFO [trainer.py:765] (4/8) Epoch 15, batch 100, train_loss[loss=2.757, ArTop10Accuracy=0.7769, over 14058.00 frames. ], tot_loss[loss=2.763, ArTop10Accuracy=0.7767, over 4741.67 frames. ], batch size: 62, lr: 8.14e-03
2024-08-06 12:24:00,599 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 12:24:10,290 INFO [trainer.py:811] (4/8) Epoch 15, validation: loss=2.819, ArTop10Accuracy=0.7675, over 1827537.00 frames. 
2024-08-06 12:24:10,291 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 12:24:11,094 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.080e+02 1.284e+02 1.371e+02 1.488e+02 4.667e+02, threshold=2.743e+02, percent-clipped=0.2
2024-08-06 12:25:29,988 INFO [trainer.py:765] (4/8) Epoch 15, batch 200, train_loss[loss=2.727, ArTop10Accuracy=0.7861, over 13497.00 frames. ], tot_loss[loss=2.756, ArTop10Accuracy=0.7786, over 7747.87 frames. ], batch size: 34, lr: 8.12e-03
2024-08-06 12:26:58,694 INFO [trainer.py:765] (4/8) Epoch 15, batch 300, train_loss[loss=2.79, ArTop10Accuracy=0.7734, over 14127.00 frames. ], tot_loss[loss=2.755, ArTop10Accuracy=0.7789, over 9366.22 frames. ], batch size: 44, lr: 8.09e-03
2024-08-06 12:28:28,533 INFO [trainer.py:765] (4/8) Epoch 15, batch 400, train_loss[loss=2.737, ArTop10Accuracy=0.7757, over 10197.00 frames. ], tot_loss[loss=2.75, ArTop10Accuracy=0.7798, over 10275.21 frames. ], batch size: 14, lr: 8.07e-03
2024-08-06 12:29:54,032 INFO [trainer.py:765] (4/8) Epoch 15, batch 500, train_loss[loss=2.684, ArTop10Accuracy=0.7923, over 11910.00 frames. ], tot_loss[loss=2.745, ArTop10Accuracy=0.7806, over 10839.43 frames. ], batch size: 22, lr: 8.05e-03
2024-08-06 12:31:23,292 INFO [trainer.py:765] (4/8) Epoch 15, batch 600, train_loss[loss=2.711, ArTop10Accuracy=0.7829, over 11328.00 frames. ], tot_loss[loss=2.751, ArTop10Accuracy=0.7795, over 11360.31 frames. ], batch size: 18, lr: 8.03e-03
2024-08-06 12:32:53,175 INFO [trainer.py:765] (4/8) Epoch 15, batch 700, train_loss[loss=2.798, ArTop10Accuracy=0.7657, over 9354.00 frames. ], tot_loss[loss=2.755, ArTop10Accuracy=0.7787, over 11509.82 frames. ], batch size: 11, lr: 8.01e-03
2024-08-06 12:34:18,254 INFO [trainer.py:765] (4/8) Epoch 15, batch 800, train_loss[loss=2.694, ArTop10Accuracy=0.7858, over 9429.00 frames. ], tot_loss[loss=2.759, ArTop10Accuracy=0.7778, over 11617.64 frames. ], batch size: 11, lr: 7.99e-03
2024-08-06 12:35:34,726 INFO [trainer.py:765] (4/8) Epoch 15, batch 900, train_loss[loss=2.779, ArTop10Accuracy=0.7811, over 13008.00 frames. ], tot_loss[loss=2.754, ArTop10Accuracy=0.7789, over 11663.27 frames. ], batch size: 27, lr: 7.97e-03
2024-08-06 12:36:50,540 INFO [trainer.py:765] (4/8) Epoch 15, batch 1000, train_loss[loss=2.755, ArTop10Accuracy=0.7811, over 12786.00 frames. ], tot_loss[loss=2.758, ArTop10Accuracy=0.7782, over 11867.55 frames. ], batch size: 27, lr: 7.95e-03
2024-08-06 12:38:05,179 INFO [trainer.py:765] (4/8) Epoch 15, batch 1100, train_loss[loss=2.727, ArTop10Accuracy=0.781, over 13656.00 frames. ], tot_loss[loss=2.765, ArTop10Accuracy=0.7768, over 11960.61 frames. ], batch size: 34, lr: 7.93e-03
2024-08-06 12:38:12,841 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.080e+02 1.293e+02 1.379e+02 1.467e+02 2.824e+02, threshold=2.759e+02, percent-clipped=0.1
2024-08-06 12:39:18,788 INFO [trainer.py:765] (4/8) Epoch 15, batch 1200, train_loss[loss=2.875, ArTop10Accuracy=0.7581, over 12324.00 frames. ], tot_loss[loss=2.767, ArTop10Accuracy=0.7764, over 11867.43 frames. ], batch size: 101, lr: 7.91e-03
2024-08-06 12:40:18,729 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 12:42:17,617 INFO [trainer.py:765] (4/8) Epoch 16, batch 100, train_loss[loss=2.72, ArTop10Accuracy=0.7843, over 14628.00 frames. ], tot_loss[loss=2.754, ArTop10Accuracy=0.7785, over 4756.83 frames. ], batch size: 63, lr: 7.63e-03
2024-08-06 12:43:49,563 INFO [trainer.py:765] (4/8) Epoch 16, batch 200, train_loss[loss=2.772, ArTop10Accuracy=0.7808, over 13596.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7796, over 7758.14 frames. ], batch size: 34, lr: 7.61e-03
2024-08-06 12:45:18,501 INFO [trainer.py:765] (4/8) Epoch 16, batch 300, train_loss[loss=2.786, ArTop10Accuracy=0.7746, over 14376.00 frames. ], tot_loss[loss=2.742, ArTop10Accuracy=0.7808, over 9384.75 frames. ], batch size: 44, lr: 7.59e-03
2024-08-06 12:46:45,207 INFO [trainer.py:765] (4/8) Epoch 16, batch 400, train_loss[loss=2.673, ArTop10Accuracy=0.7931, over 10800.00 frames. ], tot_loss[loss=2.738, ArTop10Accuracy=0.7816, over 10273.40 frames. ], batch size: 15, lr: 7.58e-03
2024-08-06 12:48:16,309 INFO [trainer.py:765] (4/8) Epoch 16, batch 500, train_loss[loss=2.668, ArTop10Accuracy=0.7959, over 12543.00 frames. ], tot_loss[loss=2.733, ArTop10Accuracy=0.7828, over 10823.89 frames. ], batch size: 23, lr: 7.56e-03
2024-08-06 12:49:46,641 INFO [trainer.py:765] (4/8) Epoch 16, batch 600, train_loss[loss=2.696, ArTop10Accuracy=0.7945, over 11832.00 frames. ], tot_loss[loss=2.739, ArTop10Accuracy=0.7818, over 11356.77 frames. ], batch size: 19, lr: 7.54e-03
2024-08-06 12:51:23,681 INFO [trainer.py:765] (4/8) Epoch 16, batch 700, train_loss[loss=2.622, ArTop10Accuracy=0.8066, over 9279.00 frames. ], tot_loss[loss=2.742, ArTop10Accuracy=0.7812, over 11496.79 frames. ], batch size: 11, lr: 7.52e-03
2024-08-06 12:52:43,500 INFO [trainer.py:765] (4/8) Epoch 16, batch 800, train_loss[loss=2.665, ArTop10Accuracy=0.7968, over 9534.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7802, over 11622.91 frames. ], batch size: 11, lr: 7.51e-03
2024-08-06 12:53:06,015 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 12:53:15,497 INFO [trainer.py:811] (4/8) Epoch 16, validation: loss=2.816, ArTop10Accuracy=0.7678, over 1827537.00 frames. 
2024-08-06 12:53:15,497 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 12:53:16,186 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.112e+02 1.291e+02 1.391e+02 1.487e+02 3.459e+02, threshold=2.783e+02, percent-clipped=0.1
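The `optim.py:386` lines summarize the gradient norms seen between log points: the five numbers after "grad-norm quartiles" are the min/25%/50%/75%/max of recent global grad norms, the threshold is `Clipping_scale` times the median (here 2.0 x 1.391e+02 ≈ 2.783e+02), and "percent-clipped" is the fraction of steps whose norm exceeded that threshold. A minimal sketch of how such statistics could be computed from a window of per-step norms; the function and its inputs are hypothetical and not icefall's ScaledAdam implementation:

```python
import torch

def grad_norm_stats(recent_norms: torch.Tensor, clipping_scale: float = 2.0):
    """Summarize a window of recent global gradient norms, log-line style.

    recent_norms: 1-D tensor of grad norms from the last N optimizer steps
    (hypothetical input; the real optimizer keeps its own running state).
    """
    # min / 25% / 50% / 75% / max, matching the five quartile values logged above
    quartiles = torch.quantile(
        recent_norms, torch.tensor([0.0, 0.25, 0.5, 0.75, 1.0])
    )
    median = quartiles[2]
    threshold = clipping_scale * median
    percent_clipped = (recent_norms > threshold).float().mean() * 100.0
    return quartiles, threshold, percent_clipped
```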
2024-08-06 12:54:06,480 INFO [trainer.py:765] (4/8) Epoch 16, batch 900, train_loss[loss=2.758, ArTop10Accuracy=0.7755, over 12792.00 frames. ], tot_loss[loss=2.743, ArTop10Accuracy=0.7814, over 11673.14 frames. ], batch size: 27, lr: 7.49e-03
2024-08-06 12:55:19,790 INFO [trainer.py:765] (4/8) Epoch 16, batch 1000, train_loss[loss=2.729, ArTop10Accuracy=0.7823, over 12786.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7803, over 11883.80 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 12:56:33,162 INFO [trainer.py:765] (4/8) Epoch 16, batch 1100, train_loss[loss=2.841, ArTop10Accuracy=0.761, over 13731.00 frames. ], tot_loss[loss=2.755, ArTop10Accuracy=0.7788, over 11965.31 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 12:57:48,484 INFO [trainer.py:765] (4/8) Epoch 16, batch 1200, train_loss[loss=2.889, ArTop10Accuracy=0.7509, over 13242.00 frames. ], tot_loss[loss=2.758, ArTop10Accuracy=0.7784, over 11864.74 frames. ], batch size: 101, lr: 7.44e-03
2024-08-06 12:58:48,452 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 13:00:47,899 INFO [trainer.py:765] (4/8) Epoch 17, batch 100, train_loss[loss=2.808, ArTop10Accuracy=0.7735, over 14139.00 frames. ], tot_loss[loss=2.737, ArTop10Accuracy=0.782, over 4762.26 frames. ], batch size: 62, lr: 7.18e-03
2024-08-06 13:02:19,301 INFO [trainer.py:765] (4/8) Epoch 17, batch 200, train_loss[loss=2.696, ArTop10Accuracy=0.7905, over 13575.00 frames. ], tot_loss[loss=2.731, ArTop10Accuracy=0.783, over 7754.67 frames. ], batch size: 34, lr: 7.17e-03
2024-08-06 13:03:45,516 INFO [trainer.py:765] (4/8) Epoch 17, batch 300, train_loss[loss=2.778, ArTop10Accuracy=0.774, over 14085.00 frames. ], tot_loss[loss=2.728, ArTop10Accuracy=0.7836, over 9361.00 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 13:05:21,759 INFO [trainer.py:765] (4/8) Epoch 17, batch 400, train_loss[loss=2.697, ArTop10Accuracy=0.7889, over 10224.00 frames. ], tot_loss[loss=2.729, ArTop10Accuracy=0.7835, over 10286.59 frames. ], batch size: 14, lr: 7.14e-03
2024-08-06 13:06:47,020 INFO [trainer.py:765] (4/8) Epoch 17, batch 500, train_loss[loss=2.703, ArTop10Accuracy=0.7954, over 12390.00 frames. ], tot_loss[loss=2.722, ArTop10Accuracy=0.7849, over 10860.98 frames. ], batch size: 23, lr: 7.12e-03
2024-08-06 13:07:39,878 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.140e+02 1.293e+02 1.386e+02 1.488e+02 3.253e+02, threshold=2.772e+02, percent-clipped=0.1
2024-08-06 13:08:22,687 INFO [trainer.py:765] (4/8) Epoch 17, batch 600, train_loss[loss=2.644, ArTop10Accuracy=0.8019, over 11319.00 frames. ], tot_loss[loss=2.725, ArTop10Accuracy=0.7842, over 11399.75 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 13:09:54,835 INFO [trainer.py:765] (4/8) Epoch 17, batch 700, train_loss[loss=2.647, ArTop10Accuracy=0.7977, over 9441.00 frames. ], tot_loss[loss=2.732, ArTop10Accuracy=0.7829, over 11531.09 frames. ], batch size: 11, lr: 7.09e-03
2024-08-06 13:11:19,480 INFO [trainer.py:765] (4/8) Epoch 17, batch 800, train_loss[loss=2.671, ArTop10Accuracy=0.7933, over 9414.00 frames. ], tot_loss[loss=2.736, ArTop10Accuracy=0.7824, over 11649.00 frames. ], batch size: 11, lr: 7.07e-03
2024-08-06 13:12:35,669 INFO [trainer.py:765] (4/8) Epoch 17, batch 900, train_loss[loss=2.667, ArTop10Accuracy=0.7941, over 12930.00 frames. ], tot_loss[loss=2.731, ArTop10Accuracy=0.7833, over 11681.78 frames. ], batch size: 27, lr: 7.06e-03
2024-08-06 13:13:53,061 INFO [trainer.py:765] (4/8) Epoch 17, batch 1000, train_loss[loss=2.74, ArTop10Accuracy=0.7801, over 13290.00 frames. ], tot_loss[loss=2.738, ArTop10Accuracy=0.7819, over 11875.05 frames. ], batch size: 28, lr: 7.04e-03
2024-08-06 13:15:08,483 INFO [trainer.py:765] (4/8) Epoch 17, batch 1100, train_loss[loss=2.772, ArTop10Accuracy=0.7688, over 13890.00 frames. ], tot_loss[loss=2.746, ArTop10Accuracy=0.7805, over 11955.85 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 13:16:22,387 INFO [trainer.py:765] (4/8) Epoch 17, batch 1200, train_loss[loss=2.87, ArTop10Accuracy=0.7565, over 12078.00 frames. ], tot_loss[loss=2.745, ArTop10Accuracy=0.7806, over 11841.53 frames. ], batch size: 101, lr: 7.01e-03
2024-08-06 13:17:21,505 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 13:19:15,993 INFO [trainer.py:765] (4/8) Epoch 18, batch 100, train_loss[loss=2.768, ArTop10Accuracy=0.7747, over 14724.00 frames. ], tot_loss[loss=2.726, ArTop10Accuracy=0.7841, over 4762.63 frames. ], batch size: 62, lr: 6.78e-03
2024-08-06 13:20:46,601 INFO [trainer.py:765] (4/8) Epoch 18, batch 200, train_loss[loss=2.718, ArTop10Accuracy=0.7839, over 13710.00 frames. ], tot_loss[loss=2.72, ArTop10Accuracy=0.7852, over 7740.19 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 13:21:55,104 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 13:22:04,751 INFO [trainer.py:811] (4/8) Epoch 18, validation: loss=2.817, ArTop10Accuracy=0.768, over 1827537.00 frames. 
2024-08-06 13:22:04,752 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 13:22:05,473 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.131e+02 1.323e+02 1.409e+02 1.514e+02 3.209e+02, threshold=2.818e+02, percent-clipped=0.1
2024-08-06 13:22:26,580 INFO [trainer.py:765] (4/8) Epoch 18, batch 300, train_loss[loss=2.816, ArTop10Accuracy=0.7642, over 14331.00 frames. ], tot_loss[loss=2.719, ArTop10Accuracy=0.7853, over 9360.96 frames. ], batch size: 45, lr: 6.76e-03
2024-08-06 13:23:57,929 INFO [trainer.py:765] (4/8) Epoch 18, batch 400, train_loss[loss=2.63, ArTop10Accuracy=0.8021, over 10269.00 frames. ], tot_loss[loss=2.719, ArTop10Accuracy=0.7853, over 10295.59 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 13:25:34,012 INFO [trainer.py:765] (4/8) Epoch 18, batch 500, train_loss[loss=2.756, ArTop10Accuracy=0.7784, over 12132.00 frames. ], tot_loss[loss=2.718, ArTop10Accuracy=0.7855, over 10847.92 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 13:27:00,633 INFO [trainer.py:765] (4/8) Epoch 18, batch 600, train_loss[loss=2.646, ArTop10Accuracy=0.8065, over 11325.00 frames. ], tot_loss[loss=2.719, ArTop10Accuracy=0.7854, over 11377.58 frames. ], batch size: 18, lr: 6.71e-03
2024-08-06 13:28:33,581 INFO [trainer.py:765] (4/8) Epoch 18, batch 700, train_loss[loss=2.732, ArTop10Accuracy=0.7826, over 10032.00 frames. ], tot_loss[loss=2.721, ArTop10Accuracy=0.785, over 11521.88 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 13:29:54,984 INFO [trainer.py:765] (4/8) Epoch 18, batch 800, train_loss[loss=2.64, ArTop10Accuracy=0.804, over 9444.00 frames. ], tot_loss[loss=2.725, ArTop10Accuracy=0.7844, over 11624.46 frames. ], batch size: 11, lr: 6.68e-03
2024-08-06 13:31:12,518 INFO [trainer.py:765] (4/8) Epoch 18, batch 900, train_loss[loss=2.733, ArTop10Accuracy=0.7867, over 13254.00 frames. ], tot_loss[loss=2.722, ArTop10Accuracy=0.7851, over 11690.93 frames. ], batch size: 28, lr: 6.67e-03
2024-08-06 13:32:26,550 INFO [trainer.py:765] (4/8) Epoch 18, batch 1000, train_loss[loss=2.754, ArTop10Accuracy=0.7792, over 12873.00 frames. ], tot_loss[loss=2.729, ArTop10Accuracy=0.7838, over 11892.74 frames. ], batch size: 27, lr: 6.66e-03
2024-08-06 13:33:41,496 INFO [trainer.py:765] (4/8) Epoch 18, batch 1100, train_loss[loss=2.731, ArTop10Accuracy=0.7878, over 13854.00 frames. ], tot_loss[loss=2.734, ArTop10Accuracy=0.7828, over 11966.56 frames. ], batch size: 34, lr: 6.64e-03
2024-08-06 13:34:54,673 INFO [trainer.py:765] (4/8) Epoch 18, batch 1200, train_loss[loss=2.876, ArTop10Accuracy=0.755, over 11688.00 frames. ], tot_loss[loss=2.733, ArTop10Accuracy=0.7828, over 11879.62 frames. ], batch size: 103, lr: 6.63e-03
2024-08-06 13:35:51,064 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.124e+02 1.340e+02 1.433e+02 1.533e+02 2.444e+02, threshold=2.867e+02, percent-clipped=0.0
2024-08-06 13:35:54,218 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 13:37:48,623 INFO [trainer.py:765] (4/8) Epoch 19, batch 100, train_loss[loss=2.786, ArTop10Accuracy=0.773, over 14562.00 frames. ], tot_loss[loss=2.709, ArTop10Accuracy=0.7871, over 4763.34 frames. ], batch size: 62, lr: 6.43e-03
2024-08-06 13:39:23,256 INFO [trainer.py:765] (4/8) Epoch 19, batch 200, train_loss[loss=2.706, ArTop10Accuracy=0.782, over 13527.00 frames. ], tot_loss[loss=2.711, ArTop10Accuracy=0.7867, over 7744.85 frames. ], batch size: 34, lr: 6.41e-03
2024-08-06 13:40:48,358 INFO [trainer.py:765] (4/8) Epoch 19, batch 300, train_loss[loss=2.735, ArTop10Accuracy=0.7868, over 14472.00 frames. ], tot_loss[loss=2.71, ArTop10Accuracy=0.7871, over 9377.25 frames. ], batch size: 46, lr: 6.40e-03
2024-08-06 13:42:21,067 INFO [trainer.py:765] (4/8) Epoch 19, batch 400, train_loss[loss=2.586, ArTop10Accuracy=0.8117, over 10197.00 frames. ], tot_loss[loss=2.703, ArTop10Accuracy=0.7883, over 10290.26 frames. ], batch size: 14, lr: 6.39e-03
2024-08-06 13:43:44,954 INFO [trainer.py:765] (4/8) Epoch 19, batch 500, train_loss[loss=2.667, ArTop10Accuracy=0.7974, over 12102.00 frames. ], tot_loss[loss=2.697, ArTop10Accuracy=0.7896, over 10853.44 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 13:45:16,681 INFO [trainer.py:765] (4/8) Epoch 19, batch 600, train_loss[loss=2.625, ArTop10Accuracy=0.8068, over 11361.00 frames. ], tot_loss[loss=2.703, ArTop10Accuracy=0.7886, over 11367.34 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 13:46:48,324 INFO [trainer.py:765] (4/8) Epoch 19, batch 700, train_loss[loss=2.687, ArTop10Accuracy=0.7831, over 10386.00 frames. ], tot_loss[loss=2.713, ArTop10Accuracy=0.7867, over 11508.06 frames. ], batch size: 12, lr: 6.35e-03
2024-08-06 13:48:11,883 INFO [trainer.py:765] (4/8) Epoch 19, batch 800, train_loss[loss=2.696, ArTop10Accuracy=0.7907, over 10185.00 frames. ], tot_loss[loss=2.717, ArTop10Accuracy=0.7858, over 11635.75 frames. ], batch size: 12, lr: 6.34e-03
2024-08-06 13:49:27,258 INFO [trainer.py:765] (4/8) Epoch 19, batch 900, train_loss[loss=2.68, ArTop10Accuracy=0.7937, over 12957.00 frames. ], tot_loss[loss=2.711, ArTop10Accuracy=0.7868, over 11686.07 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 13:50:40,653 INFO [trainer.py:803] (4/8) Computing validation loss
2024-08-06 13:50:50,537 INFO [trainer.py:811] (4/8) Epoch 19, validation: loss=2.818, ArTop10Accuracy=0.7679, over 1827537.00 frames. 
2024-08-06 13:50:50,537 INFO [trainer.py:814] (4/8) Maximum memory allocated so far is 32729MB
2024-08-06 13:50:51,489 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.161e+02 1.371e+02 1.455e+02 1.550e+02 3.697e+02, threshold=2.909e+02, percent-clipped=0.2
2024-08-06 13:50:52,915 INFO [trainer.py:765] (4/8) Epoch 19, batch 1000, train_loss[loss=2.761, ArTop10Accuracy=0.7747, over 12699.00 frames. ], tot_loss[loss=2.72, ArTop10Accuracy=0.7853, over 11884.10 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 13:52:08,265 INFO [trainer.py:765] (4/8) Epoch 19, batch 1100, train_loss[loss=2.701, ArTop10Accuracy=0.7904, over 13695.00 frames. ], tot_loss[loss=2.724, ArTop10Accuracy=0.7845, over 11953.71 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 13:53:22,313 INFO [trainer.py:765] (4/8) Epoch 19, batch 1200, train_loss[loss=2.831, ArTop10Accuracy=0.7577, over 12249.00 frames. ], tot_loss[loss=2.726, ArTop10Accuracy=0.7842, over 11861.23 frames. ], batch size: 101, lr: 6.28e-03
2024-08-06 13:54:21,708 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
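The lr column decays smoothly both within an epoch (8.12e-03 down to 7.91e-03 across epoch 15) and across epochs (to roughly 6e-03 by epoch 20), consistent with the Eden scheduler named in the run configuration. A rough sketch of an Eden-style decay factor, written from memory rather than taken from the icefall source; the `lr_batches`/`lr_epochs` constants are assumptions, not values read from this log, and the initial warmup over `warmup_steps` is omitted:

```python
def eden_lr(base_lr: float, step: int, epoch: int,
            lr_batches: float = 5000.0, lr_epochs: float = 6.0) -> float:
    """Eden-style learning rate (sketch): decays with both step and epoch."""
    batch_factor = ((step ** 2 + lr_batches ** 2) / lr_batches ** 2) ** -0.25
    epoch_factor = ((epoch ** 2 + lr_epochs ** 2) / lr_epochs ** 2) ** -0.25
    return base_lr * batch_factor * epoch_factor
```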
2024-08-06 13:56:12,907 INFO [trainer.py:765] (4/8) Epoch 20, batch 100, train_loss[loss=2.789, ArTop10Accuracy=0.7679, over 14760.00 frames. ], tot_loss[loss=2.713, ArTop10Accuracy=0.7857, over 4756.56 frames. ], batch size: 62, lr: 6.10e-03
2024-08-06 13:57:42,497 INFO [trainer.py:765] (4/8) Epoch 20, batch 200, train_loss[loss=2.639, ArTop10Accuracy=0.8007, over 13737.00 frames. ], tot_loss[loss=2.705, ArTop10Accuracy=0.7879, over 7746.44 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 13:59:15,430 INFO [trainer.py:765] (4/8) Epoch 20, batch 300, train_loss[loss=2.762, ArTop10Accuracy=0.7798, over 14253.00 frames. ], tot_loss[loss=2.699, ArTop10Accuracy=0.789, over 9366.73 frames. ], batch size: 45, lr: 6.08e-03
2024-08-06 14:00:44,356 INFO [trainer.py:765] (4/8) Epoch 20, batch 400, train_loss[loss=2.555, ArTop10Accuracy=0.8139, over 10905.00 frames. ], tot_loss[loss=2.696, ArTop10Accuracy=0.7895, over 10302.20 frames. ], batch size: 15, lr: 6.07e-03
2024-08-06 14:02:14,855 INFO [trainer.py:765] (4/8) Epoch 20, batch 500, train_loss[loss=2.66, ArTop10Accuracy=0.7958, over 12114.00 frames. ], tot_loss[loss=2.692, ArTop10Accuracy=0.7904, over 10858.12 frames. ], batch size: 22, lr: 6.06e-03
2024-08-06 14:03:40,856 INFO [trainer.py:765] (4/8) Epoch 20, batch 600, train_loss[loss=2.597, ArTop10Accuracy=0.8091, over 11571.00 frames. ], tot_loss[loss=2.695, ArTop10Accuracy=0.7899, over 11385.90 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 14:05:13,864 INFO [trainer.py:765] (4/8) Epoch 20, batch 700, train_loss[loss=2.717, ArTop10Accuracy=0.7839, over 9984.00 frames. ], tot_loss[loss=2.699, ArTop10Accuracy=0.7892, over 11521.50 frames. ], batch size: 12, lr: 6.03e-03
2024-08-06 14:05:30,791 INFO [optim.py:386] (4/8) Clipping_scale=2.0, grad-norm quartiles 1.180e+02 1.365e+02 1.456e+02 1.550e+02 3.525e+02, threshold=2.913e+02, percent-clipped=0.1
2024-08-06 14:06:34,509 INFO [trainer.py:765] (4/8) Epoch 20, batch 800, train_loss[loss=2.721, ArTop10Accuracy=0.7837, over 10083.00 frames. ], tot_loss[loss=2.705, ArTop10Accuracy=0.7881, over 11637.46 frames. ], batch size: 12, lr: 6.02e-03
2024-08-06 14:07:50,944 INFO [trainer.py:765] (4/8) Epoch 20, batch 900, train_loss[loss=2.635, ArTop10Accuracy=0.8005, over 12861.00 frames. ], tot_loss[loss=2.704, ArTop10Accuracy=0.7881, over 11700.37 frames. ], batch size: 27, lr: 6.01e-03
2024-08-06 14:09:07,173 INFO [trainer.py:765] (4/8) Epoch 20, batch 1000, train_loss[loss=2.693, ArTop10Accuracy=0.7967, over 12675.00 frames. ], tot_loss[loss=2.708, ArTop10Accuracy=0.7876, over 11883.15 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 14:10:21,210 INFO [trainer.py:765] (4/8) Epoch 20, batch 1100, train_loss[loss=2.709, ArTop10Accuracy=0.7851, over 13629.00 frames. ], tot_loss[loss=2.714, ArTop10Accuracy=0.7864, over 11931.12 frames. ], batch size: 34, lr: 5.99e-03
2024-08-06 14:11:37,813 INFO [trainer.py:765] (4/8) Epoch 20, batch 1200, train_loss[loss=2.855, ArTop10Accuracy=0.7594, over 11973.00 frames. ], tot_loss[loss=2.714, ArTop10Accuracy=0.7863, over 11830.00 frames. ], batch size: 105, lr: 5.98e-03
2024-08-06 14:12:37,299 INFO [trainer.py:650] (4/8) Reaches end of dataloader.
2024-08-06 14:12:37,301 INFO [trainer.py:1069] (4/8) Done!
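With training finished after epoch 20, the `tot_loss` and validation lines above can be scraped to plot loss and ArTop10Accuracy curves. A minimal parsing sketch tied to the exact line format seen in this log (it would need adjusting if the trainer's format changes):

```python
import re

# Patterns match the trainer.py log lines shown above.
TRAIN_RE = re.compile(
    r"Epoch (\d+), batch (\d+).*?tot_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+)"
)
VALID_RE = re.compile(
    r"Epoch (\d+), validation: loss=([\d.]+), ArTop10Accuracy=([\d.]+)"
)

def parse_log(path: str):
    """Return (train, valid) records extracted from a log file like this one."""
    train, valid = [], []
    with open(path) as f:
        for line in f:
            m = TRAIN_RE.search(line)
            if m:
                epoch, batch, loss, acc = m.groups()
                train.append((int(epoch), int(batch), float(loss), float(acc)))
                continue
            m = VALID_RE.search(line)
            if m:
                epoch, loss, acc = m.groups()
                valid.append((int(epoch), float(loss), float(acc)))
    return train, valid
```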