2024-08-06 03:39:40,318 INFO [trainer.py:870] (2/8) Training started
2024-08-06 03:39:40,319 INFO [trainer.py:889] (2/8) Device: cuda:2
2024-08-06 03:39:40,319 INFO [trainer.py:890] (2/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': 'main', 'icefall-git-sha1': '7d2e5f4-dirty', 'icefall-git-date': 'Tue Aug 6 02:59:12 2024', 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6865771', 'IP address': '0.104.195.107'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 1000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 03:39:40,319 INFO [trainer.py:892] (2/8) About to create model
2024-08-06 03:39:41,079 INFO [trainer.py:899] (2/8) Number of model parameters: 367386628
2024-08-06 03:39:41,905 INFO [trainer.py:914] (2/8) Using DDP
2024-08-06 03:39:43,993 INFO [datamodule.py:427] (2/8) About to get train cuts
2024-08-06 03:39:43,995 INFO [datamodule.py:434] (2/8) About to get dev cuts
2024-08-06 03:39:43,997 INFO [datamodule.py:292] (2/8) Disable SpecAugment
2024-08-06 03:39:43,997 INFO [datamodule.py:294] (2/8) About to create train dataset
2024-08-06 03:39:43,998 INFO [datamodule.py:323] (2/8) Using DynamicBucketingSampler
2024-08-06 03:39:44,608 INFO [datamodule.py:344] (2/8) About to create train dataloader
2024-08-06 03:39:44,608 INFO [datamodule.py:367] (2/8) About to create dev dataset
2024-08-06 03:39:44,934 INFO [datamodule.py:388] (2/8) About to create dev dataloader
2024-08-06 03:40:39,570 INFO [trainer.py:765] (2/8) Epoch 1, batch 100, train_loss[loss=4.238, ArTop10Accuracy=0.4902, over 14599.00 frames. ], tot_loss[loss=4.784, ArTop10Accuracy=0.3953, over 4788.28 frames. ], batch size: 61, lr: 2.25e-02
2024-08-06 03:41:16,922 INFO [trainer.py:765] (2/8) Epoch 1, batch 200, train_loss[loss=3.97, ArTop10Accuracy=0.5311, over 13617.00 frames. ], tot_loss[loss=4.305, ArTop10Accuracy=0.4754, over 7786.16 frames. ], batch size: 34, lr: 3.00e-02
2024-08-06 03:41:57,950 INFO [trainer.py:765] (2/8) Epoch 1, batch 300, train_loss[loss=3.828, ArTop10Accuracy=0.5518, over 14061.00 frames. ], tot_loss[loss=4.093, ArTop10Accuracy=0.5099, over 9425.13 frames. ], batch size: 44, lr: 3.00e-02
2024-08-06 03:42:33,080 INFO [trainer.py:765] (2/8) Epoch 1, batch 400, train_loss[loss=3.717, ArTop10Accuracy=0.5717, over 11056.00 frames. ], tot_loss[loss=3.942, ArTop10Accuracy=0.5348, over 10340.47 frames. ], batch size: 15, lr: 3.00e-02
2024-08-06 03:43:11,271 INFO [trainer.py:765] (2/8) Epoch 1, batch 500, train_loss[loss=3.542, ArTop10Accuracy=0.6033, over 12246.00 frames. ], tot_loss[loss=3.831, ArTop10Accuracy=0.5535, over 10902.49 frames. ], batch size: 22, lr: 2.99e-02
2024-08-06 03:43:46,593 INFO [trainer.py:765] (2/8) Epoch 1, batch 600, train_loss[loss=3.484, ArTop10Accuracy=0.6182, over 11327.00 frames. ], tot_loss[loss=3.746, ArTop10Accuracy=0.568, over 11421.62 frames. ], batch size: 18, lr: 2.99e-02
2024-08-06 03:44:27,899 INFO [trainer.py:765] (2/8) Epoch 1, batch 700, train_loss[loss=3.646, ArTop10Accuracy=0.5823, over 10104.00 frames. ], tot_loss[loss=3.685, ArTop10Accuracy=0.579, over 11567.62 frames. ], batch size: 12, lr: 2.99e-02
2024-08-06 03:45:01,514 INFO [trainer.py:765] (2/8) Epoch 1, batch 800, train_loss[loss=3.438, ArTop10Accuracy=0.6215, over 10209.00 frames. ], tot_loss[loss=3.637, ArTop10Accuracy=0.5874, over 11679.20 frames. ], batch size: 12, lr: 2.98e-02
2024-08-06 03:45:32,558 INFO [trainer.py:765] (2/8) Epoch 1, batch 900, train_loss[loss=3.594, ArTop10Accuracy=0.5968, over 12892.00 frames. ], tot_loss[loss=3.583, ArTop10Accuracy=0.5973, over 11727.14 frames. ], batch size: 27, lr: 2.98e-02
2024-08-06 03:46:03,649 INFO [trainer.py:765] (2/8) Epoch 1, batch 1000, train_loss[loss=3.459, ArTop10Accuracy=0.62, over 12809.00 frames. ], tot_loss[loss=3.55, ArTop10Accuracy=0.6035, over 11929.18 frames. ], batch size: 27, lr: 2.97e-02
2024-08-06 03:46:07,988 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 8.169e+01 1.565e+02 2.239e+02 3.485e+02 9.105e+03, threshold=4.478e+02, percent-clipped=0.0
2024-08-06 03:46:38,612 INFO [trainer.py:765] (2/8) Epoch 1, batch 1100, train_loss[loss=3.54, ArTop10Accuracy=0.6066, over 13569.00 frames. ], tot_loss[loss=3.526, ArTop10Accuracy=0.6082, over 11994.07 frames. ], batch size: 34, lr: 2.96e-02
2024-08-06 03:47:08,745 INFO [trainer.py:765] (2/8) Epoch 1, batch 1200, train_loss[loss=3.542, ArTop10Accuracy=0.606, over 11052.00 frames. ], tot_loss[loss=3.496, ArTop10Accuracy=0.6136, over 11929.77 frames. ], batch size: 99, lr: 2.96e-02
2024-08-06 03:47:33,759 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 03:48:38,677 INFO [trainer.py:765] (2/8) Epoch 2, batch 100, train_loss[loss=3.423, ArTop10Accuracy=0.6282, over 14455.00 frames. ], tot_loss[loss=3.441, ArTop10Accuracy=0.6243, over 4794.60 frames. ], batch size: 61, lr: 2.90e-02
2024-08-06 03:49:14,597 INFO [trainer.py:765] (2/8) Epoch 2, batch 200, train_loss[loss=3.527, ArTop10Accuracy=0.6072, over 13942.00 frames. ], tot_loss[loss=3.438, ArTop10Accuracy=0.6251, over 7807.13 frames. ], batch size: 34, lr: 2.89e-02
2024-08-06 03:49:56,520 INFO [trainer.py:765] (2/8) Epoch 2, batch 300, train_loss[loss=3.462, ArTop10Accuracy=0.6197, over 14367.00 frames. ], tot_loss[loss=3.426, ArTop10Accuracy=0.6271, over 9433.56 frames. ], batch size: 44, lr: 2.89e-02
2024-08-06 03:50:32,000 INFO [trainer.py:765] (2/8) Epoch 2, batch 400, train_loss[loss=3.259, ArTop10Accuracy=0.6613, over 10127.00 frames. ], tot_loss[loss=3.414, ArTop10Accuracy=0.6297, over 10340.39 frames. ], batch size: 14, lr: 2.88e-02
2024-08-06 03:51:17,110 INFO [trainer.py:765] (2/8) Epoch 2, batch 500, train_loss[loss=3.289, ArTop10Accuracy=0.6452, over 12078.00 frames. ], tot_loss[loss=3.407, ArTop10Accuracy=0.631, over 10899.71 frames. ], batch size: 22, lr: 2.87e-02
2024-08-06 03:51:53,206 INFO [trainer.py:765] (2/8) Epoch 2, batch 600, train_loss[loss=3.37, ArTop10Accuracy=0.6379, over 11569.00 frames. ], tot_loss[loss=3.397, ArTop10Accuracy=0.633, over 11439.27 frames. ], batch size: 18, lr: 2.86e-02
2024-08-06 03:52:38,994 INFO [trainer.py:765] (2/8) Epoch 2, batch 700, train_loss[loss=3.362, ArTop10Accuracy=0.6345, over 10167.00 frames. ], tot_loss[loss=3.396, ArTop10Accuracy=0.6331, over 11591.15 frames. ], batch size: 12, lr: 2.85e-02
2024-08-06 03:52:47,092 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 03:52:56,023 INFO [trainer.py:811] (2/8) Epoch 2, validation: loss=3.327, ArTop10Accuracy=0.6492, over 1829298.00 frames. 
2024-08-06 03:52:56,024 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 28804MB
2024-08-06 03:52:56,541 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 8.181e+01 1.431e+02 1.849e+02 2.730e+02 2.344e+03, threshold=3.697e+02, percent-clipped=7.2
2024-08-06 03:53:21,881 INFO [trainer.py:765] (2/8) Epoch 2, batch 800, train_loss[loss=3.249, ArTop10Accuracy=0.6537, over 10208.00 frames. ], tot_loss[loss=3.386, ArTop10Accuracy=0.6347, over 11706.94 frames. ], batch size: 12, lr: 2.84e-02
2024-08-06 03:53:53,299 INFO [trainer.py:765] (2/8) Epoch 2, batch 900, train_loss[loss=3.354, ArTop10Accuracy=0.6334, over 12977.00 frames. ], tot_loss[loss=3.37, ArTop10Accuracy=0.638, over 11741.79 frames. ], batch size: 27, lr: 2.83e-02
2024-08-06 03:54:24,809 INFO [trainer.py:765] (2/8) Epoch 2, batch 1000, train_loss[loss=3.322, ArTop10Accuracy=0.6474, over 13032.00 frames. ], tot_loss[loss=3.363, ArTop10Accuracy=0.6394, over 11951.43 frames. ], batch size: 27, lr: 2.82e-02
2024-08-06 03:54:56,006 INFO [trainer.py:765] (2/8) Epoch 2, batch 1100, train_loss[loss=3.301, ArTop10Accuracy=0.6477, over 13736.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.6397, over 11995.67 frames. ], batch size: 34, lr: 2.81e-02
2024-08-06 03:55:26,228 INFO [trainer.py:765] (2/8) Epoch 2, batch 1200, train_loss[loss=3.427, ArTop10Accuracy=0.632, over 12590.00 frames. ], tot_loss[loss=3.354, ArTop10Accuracy=0.6413, over 11940.59 frames. ], batch size: 99, lr: 2.80e-02
2024-08-06 03:55:51,132 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 03:57:04,102 INFO [trainer.py:765] (2/8) Epoch 3, batch 100, train_loss[loss=3.315, ArTop10Accuracy=0.6471, over 14358.00 frames. ], tot_loss[loss=3.315, ArTop10Accuracy=0.6487, over 4780.00 frames. ], batch size: 61, lr: 2.67e-02
2024-08-06 03:57:50,979 INFO [trainer.py:765] (2/8) Epoch 3, batch 200, train_loss[loss=3.234, ArTop10Accuracy=0.666, over 13654.00 frames. ], tot_loss[loss=3.293, ArTop10Accuracy=0.6534, over 7796.59 frames. ], batch size: 34, lr: 2.66e-02
2024-08-06 03:58:26,074 INFO [trainer.py:765] (2/8) Epoch 3, batch 300, train_loss[loss=3.247, ArTop10Accuracy=0.6639, over 14242.00 frames. ], tot_loss[loss=3.279, ArTop10Accuracy=0.6563, over 9417.80 frames. ], batch size: 44, lr: 2.64e-02
2024-08-06 03:59:11,253 INFO [trainer.py:765] (2/8) Epoch 3, batch 400, train_loss[loss=3.091, ArTop10Accuracy=0.6961, over 10417.00 frames. ], tot_loss[loss=3.26, ArTop10Accuracy=0.6596, over 10337.38 frames. ], batch size: 14, lr: 2.63e-02
2024-08-06 03:59:29,674 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 8.720e+01 1.461e+02 1.775e+02 2.344e+02 9.150e+02, threshold=3.550e+02, percent-clipped=5.2
2024-08-06 03:59:49,303 INFO [trainer.py:765] (2/8) Epoch 3, batch 500, train_loss[loss=3.186, ArTop10Accuracy=0.6673, over 12323.00 frames. ], tot_loss[loss=3.249, ArTop10Accuracy=0.6617, over 10910.34 frames. ], batch size: 22, lr: 2.62e-02
2024-08-06 04:00:35,095 INFO [trainer.py:765] (2/8) Epoch 3, batch 600, train_loss[loss=3.144, ArTop10Accuracy=0.6824, over 11789.00 frames. ], tot_loss[loss=3.23, ArTop10Accuracy=0.6655, over 11415.53 frames. ], batch size: 18, lr: 2.61e-02
2024-08-06 04:01:22,060 INFO [trainer.py:765] (2/8) Epoch 3, batch 700, train_loss[loss=3.153, ArTop10Accuracy=0.6802, over 10131.00 frames. ], tot_loss[loss=3.226, ArTop10Accuracy=0.6665, over 11576.97 frames. ], batch size: 12, lr: 2.60e-02
2024-08-06 04:01:56,270 INFO [trainer.py:765] (2/8) Epoch 3, batch 800, train_loss[loss=2.947, ArTop10Accuracy=0.7092, over 10105.00 frames. ], tot_loss[loss=3.216, ArTop10Accuracy=0.6688, over 11691.18 frames. ], batch size: 12, lr: 2.59e-02
2024-08-06 04:02:27,741 INFO [trainer.py:765] (2/8) Epoch 3, batch 900, train_loss[loss=3.278, ArTop10Accuracy=0.6582, over 12969.00 frames. ], tot_loss[loss=3.199, ArTop10Accuracy=0.6721, over 11735.44 frames. ], batch size: 27, lr: 2.57e-02
2024-08-06 04:02:59,284 INFO [trainer.py:765] (2/8) Epoch 3, batch 1000, train_loss[loss=3.106, ArTop10Accuracy=0.6993, over 13160.00 frames. ], tot_loss[loss=3.195, ArTop10Accuracy=0.6727, over 11942.81 frames. ], batch size: 27, lr: 2.56e-02
2024-08-06 04:03:30,942 INFO [trainer.py:765] (2/8) Epoch 3, batch 1100, train_loss[loss=3.136, ArTop10Accuracy=0.6876, over 13802.00 frames. ], tot_loss[loss=3.193, ArTop10Accuracy=0.6733, over 11997.13 frames. ], batch size: 34, lr: 2.55e-02
2024-08-06 04:04:01,313 INFO [trainer.py:765] (2/8) Epoch 3, batch 1200, train_loss[loss=3.282, ArTop10Accuracy=0.6577, over 12318.00 frames. ], tot_loss[loss=3.183, ArTop10Accuracy=0.6751, over 11945.69 frames. ], batch size: 99, lr: 2.54e-02
2024-08-06 04:04:26,694 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:05:43,369 INFO [trainer.py:765] (2/8) Epoch 4, batch 100, train_loss[loss=3.177, ArTop10Accuracy=0.6706, over 14302.00 frames. ], tot_loss[loss=3.14, ArTop10Accuracy=0.6842, over 4789.47 frames. ], batch size: 61, lr: 2.38e-02
2024-08-06 04:06:07,077 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 04:06:16,404 INFO [trainer.py:811] (2/8) Epoch 4, validation: loss=3.063, ArTop10Accuracy=0.7031, over 1829298.00 frames. 
2024-08-06 04:06:16,404 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 04:06:16,746 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.091e+02 1.493e+02 1.709e+02 2.068e+02 7.969e+02, threshold=3.418e+02, percent-clipped=2.9
2024-08-06 04:06:31,826 INFO [trainer.py:765] (2/8) Epoch 4, batch 200, train_loss[loss=3.072, ArTop10Accuracy=0.7041, over 13627.00 frames. ], tot_loss[loss=3.122, ArTop10Accuracy=0.6886, over 7782.21 frames. ], batch size: 34, lr: 2.37e-02
2024-08-06 04:07:18,544 INFO [trainer.py:765] (2/8) Epoch 4, batch 300, train_loss[loss=3.159, ArTop10Accuracy=0.6799, over 14517.00 frames. ], tot_loss[loss=3.116, ArTop10Accuracy=0.6894, over 9421.67 frames. ], batch size: 44, lr: 2.36e-02
2024-08-06 04:08:01,910 INFO [trainer.py:765] (2/8) Epoch 4, batch 400, train_loss[loss=3.091, ArTop10Accuracy=0.6954, over 10874.00 frames. ], tot_loss[loss=3.109, ArTop10Accuracy=0.69, over 10347.99 frames. ], batch size: 15, lr: 2.34e-02
2024-08-06 04:08:45,344 INFO [trainer.py:765] (2/8) Epoch 4, batch 500, train_loss[loss=3.166, ArTop10Accuracy=0.6733, over 12378.00 frames. ], tot_loss[loss=3.105, ArTop10Accuracy=0.6905, over 10907.40 frames. ], batch size: 22, lr: 2.33e-02
2024-08-06 04:09:37,071 INFO [trainer.py:765] (2/8) Epoch 4, batch 600, train_loss[loss=3.147, ArTop10Accuracy=0.6757, over 11511.00 frames. ], tot_loss[loss=3.11, ArTop10Accuracy=0.6894, over 11418.96 frames. ], batch size: 18, lr: 2.32e-02
2024-08-06 04:10:13,501 INFO [trainer.py:765] (2/8) Epoch 4, batch 700, train_loss[loss=3.185, ArTop10Accuracy=0.6861, over 10217.00 frames. ], tot_loss[loss=3.117, ArTop10Accuracy=0.6885, over 11566.09 frames. ], batch size: 12, lr: 2.31e-02
2024-08-06 04:10:51,959 INFO [trainer.py:765] (2/8) Epoch 4, batch 800, train_loss[loss=3.132, ArTop10Accuracy=0.689, over 10153.00 frames. ], tot_loss[loss=3.116, ArTop10Accuracy=0.6886, over 11678.64 frames. ], batch size: 12, lr: 2.30e-02
2024-08-06 04:11:23,331 INFO [trainer.py:765] (2/8) Epoch 4, batch 900, train_loss[loss=3.056, ArTop10Accuracy=0.701, over 13147.00 frames. ], tot_loss[loss=3.106, ArTop10Accuracy=0.6906, over 11728.67 frames. ], batch size: 28, lr: 2.29e-02
2024-08-06 04:11:54,826 INFO [trainer.py:765] (2/8) Epoch 4, batch 1000, train_loss[loss=2.987, ArTop10Accuracy=0.7195, over 12903.00 frames. ], tot_loss[loss=3.102, ArTop10Accuracy=0.691, over 11945.47 frames. ], batch size: 27, lr: 2.28e-02
2024-08-06 04:12:25,959 INFO [trainer.py:765] (2/8) Epoch 4, batch 1100, train_loss[loss=3.082, ArTop10Accuracy=0.693, over 13680.00 frames. ], tot_loss[loss=3.107, ArTop10Accuracy=0.6902, over 11976.90 frames. ], batch size: 34, lr: 2.26e-02
2024-08-06 04:12:48,545 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.106e+02 1.440e+02 1.608e+02 1.893e+02 7.925e+02, threshold=3.216e+02, percent-clipped=2.0
2024-08-06 04:12:58,827 INFO [trainer.py:765] (2/8) Epoch 4, batch 1200, train_loss[loss=3.199, ArTop10Accuracy=0.675, over 11681.00 frames. ], tot_loss[loss=3.103, ArTop10Accuracy=0.6913, over 11926.91 frames. ], batch size: 98, lr: 2.25e-02
2024-08-06 04:13:24,340 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:14:38,685 INFO [trainer.py:765] (2/8) Epoch 5, batch 100, train_loss[loss=3.067, ArTop10Accuracy=0.6987, over 14655.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7007, over 4800.60 frames. ], batch size: 61, lr: 2.10e-02
2024-08-06 04:15:26,826 INFO [trainer.py:765] (2/8) Epoch 5, batch 200, train_loss[loss=3.074, ArTop10Accuracy=0.7022, over 13615.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7018, over 7801.70 frames. ], batch size: 34, lr: 2.09e-02
2024-08-06 04:16:08,011 INFO [trainer.py:765] (2/8) Epoch 5, batch 300, train_loss[loss=3.081, ArTop10Accuracy=0.6943, over 14161.00 frames. ], tot_loss[loss=3.047, ArTop10Accuracy=0.7033, over 9412.98 frames. ], batch size: 44, lr: 2.08e-02
2024-08-06 04:16:53,134 INFO [trainer.py:765] (2/8) Epoch 5, batch 400, train_loss[loss=3.186, ArTop10Accuracy=0.6745, over 10281.00 frames. ], tot_loss[loss=3.051, ArTop10Accuracy=0.7019, over 10329.99 frames. ], batch size: 14, lr: 2.07e-02
2024-08-06 04:17:36,638 INFO [trainer.py:765] (2/8) Epoch 5, batch 500, train_loss[loss=3.071, ArTop10Accuracy=0.6972, over 12362.00 frames. ], tot_loss[loss=3.048, ArTop10Accuracy=0.7025, over 10896.69 frames. ], batch size: 22, lr: 2.06e-02
2024-08-06 04:18:22,114 INFO [trainer.py:765] (2/8) Epoch 5, batch 600, train_loss[loss=3.095, ArTop10Accuracy=0.6862, over 11747.00 frames. ], tot_loss[loss=3.048, ArTop10Accuracy=0.7022, over 11433.37 frames. ], batch size: 18, lr: 2.05e-02
2024-08-06 04:19:17,033 INFO [trainer.py:765] (2/8) Epoch 5, batch 700, train_loss[loss=2.958, ArTop10Accuracy=0.7146, over 10286.00 frames. ], tot_loss[loss=3.058, ArTop10Accuracy=0.7003, over 11570.08 frames. ], batch size: 12, lr: 2.04e-02
2024-08-06 04:19:51,066 INFO [trainer.py:765] (2/8) Epoch 5, batch 800, train_loss[loss=3.108, ArTop10Accuracy=0.6899, over 10094.00 frames. ], tot_loss[loss=3.062, ArTop10Accuracy=0.6995, over 11685.00 frames. ], batch size: 12, lr: 2.03e-02
2024-08-06 04:20:18,214 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 04:20:27,476 INFO [trainer.py:811] (2/8) Epoch 5, validation: loss=2.998, ArTop10Accuracy=0.7157, over 1829298.00 frames. 
2024-08-06 04:20:27,476 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 04:20:27,781 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.057e+02 1.385e+02 1.542e+02 1.759e+02 7.741e+02, threshold=3.083e+02, percent-clipped=0.7
2024-08-06 04:20:31,767 INFO [trainer.py:765] (2/8) Epoch 5, batch 900, train_loss[loss=3.109, ArTop10Accuracy=0.6952, over 12959.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.7008, over 11728.45 frames. ], batch size: 27, lr: 2.02e-02
2024-08-06 04:21:03,306 INFO [trainer.py:765] (2/8) Epoch 5, batch 1000, train_loss[loss=3.045, ArTop10Accuracy=0.7074, over 12927.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.7014, over 11930.08 frames. ], batch size: 27, lr: 2.01e-02
2024-08-06 04:21:34,451 INFO [trainer.py:765] (2/8) Epoch 5, batch 1100, train_loss[loss=3.074, ArTop10Accuracy=0.6958, over 13779.00 frames. ], tot_loss[loss=3.058, ArTop10Accuracy=0.7003, over 11984.06 frames. ], batch size: 34, lr: 2.00e-02
2024-08-06 04:22:04,752 INFO [trainer.py:765] (2/8) Epoch 5, batch 1200, train_loss[loss=3.174, ArTop10Accuracy=0.6778, over 12273.00 frames. ], tot_loss[loss=3.057, ArTop10Accuracy=0.7005, over 11929.53 frames. ], batch size: 99, lr: 1.99e-02
2024-08-06 04:22:30,397 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:23:46,282 INFO [trainer.py:765] (2/8) Epoch 6, batch 100, train_loss[loss=3.099, ArTop10Accuracy=0.6947, over 14271.00 frames. ], tot_loss[loss=3.027, ArTop10Accuracy=0.7069, over 4796.55 frames. ], batch size: 61, lr: 1.85e-02
2024-08-06 04:24:35,256 INFO [trainer.py:765] (2/8) Epoch 6, batch 200, train_loss[loss=3.011, ArTop10Accuracy=0.7076, over 13806.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7086, over 7806.97 frames. ], batch size: 34, lr: 1.84e-02
2024-08-06 04:25:16,676 INFO [trainer.py:765] (2/8) Epoch 6, batch 300, train_loss[loss=3.069, ArTop10Accuracy=0.6979, over 14163.00 frames. ], tot_loss[loss=3.018, ArTop10Accuracy=0.7089, over 9428.27 frames. ], batch size: 44, lr: 1.83e-02
2024-08-06 04:26:08,924 INFO [trainer.py:765] (2/8) Epoch 6, batch 400, train_loss[loss=2.913, ArTop10Accuracy=0.7321, over 10243.00 frames. ], tot_loss[loss=3.012, ArTop10Accuracy=0.7096, over 10322.01 frames. ], batch size: 14, lr: 1.83e-02
2024-08-06 04:26:51,486 INFO [trainer.py:765] (2/8) Epoch 6, batch 500, train_loss[loss=3.093, ArTop10Accuracy=0.6874, over 12358.00 frames. ], tot_loss[loss=3.009, ArTop10Accuracy=0.71, over 10891.35 frames. ], batch size: 22, lr: 1.82e-02
2024-08-06 04:27:39,298 INFO [trainer.py:765] (2/8) Epoch 6, batch 600, train_loss[loss=2.981, ArTop10Accuracy=0.7169, over 11535.00 frames. ], tot_loss[loss=3.014, ArTop10Accuracy=0.7088, over 11414.81 frames. ], batch size: 18, lr: 1.81e-02
2024-08-06 04:27:46,370 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.054e+02 1.343e+02 1.474e+02 1.660e+02 8.574e+02, threshold=2.947e+02, percent-clipped=0.6
2024-08-06 04:28:33,239 INFO [trainer.py:765] (2/8) Epoch 6, batch 700, train_loss[loss=2.864, ArTop10Accuracy=0.7439, over 10079.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7075, over 11550.48 frames. ], batch size: 12, lr: 1.80e-02
2024-08-06 04:29:11,216 INFO [trainer.py:765] (2/8) Epoch 6, batch 800, train_loss[loss=3.082, ArTop10Accuracy=0.6921, over 10204.00 frames. ], tot_loss[loss=3.026, ArTop10Accuracy=0.7065, over 11675.56 frames. ], batch size: 12, lr: 1.79e-02
2024-08-06 04:29:42,752 INFO [trainer.py:765] (2/8) Epoch 6, batch 900, train_loss[loss=3.021, ArTop10Accuracy=0.7036, over 12896.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7078, over 11744.03 frames. ], batch size: 27, lr: 1.78e-02
2024-08-06 04:30:14,306 INFO [trainer.py:765] (2/8) Epoch 6, batch 1000, train_loss[loss=3.099, ArTop10Accuracy=0.6958, over 12892.00 frames. ], tot_loss[loss=3.025, ArTop10Accuracy=0.7068, over 11954.61 frames. ], batch size: 27, lr: 1.77e-02
2024-08-06 04:30:45,384 INFO [trainer.py:765] (2/8) Epoch 6, batch 1100, train_loss[loss=3.054, ArTop10Accuracy=0.7074, over 13657.00 frames. ], tot_loss[loss=3.031, ArTop10Accuracy=0.7058, over 11989.60 frames. ], batch size: 34, lr: 1.77e-02
2024-08-06 04:31:15,673 INFO [trainer.py:765] (2/8) Epoch 6, batch 1200, train_loss[loss=3.197, ArTop10Accuracy=0.6765, over 11611.00 frames. ], tot_loss[loss=3.026, ArTop10Accuracy=0.7065, over 11928.69 frames. ], batch size: 98, lr: 1.76e-02
2024-08-06 04:31:40,624 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:32:52,405 INFO [trainer.py:765] (2/8) Epoch 7, batch 100, train_loss[loss=3.098, ArTop10Accuracy=0.6977, over 14447.00 frames. ], tot_loss[loss=2.988, ArTop10Accuracy=0.7144, over 4767.78 frames. ], batch size: 61, lr: 1.64e-02
2024-08-06 04:33:38,224 INFO [trainer.py:765] (2/8) Epoch 7, batch 200, train_loss[loss=3.006, ArTop10Accuracy=0.7124, over 13782.00 frames. ], tot_loss[loss=2.984, ArTop10Accuracy=0.7154, over 7764.47 frames. ], batch size: 34, lr: 1.64e-02
2024-08-06 04:34:22,609 INFO [trainer.py:765] (2/8) Epoch 7, batch 300, train_loss[loss=3.026, ArTop10Accuracy=0.7082, over 14388.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7148, over 9405.42 frames. ], batch size: 44, lr: 1.63e-02
2024-08-06 04:34:36,848 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 04:34:45,809 INFO [trainer.py:811] (2/8) Epoch 7, validation: loss=2.963, ArTop10Accuracy=0.7233, over 1829298.00 frames. 
2024-08-06 04:34:45,810 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 04:34:46,125 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.009e+02 1.306e+02 1.435e+02 1.599e+02 8.689e+02, threshold=2.871e+02, percent-clipped=0.9
2024-08-06 04:35:17,147 INFO [trainer.py:765] (2/8) Epoch 7, batch 400, train_loss[loss=3.029, ArTop10Accuracy=0.7126, over 10839.00 frames. ], tot_loss[loss=2.988, ArTop10Accuracy=0.7148, over 10324.61 frames. ], batch size: 15, lr: 1.62e-02
2024-08-06 04:36:01,711 INFO [trainer.py:765] (2/8) Epoch 7, batch 500, train_loss[loss=2.873, ArTop10Accuracy=0.7396, over 12169.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7146, over 10888.61 frames. ], batch size: 22, lr: 1.61e-02
2024-08-06 04:36:48,812 INFO [trainer.py:765] (2/8) Epoch 7, batch 600, train_loss[loss=2.969, ArTop10Accuracy=0.7169, over 11577.00 frames. ], tot_loss[loss=2.989, ArTop10Accuracy=0.7139, over 11406.76 frames. ], batch size: 18, lr: 1.61e-02
2024-08-06 04:37:34,800 INFO [trainer.py:765] (2/8) Epoch 7, batch 700, train_loss[loss=2.885, ArTop10Accuracy=0.7319, over 10136.00 frames. ], tot_loss[loss=2.996, ArTop10Accuracy=0.7127, over 11568.04 frames. ], batch size: 12, lr: 1.60e-02
2024-08-06 04:38:13,614 INFO [trainer.py:765] (2/8) Epoch 7, batch 800, train_loss[loss=2.811, ArTop10Accuracy=0.7477, over 10693.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.712, over 11681.76 frames. ], batch size: 13, lr: 1.59e-02
2024-08-06 04:38:45,110 INFO [trainer.py:765] (2/8) Epoch 7, batch 900, train_loss[loss=3.053, ArTop10Accuracy=0.703, over 13248.00 frames. ], tot_loss[loss=2.991, ArTop10Accuracy=0.7138, over 11741.89 frames. ], batch size: 27, lr: 1.59e-02
2024-08-06 04:39:16,575 INFO [trainer.py:765] (2/8) Epoch 7, batch 1000, train_loss[loss=2.957, ArTop10Accuracy=0.7173, over 12975.00 frames. ], tot_loss[loss=2.991, ArTop10Accuracy=0.7137, over 11959.74 frames. ], batch size: 27, lr: 1.58e-02
2024-08-06 04:39:47,571 INFO [trainer.py:765] (2/8) Epoch 7, batch 1100, train_loss[loss=3.027, ArTop10Accuracy=0.7077, over 13800.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.712, over 12002.90 frames. ], batch size: 34, lr: 1.57e-02
2024-08-06 04:40:17,989 INFO [trainer.py:765] (2/8) Epoch 7, batch 1200, train_loss[loss=3.136, ArTop10Accuracy=0.6861, over 12887.00 frames. ], tot_loss[loss=2.996, ArTop10Accuracy=0.7127, over 11943.73 frames. ], batch size: 99, lr: 1.57e-02
2024-08-06 04:40:43,363 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:41:37,492 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 9.816e+01 1.295e+02 1.411e+02 1.574e+02 4.953e+02, threshold=2.821e+02, percent-clipped=1.1
2024-08-06 04:41:58,371 INFO [trainer.py:765] (2/8) Epoch 8, batch 100, train_loss[loss=3.046, ArTop10Accuracy=0.7047, over 14552.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7182, over 4792.34 frames. ], batch size: 61, lr: 1.47e-02
2024-08-06 04:42:44,986 INFO [trainer.py:765] (2/8) Epoch 8, batch 200, train_loss[loss=2.974, ArTop10Accuracy=0.7225, over 13235.00 frames. ], tot_loss[loss=2.95, ArTop10Accuracy=0.7225, over 7789.63 frames. ], batch size: 33, lr: 1.46e-02
2024-08-06 04:43:28,045 INFO [trainer.py:765] (2/8) Epoch 8, batch 300, train_loss[loss=3.041, ArTop10Accuracy=0.7051, over 14321.00 frames. ], tot_loss[loss=2.949, ArTop10Accuracy=0.7227, over 9418.52 frames. ], batch size: 44, lr: 1.46e-02
2024-08-06 04:44:14,461 INFO [trainer.py:765] (2/8) Epoch 8, batch 400, train_loss[loss=3.094, ArTop10Accuracy=0.7061, over 10312.00 frames. ], tot_loss[loss=2.954, ArTop10Accuracy=0.7218, over 10342.89 frames. ], batch size: 14, lr: 1.45e-02
2024-08-06 04:45:00,692 INFO [trainer.py:765] (2/8) Epoch 8, batch 500, train_loss[loss=2.953, ArTop10Accuracy=0.7192, over 12418.00 frames. ], tot_loss[loss=2.951, ArTop10Accuracy=0.722, over 10914.11 frames. ], batch size: 22, lr: 1.45e-02
2024-08-06 04:45:45,393 INFO [trainer.py:765] (2/8) Epoch 8, batch 600, train_loss[loss=2.951, ArTop10Accuracy=0.7179, over 11643.00 frames. ], tot_loss[loss=2.959, ArTop10Accuracy=0.7201, over 11430.71 frames. ], batch size: 18, lr: 1.44e-02
2024-08-06 04:46:34,037 INFO [trainer.py:765] (2/8) Epoch 8, batch 700, train_loss[loss=2.88, ArTop10Accuracy=0.7407, over 10089.00 frames. ], tot_loss[loss=2.97, ArTop10Accuracy=0.7179, over 11557.64 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 04:47:10,207 INFO [trainer.py:765] (2/8) Epoch 8, batch 800, train_loss[loss=2.942, ArTop10Accuracy=0.7197, over 9999.00 frames. ], tot_loss[loss=2.973, ArTop10Accuracy=0.7171, over 11684.33 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 04:47:41,605 INFO [trainer.py:765] (2/8) Epoch 8, batch 900, train_loss[loss=2.918, ArTop10Accuracy=0.7281, over 13035.00 frames. ], tot_loss[loss=2.969, ArTop10Accuracy=0.7179, over 11727.15 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 04:48:13,033 INFO [trainer.py:765] (2/8) Epoch 8, batch 1000, train_loss[loss=2.956, ArTop10Accuracy=0.7221, over 12840.00 frames. ], tot_loss[loss=2.97, ArTop10Accuracy=0.7178, over 11936.16 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 04:48:28,828 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 04:48:37,663 INFO [trainer.py:811] (2/8) Epoch 8, validation: loss=2.946, ArTop10Accuracy=0.7266, over 1829298.00 frames. 
2024-08-06 04:48:37,664 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 04:48:37,951 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.289e+02 1.393e+02 1.532e+02 3.557e+02, threshold=2.786e+02, percent-clipped=0.2
2024-08-06 04:48:52,931 INFO [trainer.py:765] (2/8) Epoch 8, batch 1100, train_loss[loss=2.905, ArTop10Accuracy=0.7247, over 13471.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.7167, over 11991.47 frames. ], batch size: 34, lr: 1.41e-02
2024-08-06 04:49:23,202 INFO [trainer.py:765] (2/8) Epoch 8, batch 1200, train_loss[loss=3.151, ArTop10Accuracy=0.6828, over 12836.00 frames. ], tot_loss[loss=2.978, ArTop10Accuracy=0.7161, over 11952.83 frames. ], batch size: 97, lr: 1.40e-02
2024-08-06 04:49:48,393 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:51:01,547 INFO [trainer.py:765] (2/8) Epoch 9, batch 100, train_loss[loss=3.072, ArTop10Accuracy=0.6984, over 14479.00 frames. ], tot_loss[loss=2.952, ArTop10Accuracy=0.7219, over 4767.15 frames. ], batch size: 61, lr: 1.32e-02
2024-08-06 04:51:45,414 INFO [trainer.py:765] (2/8) Epoch 9, batch 200, train_loss[loss=3.028, ArTop10Accuracy=0.7102, over 13817.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.7243, over 7779.31 frames. ], batch size: 34, lr: 1.32e-02
2024-08-06 04:52:29,082 INFO [trainer.py:765] (2/8) Epoch 9, batch 300, train_loss[loss=2.961, ArTop10Accuracy=0.7219, over 14264.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.7249, over 9398.27 frames. ], batch size: 44, lr: 1.31e-02
2024-08-06 04:53:16,431 INFO [trainer.py:765] (2/8) Epoch 9, batch 400, train_loss[loss=2.907, ArTop10Accuracy=0.7297, over 10737.00 frames. ], tot_loss[loss=2.936, ArTop10Accuracy=0.725, over 10337.25 frames. ], batch size: 15, lr: 1.31e-02
2024-08-06 04:53:58,143 INFO [trainer.py:765] (2/8) Epoch 9, batch 500, train_loss[loss=2.943, ArTop10Accuracy=0.7245, over 12110.00 frames. ], tot_loss[loss=2.932, ArTop10Accuracy=0.7254, over 10890.50 frames. ], batch size: 22, lr: 1.30e-02
2024-08-06 04:54:51,077 INFO [trainer.py:765] (2/8) Epoch 9, batch 600, train_loss[loss=2.921, ArTop10Accuracy=0.7325, over 11728.00 frames. ], tot_loss[loss=2.941, ArTop10Accuracy=0.7234, over 11423.61 frames. ], batch size: 18, lr: 1.30e-02
2024-08-06 04:55:34,399 INFO [trainer.py:765] (2/8) Epoch 9, batch 700, train_loss[loss=2.84, ArTop10Accuracy=0.7416, over 10190.00 frames. ], tot_loss[loss=2.948, ArTop10Accuracy=0.7222, over 11578.40 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 04:56:04,575 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.257e+02 1.367e+02 1.507e+02 8.820e+02, threshold=2.735e+02, percent-clipped=0.5
2024-08-06 04:56:13,598 INFO [trainer.py:765] (2/8) Epoch 9, batch 800, train_loss[loss=2.872, ArTop10Accuracy=0.7408, over 10110.00 frames. ], tot_loss[loss=2.953, ArTop10Accuracy=0.7213, over 11698.31 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 04:56:44,975 INFO [trainer.py:765] (2/8) Epoch 9, batch 900, train_loss[loss=2.936, ArTop10Accuracy=0.7295, over 13064.00 frames. ], tot_loss[loss=2.949, ArTop10Accuracy=0.7222, over 11748.18 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 04:57:16,491 INFO [trainer.py:765] (2/8) Epoch 9, batch 1000, train_loss[loss=2.958, ArTop10Accuracy=0.7276, over 13016.00 frames. ], tot_loss[loss=2.955, ArTop10Accuracy=0.7213, over 11944.72 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 04:57:47,657 INFO [trainer.py:765] (2/8) Epoch 9, batch 1100, train_loss[loss=3, ArTop10Accuracy=0.7154, over 13793.00 frames. ], tot_loss[loss=2.965, ArTop10Accuracy=0.7189, over 12005.74 frames. ], batch size: 34, lr: 1.27e-02
2024-08-06 04:58:18,093 INFO [trainer.py:765] (2/8) Epoch 9, batch 1200, train_loss[loss=3.099, ArTop10Accuracy=0.6887, over 12366.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7196, over 11971.89 frames. ], batch size: 98, lr: 1.27e-02
2024-08-06 04:58:43,379 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 04:59:52,750 INFO [trainer.py:765] (2/8) Epoch 10, batch 100, train_loss[loss=2.978, ArTop10Accuracy=0.7164, over 14878.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7286, over 4780.23 frames. ], batch size: 63, lr: 1.20e-02
2024-08-06 05:00:43,730 INFO [trainer.py:765] (2/8) Epoch 10, batch 200, train_loss[loss=2.88, ArTop10Accuracy=0.7405, over 13887.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7293, over 7803.44 frames. ], batch size: 34, lr: 1.20e-02
2024-08-06 05:01:20,591 INFO [trainer.py:765] (2/8) Epoch 10, batch 300, train_loss[loss=2.964, ArTop10Accuracy=0.7188, over 14173.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7301, over 9425.87 frames. ], batch size: 44, lr: 1.19e-02
2024-08-06 05:02:10,048 INFO [trainer.py:765] (2/8) Epoch 10, batch 400, train_loss[loss=2.955, ArTop10Accuracy=0.7291, over 10429.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7295, over 10335.80 frames. ], batch size: 14, lr: 1.19e-02
2024-08-06 05:02:46,488 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 05:02:55,377 INFO [trainer.py:811] (2/8) Epoch 10, validation: loss=2.927, ArTop10Accuracy=0.7304, over 1829298.00 frames. 
2024-08-06 05:02:55,378 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 05:02:55,728 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.023e+02 1.269e+02 1.367e+02 1.518e+02 4.405e+02, threshold=2.733e+02, percent-clipped=0.4
2024-08-06 05:02:58,361 INFO [trainer.py:765] (2/8) Epoch 10, batch 500, train_loss[loss=2.875, ArTop10Accuracy=0.7371, over 12161.00 frames. ], tot_loss[loss=2.915, ArTop10Accuracy=0.7288, over 10893.26 frames. ], batch size: 22, lr: 1.19e-02
2024-08-06 05:03:48,229 INFO [trainer.py:765] (2/8) Epoch 10, batch 600, train_loss[loss=2.871, ArTop10Accuracy=0.7365, over 11569.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.7282, over 11428.81 frames. ], batch size: 18, lr: 1.18e-02
2024-08-06 05:04:36,715 INFO [trainer.py:765] (2/8) Epoch 10, batch 700, train_loss[loss=2.612, ArTop10Accuracy=0.7818, over 10102.00 frames. ], tot_loss[loss=2.931, ArTop10Accuracy=0.7254, over 11580.42 frames. ], batch size: 12, lr: 1.18e-02
2024-08-06 05:05:10,726 INFO [trainer.py:765] (2/8) Epoch 10, batch 800, train_loss[loss=2.908, ArTop10Accuracy=0.7233, over 10210.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7238, over 11684.13 frames. ], batch size: 12, lr: 1.17e-02
2024-08-06 05:05:42,245 INFO [trainer.py:765] (2/8) Epoch 10, batch 900, train_loss[loss=2.944, ArTop10Accuracy=0.7194, over 12966.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7258, over 11742.25 frames. ], batch size: 27, lr: 1.17e-02
2024-08-06 05:06:13,844 INFO [trainer.py:765] (2/8) Epoch 10, batch 1000, train_loss[loss=2.983, ArTop10Accuracy=0.7219, over 12821.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7266, over 11939.45 frames. ], batch size: 27, lr: 1.16e-02
2024-08-06 05:06:45,055 INFO [trainer.py:765] (2/8) Epoch 10, batch 1100, train_loss[loss=3.068, ArTop10Accuracy=0.7031, over 13737.00 frames. ], tot_loss[loss=2.941, ArTop10Accuracy=0.7238, over 12009.40 frames. ], batch size: 34, lr: 1.16e-02
2024-08-06 05:07:15,484 INFO [trainer.py:765] (2/8) Epoch 10, batch 1200, train_loss[loss=3.01, ArTop10Accuracy=0.7078, over 12358.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.724, over 11950.99 frames. ], batch size: 97, lr: 1.16e-02
2024-08-06 05:07:40,243 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 05:08:52,966 INFO [trainer.py:765] (2/8) Epoch 11, batch 100, train_loss[loss=2.925, ArTop10Accuracy=0.7288, over 14495.00 frames. ], tot_loss[loss=2.906, ArTop10Accuracy=0.7314, over 4770.40 frames. ], batch size: 61, lr: 1.10e-02
2024-08-06 05:09:41,277 INFO [trainer.py:765] (2/8) Epoch 11, batch 200, train_loss[loss=2.929, ArTop10Accuracy=0.7319, over 13841.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.7314, over 7772.06 frames. ], batch size: 34, lr: 1.10e-02
2024-08-06 05:09:51,176 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.001e+02 1.278e+02 1.371e+02 1.502e+02 3.785e+02, threshold=2.743e+02, percent-clipped=0.3
2024-08-06 05:10:24,720 INFO [trainer.py:765] (2/8) Epoch 11, batch 300, train_loss[loss=2.998, ArTop10Accuracy=0.7166, over 14187.00 frames. ], tot_loss[loss=2.901, ArTop10Accuracy=0.7321, over 9389.76 frames. ], batch size: 44, lr: 1.09e-02
2024-08-06 05:11:11,784 INFO [trainer.py:765] (2/8) Epoch 11, batch 400, train_loss[loss=2.766, ArTop10Accuracy=0.7571, over 10857.00 frames. ], tot_loss[loss=2.9, ArTop10Accuracy=0.7322, over 10329.19 frames. ], batch size: 15, lr: 1.09e-02
2024-08-06 05:11:52,692 INFO [trainer.py:765] (2/8) Epoch 11, batch 500, train_loss[loss=2.9, ArTop10Accuracy=0.7279, over 12227.00 frames. ], tot_loss[loss=2.904, ArTop10Accuracy=0.7313, over 10891.57 frames. ], batch size: 22, lr: 1.09e-02
2024-08-06 05:12:40,287 INFO [trainer.py:765] (2/8) Epoch 11, batch 600, train_loss[loss=2.773, ArTop10Accuracy=0.7548, over 11681.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7303, over 11432.08 frames. ], batch size: 18, lr: 1.08e-02
2024-08-06 05:13:25,708 INFO [trainer.py:765] (2/8) Epoch 11, batch 700, train_loss[loss=2.781, ArTop10Accuracy=0.7469, over 10182.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.7287, over 11569.39 frames. ], batch size: 12, lr: 1.08e-02
2024-08-06 05:14:04,206 INFO [trainer.py:765] (2/8) Epoch 11, batch 800, train_loss[loss=2.793, ArTop10Accuracy=0.7451, over 10069.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.728, over 11681.47 frames. ], batch size: 12, lr: 1.07e-02
2024-08-06 05:14:35,666 INFO [trainer.py:765] (2/8) Epoch 11, batch 900, train_loss[loss=2.867, ArTop10Accuracy=0.7368, over 12934.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7294, over 11728.60 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 05:15:07,263 INFO [trainer.py:765] (2/8) Epoch 11, batch 1000, train_loss[loss=2.997, ArTop10Accuracy=0.7159, over 13352.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7291, over 11927.60 frames. ], batch size: 28, lr: 1.07e-02
2024-08-06 05:15:38,259 INFO [trainer.py:765] (2/8) Epoch 11, batch 1100, train_loss[loss=2.882, ArTop10Accuracy=0.732, over 13622.00 frames. ], tot_loss[loss=2.92, ArTop10Accuracy=0.7277, over 11984.60 frames. ], batch size: 34, lr: 1.06e-02
2024-08-06 05:16:08,497 INFO [trainer.py:765] (2/8) Epoch 11, batch 1200, train_loss[loss=3.096, ArTop10Accuracy=0.693, over 12237.00 frames. ], tot_loss[loss=2.923, ArTop10Accuracy=0.7274, over 11936.43 frames. ], batch size: 98, lr: 1.06e-02
2024-08-06 05:16:12,697 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 05:16:21,622 INFO [trainer.py:811] (2/8) Epoch 11, validation: loss=2.923, ArTop10Accuracy=0.7318, over 1829298.00 frames. 
2024-08-06 05:16:21,623 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 05:16:21,949 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.076e+02 1.268e+02 1.368e+02 1.481e+02 4.790e+02, threshold=2.736e+02, percent-clipped=0.6
2024-08-06 05:16:42,805 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 05:18:03,006 INFO [trainer.py:765] (2/8) Epoch 12, batch 100, train_loss[loss=2.985, ArTop10Accuracy=0.7173, over 14457.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7351, over 4751.58 frames. ], batch size: 62, lr: 1.01e-02
2024-08-06 05:18:46,005 INFO [trainer.py:765] (2/8) Epoch 12, batch 200, train_loss[loss=2.876, ArTop10Accuracy=0.7382, over 13936.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7367, over 7778.74 frames. ], batch size: 34, lr: 1.01e-02
2024-08-06 05:19:31,947 INFO [trainer.py:765] (2/8) Epoch 12, batch 300, train_loss[loss=2.918, ArTop10Accuracy=0.7335, over 13852.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7374, over 9395.43 frames. ], batch size: 43, lr: 1.01e-02
2024-08-06 05:20:12,431 INFO [trainer.py:765] (2/8) Epoch 12, batch 400, train_loss[loss=2.814, ArTop10Accuracy=0.7557, over 10314.00 frames. ], tot_loss[loss=2.882, ArTop10Accuracy=0.736, over 10309.33 frames. ], batch size: 14, lr: 1.00e-02
2024-08-06 05:21:00,640 INFO [trainer.py:765] (2/8) Epoch 12, batch 500, train_loss[loss=2.952, ArTop10Accuracy=0.7228, over 12138.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7356, over 10884.38 frames. ], batch size: 22, lr: 9.99e-03
2024-08-06 05:21:43,916 INFO [trainer.py:765] (2/8) Epoch 12, batch 600, train_loss[loss=2.802, ArTop10Accuracy=0.7451, over 11621.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.734, over 11423.38 frames. ], batch size: 18, lr: 9.96e-03
2024-08-06 05:22:32,206 INFO [trainer.py:765] (2/8) Epoch 12, batch 700, train_loss[loss=2.767, ArTop10Accuracy=0.7568, over 9980.00 frames. ], tot_loss[loss=2.897, ArTop10Accuracy=0.7326, over 11570.51 frames. ], batch size: 12, lr: 9.93e-03
2024-08-06 05:23:08,912 INFO [trainer.py:765] (2/8) Epoch 12, batch 800, train_loss[loss=2.875, ArTop10Accuracy=0.7385, over 10064.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.731, over 11669.44 frames. ], batch size: 12, lr: 9.90e-03
2024-08-06 05:23:40,460 INFO [trainer.py:765] (2/8) Epoch 12, batch 900, train_loss[loss=2.835, ArTop10Accuracy=0.7463, over 13139.00 frames. ], tot_loss[loss=2.897, ArTop10Accuracy=0.7327, over 11730.86 frames. ], batch size: 27, lr: 9.87e-03
2024-08-06 05:23:54,576 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.067e+02 1.273e+02 1.376e+02 1.503e+02 4.050e+02, threshold=2.752e+02, percent-clipped=0.4
2024-08-06 05:24:14,346 INFO [trainer.py:765] (2/8) Epoch 12, batch 1000, train_loss[loss=2.817, ArTop10Accuracy=0.7436, over 13009.00 frames. ], tot_loss[loss=2.899, ArTop10Accuracy=0.7322, over 11941.24 frames. ], batch size: 27, lr: 9.84e-03
2024-08-06 05:24:45,502 INFO [trainer.py:765] (2/8) Epoch 12, batch 1100, train_loss[loss=2.936, ArTop10Accuracy=0.7239, over 13711.00 frames. ], tot_loss[loss=2.906, ArTop10Accuracy=0.7308, over 12005.16 frames. ], batch size: 34, lr: 9.81e-03
2024-08-06 05:25:15,882 INFO [trainer.py:765] (2/8) Epoch 12, batch 1200, train_loss[loss=3.01, ArTop10Accuracy=0.7104, over 12131.00 frames. ], tot_loss[loss=2.907, ArTop10Accuracy=0.7307, over 11944.94 frames. ], batch size: 98, lr: 9.78e-03
2024-08-06 05:25:41,436 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 05:26:46,787 INFO [trainer.py:765] (2/8) Epoch 13, batch 100, train_loss[loss=2.932, ArTop10Accuracy=0.7276, over 14692.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7377, over 4774.15 frames. ], batch size: 61, lr: 9.36e-03
2024-08-06 05:27:32,552 INFO [trainer.py:765] (2/8) Epoch 13, batch 200, train_loss[loss=2.878, ArTop10Accuracy=0.7415, over 13615.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7378, over 7798.34 frames. ], batch size: 34, lr: 9.34e-03
2024-08-06 05:28:16,036 INFO [trainer.py:765] (2/8) Epoch 13, batch 300, train_loss[loss=2.861, ArTop10Accuracy=0.7421, over 14256.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7385, over 9419.42 frames. ], batch size: 44, lr: 9.31e-03
2024-08-06 05:29:00,149 INFO [trainer.py:765] (2/8) Epoch 13, batch 400, train_loss[loss=2.646, ArTop10Accuracy=0.7744, over 10261.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7378, over 10316.57 frames. ], batch size: 14, lr: 9.28e-03
2024-08-06 05:29:43,967 INFO [trainer.py:765] (2/8) Epoch 13, batch 500, train_loss[loss=2.773, ArTop10Accuracy=0.7575, over 12363.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.738, over 10885.74 frames. ], batch size: 22, lr: 9.26e-03
2024-08-06 05:30:24,247 INFO [trainer.py:765] (2/8) Epoch 13, batch 600, train_loss[loss=2.885, ArTop10Accuracy=0.7434, over 11525.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7367, over 11421.59 frames. ], batch size: 18, lr: 9.23e-03
2024-08-06 05:30:58,110 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 05:31:07,054 INFO [trainer.py:811] (2/8) Epoch 13, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 
2024-08-06 05:31:07,054 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB
2024-08-06 05:31:07,351 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.049e+02 1.283e+02 1.389e+02 1.496e+02 2.729e+02, threshold=2.779e+02, percent-clipped=0.0
2024-08-06 05:31:24,043 INFO [trainer.py:765] (2/8) Epoch 13, batch 700, train_loss[loss=2.894, ArTop10Accuracy=0.7257, over 9882.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7352, over 11544.89 frames. ], batch size: 12, lr: 9.20e-03
2024-08-06 05:32:00,147 INFO [trainer.py:765] (2/8) Epoch 13, batch 800, train_loss[loss=2.705, ArTop10Accuracy=0.7712, over 10220.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7348, over 11668.46 frames. ], batch size: 12, lr: 9.18e-03
2024-08-06 05:32:31,521 INFO [trainer.py:765] (2/8) Epoch 13, batch 900, train_loss[loss=2.834, ArTop10Accuracy=0.7448, over 12923.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7353, over 11731.17 frames. ], batch size: 27, lr: 9.15e-03
2024-08-06 05:33:03,043 INFO [trainer.py:765] (2/8) Epoch 13, batch 1000, train_loss[loss=2.729, ArTop10Accuracy=0.7598, over 12946.00 frames. ], tot_loss[loss=2.883, ArTop10Accuracy=0.7352, over 11939.27 frames. ], batch size: 27, lr: 9.13e-03
2024-08-06 05:33:34,233 INFO [trainer.py:765] (2/8) Epoch 13, batch 1100, train_loss[loss=2.955, ArTop10Accuracy=0.7218, over 13772.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7333, over 12016.87 frames. ], batch size: 34, lr: 9.10e-03
2024-08-06 05:34:04,519 INFO [trainer.py:765] (2/8) Epoch 13, batch 1200, train_loss[loss=3.045, ArTop10Accuracy=0.7098, over 11902.00 frames. ], tot_loss[loss=2.892, ArTop10Accuracy=0.7331, over 11935.20 frames. ], batch size: 97, lr: 9.07e-03
2024-08-06 05:34:29,356 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 05:35:39,198 INFO [trainer.py:765] (2/8) Epoch 14, batch 100, train_loss[loss=2.872, ArTop10Accuracy=0.7365, over 14687.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7403, over 4792.39 frames. ], batch size: 61, lr: 8.71e-03
2024-08-06 05:36:23,063 INFO [trainer.py:765] (2/8) Epoch 14, batch 200, train_loss[loss=2.867, ArTop10Accuracy=0.7411, over 13728.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7417, over 7798.85 frames. ], batch size: 34, lr: 8.68e-03
2024-08-06 05:37:09,309 INFO [trainer.py:765] (2/8) Epoch 14, batch 300, train_loss[loss=2.913, ArTop10Accuracy=0.7299, over 14314.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7418, over 9424.44 frames. ], batch size: 44, lr: 8.66e-03
2024-08-06 05:37:46,030 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.097e+02 1.304e+02 1.410e+02 1.531e+02 2.912e+02, threshold=2.820e+02, percent-clipped=0.2
2024-08-06 05:37:55,139 INFO [trainer.py:765] (2/8) Epoch 14, batch 400, train_loss[loss=2.777, ArTop10Accuracy=0.7459, over 10801.00 frames. ], tot_loss[loss=2.853, ArTop10Accuracy=0.7419, over 10315.88 frames. ], batch size: 15, lr: 8.64e-03
2024-08-06 05:38:42,025 INFO [trainer.py:765] (2/8) Epoch 14, batch 500, train_loss[loss=2.76, ArTop10Accuracy=0.7576, over 12396.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7416, over 10890.81 frames. ], batch size: 22, lr: 8.61e-03
2024-08-06 05:39:22,374 INFO [trainer.py:765] (2/8) Epoch 14, batch 600, train_loss[loss=2.873, ArTop10Accuracy=0.7326, over 11490.00 frames. ], tot_loss[loss=2.858, ArTop10Accuracy=0.7404, over 11418.29 frames. ], batch size: 18, lr: 8.59e-03
2024-08-06 05:40:15,143 INFO [trainer.py:765] (2/8) Epoch 14, batch 700, train_loss[loss=2.907, ArTop10Accuracy=0.7269, over 10082.00 frames. ], tot_loss[loss=2.866, ArTop10Accuracy=0.7385, over 11551.12 frames. ], batch size: 12, lr: 8.57e-03
2024-08-06 05:40:49,136 INFO [trainer.py:765] (2/8) Epoch 14, batch 800, train_loss[loss=2.667, ArTop10Accuracy=0.7775, over 9850.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7373, over 11668.77 frames. ], batch size: 12, lr: 8.55e-03
2024-08-06 05:41:20,466 INFO [trainer.py:765] (2/8) Epoch 14, batch 900, train_loss[loss=2.876, ArTop10Accuracy=0.7369, over 12986.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7386, over 11723.01 frames. ], batch size: 27, lr: 8.52e-03
2024-08-06 05:41:51,995 INFO [trainer.py:765] (2/8) Epoch 14, batch 1000, train_loss[loss=2.83, ArTop10Accuracy=0.7426, over 13044.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7377, over 11933.16 frames. ], batch size: 27, lr: 8.50e-03
2024-08-06 05:42:23,217 INFO [trainer.py:765] (2/8) Epoch 14, batch 1100, train_loss[loss=2.842, ArTop10Accuracy=0.7439, over 13773.00 frames. ], tot_loss[loss=2.879, ArTop10Accuracy=0.7363, over 11999.74 frames. ], batch size: 34, lr: 8.48e-03
2024-08-06 05:42:53,549 INFO [trainer.py:765] (2/8) Epoch 14, batch 1200, train_loss[loss=3.028, ArTop10Accuracy=0.7076, over 12237.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7364, over 11949.06 frames. ], batch size: 98, lr: 8.46e-03
2024-08-06 05:43:19,085 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 05:44:28,572 INFO [trainer.py:765] (2/8) Epoch 15, batch 100, train_loss[loss=2.963, ArTop10Accuracy=0.7197, over 14826.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7422, over 4788.38 frames. ], batch size: 62, lr: 8.14e-03
2024-08-06 05:44:29,214 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 05:44:38,023 INFO [trainer.py:811] (2/8) Epoch 15, validation: loss=2.913, ArTop10Accuracy=0.7339, over 1829298.00 frames. 
2024-08-06 05:44:38,024 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33246MB
2024-08-06 05:44:38,413 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.307e+02 1.417e+02 1.528e+02 2.981e+02, threshold=2.833e+02, percent-clipped=0.1
2024-08-06 05:45:20,185 INFO [trainer.py:765] (2/8) Epoch 15, batch 200, train_loss[loss=2.807, ArTop10Accuracy=0.7537, over 13710.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7439, over 7805.06 frames. ], batch size: 34, lr: 8.11e-03
2024-08-06 05:46:04,647 INFO [trainer.py:765] (2/8) Epoch 15, batch 300, train_loss[loss=2.942, ArTop10Accuracy=0.7258, over 14508.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7439, over 9429.96 frames. ], batch size: 44, lr: 8.09e-03
2024-08-06 05:46:51,902 INFO [trainer.py:765] (2/8) Epoch 15, batch 400, train_loss[loss=2.655, ArTop10Accuracy=0.7743, over 10217.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7438, over 10322.02 frames. ], batch size: 14, lr: 8.07e-03
2024-08-06 05:47:36,911 INFO [trainer.py:765] (2/8) Epoch 15, batch 500, train_loss[loss=2.736, ArTop10Accuracy=0.7632, over 12303.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7434, over 10881.57 frames. ], batch size: 22, lr: 8.05e-03
2024-08-06 05:48:24,723 INFO [trainer.py:765] (2/8) Epoch 15, batch 600, train_loss[loss=2.761, ArTop10Accuracy=0.7587, over 11526.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7425, over 11419.26 frames. ], batch size: 18, lr: 8.03e-03
2024-08-06 05:49:11,856 INFO [trainer.py:765] (2/8) Epoch 15, batch 700, train_loss[loss=2.846, ArTop10Accuracy=0.7335, over 10062.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.7405, over 11564.98 frames. ], batch size: 12, lr: 8.01e-03
2024-08-06 05:49:45,779 INFO [trainer.py:765] (2/8) Epoch 15, batch 800, train_loss[loss=2.925, ArTop10Accuracy=0.7188, over 9224.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7391, over 11660.66 frames. ], batch size: 11, lr: 7.99e-03
2024-08-06 05:50:17,210 INFO [trainer.py:765] (2/8) Epoch 15, batch 900, train_loss[loss=3.033, ArTop10Accuracy=0.7154, over 13057.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7409, over 11730.28 frames. ], batch size: 27, lr: 7.97e-03
2024-08-06 05:50:48,830 INFO [trainer.py:765] (2/8) Epoch 15, batch 1000, train_loss[loss=2.834, ArTop10Accuracy=0.7484, over 12972.00 frames. ], tot_loss[loss=2.858, ArTop10Accuracy=0.7404, over 11937.53 frames. ], batch size: 27, lr: 7.95e-03
2024-08-06 05:51:20,070 INFO [trainer.py:765] (2/8) Epoch 15, batch 1100, train_loss[loss=2.899, ArTop10Accuracy=0.7337, over 13832.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7386, over 11992.96 frames. ], batch size: 34, lr: 7.93e-03
2024-08-06 05:51:23,515 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.123e+02 1.337e+02 1.431e+02 1.541e+02 2.784e+02, threshold=2.862e+02, percent-clipped=0.0
2024-08-06 05:51:53,082 INFO [trainer.py:765] (2/8) Epoch 15, batch 1200, train_loss[loss=2.971, ArTop10Accuracy=0.7191, over 12900.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.738, over 11941.95 frames. ], batch size: 99, lr: 7.91e-03
2024-08-06 05:52:18,086 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 05:53:29,263 INFO [trainer.py:765] (2/8) Epoch 16, batch 100, train_loss[loss=2.894, ArTop10Accuracy=0.7377, over 14427.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7468, over 4785.00 frames. ], batch size: 61, lr: 7.63e-03
2024-08-06 05:54:12,877 INFO [trainer.py:765] (2/8) Epoch 16, batch 200, train_loss[loss=2.831, ArTop10Accuracy=0.7462, over 13780.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7467, over 7781.14 frames. ], batch size: 34, lr: 7.61e-03
2024-08-06 05:54:59,737 INFO [trainer.py:765] (2/8) Epoch 16, batch 300, train_loss[loss=2.906, ArTop10Accuracy=0.7335, over 14155.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7468, over 9420.48 frames. ], batch size: 44, lr: 7.59e-03
2024-08-06 05:55:41,930 INFO [trainer.py:765] (2/8) Epoch 16, batch 400, train_loss[loss=2.686, ArTop10Accuracy=0.7733, over 10961.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7466, over 10323.32 frames. ], batch size: 15, lr: 7.58e-03
2024-08-06 05:56:27,680 INFO [trainer.py:765] (2/8) Epoch 16, batch 500, train_loss[loss=2.809, ArTop10Accuracy=0.7594, over 12164.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7463, over 10905.04 frames. ], batch size: 22, lr: 7.56e-03
2024-08-06 05:57:12,439 INFO [trainer.py:765] (2/8) Epoch 16, batch 600, train_loss[loss=2.71, ArTop10Accuracy=0.7724, over 11487.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7452, over 11438.14 frames. ], batch size: 18, lr: 7.54e-03
2024-08-06 05:58:00,040 INFO [trainer.py:765] (2/8) Epoch 16, batch 700, train_loss[loss=2.843, ArTop10Accuracy=0.7465, over 9292.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7441, over 11558.69 frames. ], batch size: 11, lr: 7.52e-03
2024-08-06 05:58:34,024 INFO [trainer.py:765] (2/8) Epoch 16, batch 800, train_loss[loss=2.79, ArTop10Accuracy=0.7507, over 10225.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7423, over 11679.46 frames. ], batch size: 12, lr: 7.50e-03
2024-08-06 05:58:41,569 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 05:58:50,426 INFO [trainer.py:811] (2/8) Epoch 16, validation: loss=2.915, ArTop10Accuracy=0.7338, over 1829298.00 frames. 
2024-08-06 05:58:50,427 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33340MB
2024-08-06 05:58:50,730 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.121e+02 1.335e+02 1.445e+02 1.570e+02 3.252e+02, threshold=2.890e+02, percent-clipped=0.1
2024-08-06 05:59:14,321 INFO [trainer.py:765] (2/8) Epoch 16, batch 900, train_loss[loss=2.791, ArTop10Accuracy=0.7515, over 12798.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7429, over 11723.23 frames. ], batch size: 27, lr: 7.49e-03
2024-08-06 05:59:45,915 INFO [trainer.py:765] (2/8) Epoch 16, batch 1000, train_loss[loss=2.776, ArTop10Accuracy=0.7557, over 12832.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7421, over 11918.58 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 06:00:17,091 INFO [trainer.py:765] (2/8) Epoch 16, batch 1100, train_loss[loss=2.883, ArTop10Accuracy=0.7371, over 13757.00 frames. ], tot_loss[loss=2.86, ArTop10Accuracy=0.7403, over 11985.14 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 06:00:47,464 INFO [trainer.py:765] (2/8) Epoch 16, batch 1200, train_loss[loss=3.013, ArTop10Accuracy=0.7111, over 12027.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7408, over 11942.83 frames. ], batch size: 97, lr: 7.43e-03
2024-08-06 06:01:12,361 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 06:02:27,261 INFO [trainer.py:765] (2/8) Epoch 17, batch 100, train_loss[loss=2.949, ArTop10Accuracy=0.7286, over 14418.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.7461, over 4798.04 frames. ], batch size: 61, lr: 7.18e-03
2024-08-06 06:03:11,850 INFO [trainer.py:765] (2/8) Epoch 17, batch 200, train_loss[loss=2.77, ArTop10Accuracy=0.7507, over 13471.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.746, over 7806.93 frames. ], batch size: 34, lr: 7.17e-03
2024-08-06 06:03:57,502 INFO [trainer.py:765] (2/8) Epoch 17, batch 300, train_loss[loss=2.929, ArTop10Accuracy=0.7311, over 14126.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.748, over 9433.07 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 06:04:42,838 INFO [trainer.py:765] (2/8) Epoch 17, batch 400, train_loss[loss=2.643, ArTop10Accuracy=0.7722, over 10347.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7477, over 10349.35 frames. ], batch size: 14, lr: 7.13e-03
2024-08-06 06:05:29,004 INFO [trainer.py:765] (2/8) Epoch 17, batch 500, train_loss[loss=2.906, ArTop10Accuracy=0.7349, over 12205.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7484, over 10923.35 frames. ], batch size: 22, lr: 7.12e-03
2024-08-06 06:05:49,551 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.142e+02 1.359e+02 1.445e+02 1.551e+02 2.741e+02, threshold=2.891e+02, percent-clipped=0.0
2024-08-06 06:06:20,723 INFO [trainer.py:765] (2/8) Epoch 17, batch 600, train_loss[loss=2.804, ArTop10Accuracy=0.7508, over 11660.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.7461, over 11452.61 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 06:07:04,695 INFO [trainer.py:765] (2/8) Epoch 17, batch 700, train_loss[loss=2.703, ArTop10Accuracy=0.7668, over 10055.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7458, over 11584.29 frames. ], batch size: 12, lr: 7.09e-03
2024-08-06 06:07:44,896 INFO [trainer.py:765] (2/8) Epoch 17, batch 800, train_loss[loss=2.682, ArTop10Accuracy=0.7718, over 9902.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7453, over 11687.60 frames. ], batch size: 12, lr: 7.07e-03
2024-08-06 06:08:16,384 INFO [trainer.py:765] (2/8) Epoch 17, batch 900, train_loss[loss=2.868, ArTop10Accuracy=0.7369, over 13122.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7456, over 11742.05 frames. ], batch size: 27, lr: 7.05e-03
2024-08-06 06:08:47,995 INFO [trainer.py:765] (2/8) Epoch 17, batch 1000, train_loss[loss=2.71, ArTop10Accuracy=0.7683, over 12741.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7443, over 11942.29 frames. ], batch size: 27, lr: 7.04e-03
2024-08-06 06:09:19,134 INFO [trainer.py:765] (2/8) Epoch 17, batch 1100, train_loss[loss=2.813, ArTop10Accuracy=0.7484, over 13705.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7428, over 11995.99 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 06:09:49,446 INFO [trainer.py:765] (2/8) Epoch 17, batch 1200, train_loss[loss=2.972, ArTop10Accuracy=0.7206, over 12667.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7434, over 11937.41 frames. ], batch size: 100, lr: 7.01e-03
2024-08-06 06:10:15,269 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 06:11:23,101 INFO [trainer.py:765] (2/8) Epoch 18, batch 100, train_loss[loss=2.985, ArTop10Accuracy=0.7228, over 14784.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7484, over 4797.01 frames. ], batch size: 62, lr: 6.78e-03
2024-08-06 06:12:16,259 INFO [trainer.py:765] (2/8) Epoch 18, batch 200, train_loss[loss=2.809, ArTop10Accuracy=0.7482, over 13618.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7491, over 7816.57 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 06:12:40,317 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 06:12:48,991 INFO [trainer.py:811] (2/8) Epoch 18, validation: loss=2.916, ArTop10Accuracy=0.7343, over 1829298.00 frames. 
2024-08-06 06:12:48,992 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33340MB
2024-08-06 06:12:49,335 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.163e+02 1.377e+02 1.476e+02 1.588e+02 2.450e+02, threshold=2.952e+02, percent-clipped=0.0
2024-08-06 06:13:07,116 INFO [trainer.py:765] (2/8) Epoch 18, batch 300, train_loss[loss=2.862, ArTop10Accuracy=0.747, over 14437.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7507, over 9441.11 frames. ], batch size: 44, lr: 6.75e-03
2024-08-06 06:13:54,098 INFO [trainer.py:765] (2/8) Epoch 18, batch 400, train_loss[loss=2.705, ArTop10Accuracy=0.7705, over 10374.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7506, over 10324.79 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 06:14:38,488 INFO [trainer.py:765] (2/8) Epoch 18, batch 500, train_loss[loss=2.831, ArTop10Accuracy=0.7445, over 12176.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7513, over 10888.06 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 06:15:23,628 INFO [trainer.py:765] (2/8) Epoch 18, batch 600, train_loss[loss=2.787, ArTop10Accuracy=0.7518, over 12008.00 frames. ], tot_loss[loss=2.814, ArTop10Accuracy=0.7496, over 11428.27 frames. ], batch size: 19, lr: 6.71e-03
2024-08-06 06:16:17,342 INFO [trainer.py:765] (2/8) Epoch 18, batch 700, train_loss[loss=2.535, ArTop10Accuracy=0.7898, over 10077.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7477, over 11595.39 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 06:16:51,428 INFO [trainer.py:765] (2/8) Epoch 18, batch 800, train_loss[loss=2.81, ArTop10Accuracy=0.7513, over 10196.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7458, over 11703.66 frames. ], batch size: 12, lr: 6.68e-03
2024-08-06 06:17:22,913 INFO [trainer.py:765] (2/8) Epoch 18, batch 900, train_loss[loss=2.845, ArTop10Accuracy=0.7459, over 12829.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7472, over 11742.87 frames. ], batch size: 27, lr: 6.67e-03
2024-08-06 06:17:54,529 INFO [trainer.py:765] (2/8) Epoch 18, batch 1000, train_loss[loss=2.83, ArTop10Accuracy=0.7482, over 12975.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7459, over 11949.79 frames. ], batch size: 27, lr: 6.65e-03
2024-08-06 06:18:25,664 INFO [trainer.py:765] (2/8) Epoch 18, batch 1100, train_loss[loss=2.837, ArTop10Accuracy=0.7422, over 13645.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.744, over 12011.69 frames. ], batch size: 34, lr: 6.64e-03
2024-08-06 06:18:55,972 INFO [trainer.py:765] (2/8) Epoch 18, batch 1200, train_loss[loss=3.045, ArTop10Accuracy=0.6986, over 11883.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7439, over 11955.71 frames. ], batch size: 97, lr: 6.63e-03
2024-08-06 06:19:19,163 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.178e+02 1.387e+02 1.492e+02 1.607e+02 2.982e+02, threshold=2.983e+02, percent-clipped=0.1
2024-08-06 06:19:23,732 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 06:20:29,727 INFO [trainer.py:765] (2/8) Epoch 19, batch 100, train_loss[loss=2.794, ArTop10Accuracy=0.7559, over 14034.00 frames. ], tot_loss[loss=2.8, ArTop10Accuracy=0.7529, over 4769.06 frames. ], batch size: 61, lr: 6.43e-03
2024-08-06 06:21:11,274 INFO [trainer.py:765] (2/8) Epoch 19, batch 200, train_loss[loss=2.774, ArTop10Accuracy=0.7534, over 13748.00 frames. ], tot_loss[loss=2.806, ArTop10Accuracy=0.752, over 7787.60 frames. ], batch size: 34, lr: 6.41e-03
2024-08-06 06:21:56,077 INFO [trainer.py:765] (2/8) Epoch 19, batch 300, train_loss[loss=2.935, ArTop10Accuracy=0.7293, over 14360.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7514, over 9416.01 frames. ], batch size: 44, lr: 6.40e-03
2024-08-06 06:22:36,014 INFO [trainer.py:765] (2/8) Epoch 19, batch 400, train_loss[loss=2.635, ArTop10Accuracy=0.7794, over 10369.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.752, over 10332.28 frames. ], batch size: 14, lr: 6.39e-03
2024-08-06 06:23:18,997 INFO [trainer.py:765] (2/8) Epoch 19, batch 500, train_loss[loss=2.699, ArTop10Accuracy=0.774, over 12306.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7526, over 10926.80 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 06:24:03,685 INFO [trainer.py:765] (2/8) Epoch 19, batch 600, train_loss[loss=2.775, ArTop10Accuracy=0.7611, over 11535.00 frames. ], tot_loss[loss=2.806, ArTop10Accuracy=0.7513, over 11443.41 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 06:24:46,184 INFO [trainer.py:765] (2/8) Epoch 19, batch 700, train_loss[loss=2.588, ArTop10Accuracy=0.7913, over 10357.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7502, over 11593.46 frames. ], batch size: 12, lr: 6.35e-03
2024-08-06 06:25:22,355 INFO [trainer.py:765] (2/8) Epoch 19, batch 800, train_loss[loss=2.72, ArTop10Accuracy=0.7633, over 10066.00 frames. ], tot_loss[loss=2.816, ArTop10Accuracy=0.749, over 11715.86 frames. ], batch size: 12, lr: 6.33e-03
2024-08-06 06:25:53,624 INFO [trainer.py:765] (2/8) Epoch 19, batch 900, train_loss[loss=2.68, ArTop10Accuracy=0.7632, over 13113.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7503, over 11748.63 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 06:26:21,772 INFO [trainer.py:803] (2/8) Computing validation loss
2024-08-06 06:26:30,765 INFO [trainer.py:811] (2/8) Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 
2024-08-06 06:26:30,766 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33340MB
2024-08-06 06:26:31,053 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.198e+02 1.416e+02 1.525e+02 1.662e+02 2.849e+02, threshold=3.050e+02, percent-clipped=0.0
2024-08-06 06:26:34,032 INFO [trainer.py:765] (2/8) Epoch 19, batch 1000, train_loss[loss=2.78, ArTop10Accuracy=0.7612, over 12888.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7487, over 11942.50 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 06:27:05,190 INFO [trainer.py:765] (2/8) Epoch 19, batch 1100, train_loss[loss=2.828, ArTop10Accuracy=0.7477, over 13837.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7468, over 11994.26 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 06:27:35,454 INFO [trainer.py:765] (2/8) Epoch 19, batch 1200, train_loss[loss=2.986, ArTop10Accuracy=0.7209, over 12040.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7469, over 11918.07 frames. ], batch size: 101, lr: 6.28e-03
2024-08-06 06:28:00,542 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 06:29:08,984 INFO [trainer.py:765] (2/8) Epoch 20, batch 100, train_loss[loss=2.887, ArTop10Accuracy=0.7307, over 14720.00 frames. ], tot_loss[loss=2.791, ArTop10Accuracy=0.7544, over 4788.15 frames. ], batch size: 61, lr: 6.10e-03
2024-08-06 06:29:50,318 INFO [trainer.py:765] (2/8) Epoch 20, batch 200, train_loss[loss=2.798, ArTop10Accuracy=0.7507, over 13550.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.755, over 7785.68 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 06:30:37,105 INFO [trainer.py:765] (2/8) Epoch 20, batch 300, train_loss[loss=2.85, ArTop10Accuracy=0.746, over 14568.00 frames. ], tot_loss[loss=2.787, ArTop10Accuracy=0.7551, over 9422.94 frames. ], batch size: 44, lr: 6.08e-03
2024-08-06 06:31:16,354 INFO [trainer.py:765] (2/8) Epoch 20, batch 400, train_loss[loss=2.774, ArTop10Accuracy=0.7555, over 10346.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.7553, over 10315.08 frames. ], batch size: 14, lr: 6.07e-03
2024-08-06 06:32:03,758 INFO [trainer.py:765] (2/8) Epoch 20, batch 500, train_loss[loss=2.859, ArTop10Accuracy=0.7423, over 12227.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.755, over 10888.33 frames. ], batch size: 22, lr: 6.05e-03
2024-08-06 06:32:43,356 INFO [trainer.py:765] (2/8) Epoch 20, batch 600, train_loss[loss=2.668, ArTop10Accuracy=0.7744, over 11616.00 frames. ], tot_loss[loss=2.793, ArTop10Accuracy=0.7532, over 11413.38 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 06:33:36,751 INFO [trainer.py:765] (2/8) Epoch 20, batch 700, train_loss[loss=2.82, ArTop10Accuracy=0.7457, over 10272.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.751, over 11570.07 frames. ], batch size: 12, lr: 6.03e-03
2024-08-06 06:33:43,830 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.196e+02 1.417e+02 1.526e+02 1.639e+02 3.791e+02, threshold=3.052e+02, percent-clipped=0.1
2024-08-06 06:34:13,304 INFO [trainer.py:765] (2/8) Epoch 20, batch 800, train_loss[loss=2.598, ArTop10Accuracy=0.7936, over 10115.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7505, over 11678.62 frames. ], batch size: 12, lr: 6.02e-03
2024-08-06 06:34:44,580 INFO [trainer.py:765] (2/8) Epoch 20, batch 900, train_loss[loss=2.811, ArTop10Accuracy=0.7549, over 13034.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7514, over 11749.17 frames. ], batch size: 27, lr: 6.01e-03
2024-08-06 06:35:16,139 INFO [trainer.py:765] (2/8) Epoch 20, batch 1000, train_loss[loss=2.783, ArTop10Accuracy=0.7539, over 12951.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7507, over 11955.33 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 06:35:47,214 INFO [trainer.py:765] (2/8) Epoch 20, batch 1100, train_loss[loss=2.809, ArTop10Accuracy=0.7529, over 13720.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7487, over 12010.35 frames. ], batch size: 34, lr: 5.99e-03
2024-08-06 06:36:17,439 INFO [trainer.py:765] (2/8) Epoch 20, batch 1200, train_loss[loss=2.914, ArTop10Accuracy=0.7276, over 13237.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7483, over 11953.30 frames. ], batch size: 100, lr: 5.97e-03
2024-08-06 06:36:42,597 INFO [trainer.py:650] (2/8) Reaches end of dataloader.
2024-08-06 06:36:42,600 INFO [trainer.py:1069] (2/8) Done!
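
Note on reading this log: every per-batch line above follows the same fixed pattern ("Epoch <E>, batch <B>, train_loss[...], tot_loss[loss=<L>, ArTop10Accuracy=<A>, ...], batch size: <S>, lr: <LR>"), so the smoothed loss/accuracy/learning-rate trajectory for epochs 15-20 can be recovered directly from the text. The snippet below is a minimal parsing sketch, not part of the run itself; the file name "train-log.txt", the helper name parse_log, and the per-epoch summary are illustrative assumptions, while the regular expression mirrors the line format shown above.

# Minimal sketch: pull (epoch, batch, tot_loss, ArTop10Accuracy, lr) out of
# trainer.py batch lines like the ones in this log. File name is an assumption.
import re

# Matches the smoothed tot_loss[...] fields and the trailing lr of a batch line;
# validation and grad-norm lines do not contain "Epoch N, batch M," and are skipped.
PATTERN = re.compile(
    r"Epoch (\d+), batch (\d+), .*"
    r"tot_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+),.*"
    r"lr: ([\d.e+-]+)"
)

def parse_log(path="train-log.txt"):
    """Yield one record per 'Epoch N, batch M' line found in the log file."""
    with open(path) as f:
        for line in f:
            m = PATTERN.search(line)
            if m:
                yield {
                    "epoch": int(m.group(1)),
                    "batch": int(m.group(2)),
                    "tot_loss": float(m.group(3)),
                    "ArTop10Accuracy": float(m.group(4)),
                    "lr": float(m.group(5)),
                }

if __name__ == "__main__":
    # Print the last smoothed values logged in each epoch, e.g. to confirm that
    # tot_loss drifts from ~2.87 (epoch 15) down to ~2.82 (epoch 20) as above.
    last_per_epoch = {}
    for rec in parse_log():
        last_per_epoch[rec["epoch"]] = rec
    for epoch, rec in sorted(last_per_epoch.items()):
        print(f"epoch {epoch:2d}: tot_loss={rec['tot_loss']:.3f} "
              f"ArTop10Accuracy={rec['ArTop10Accuracy']:.4f} lr={rec['lr']:.2e}")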