2024-08-06 08:06:14,314 INFO [trainer.py:870] (1/8) Training started
2024-08-06 08:06:14,315 INFO [trainer.py:889] (1/8) Device: cuda:1
2024-08-06 08:06:14,315 INFO [trainer.py:890] (1/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': None, 'icefall-git-sha1': None, 'icefall-git-date': None, 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6867463', 'IP address': '0.104.202.7'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 20000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 08:06:14,315 INFO [trainer.py:892] (1/8) About to create model
2024-08-06 08:06:15,030 INFO [trainer.py:899] (1/8) Number of model parameters: 367386628
2024-08-06 08:06:16,712 INFO [trainer.py:914] (1/8) Using DDP
2024-08-06 08:06:19,149 INFO [datamodule.py:427] (1/8) About to get train cuts
2024-08-06 08:06:19,151 INFO [datamodule.py:434] (1/8) About to get dev cuts
2024-08-06 08:06:19,152 INFO [datamodule.py:292] (1/8) Disable SpecAugment
2024-08-06 08:06:19,152 INFO [datamodule.py:294] (1/8) About to create train dataset
2024-08-06 08:06:19,153 INFO [datamodule.py:323] (1/8) Using DynamicBucketingSampler
2024-08-06 08:06:19,769 INFO [datamodule.py:344] (1/8) About to create train dataloader
2024-08-06 08:06:19,769 INFO [datamodule.py:367] (1/8) About to create dev dataset
2024-08-06 08:06:20,100 INFO [datamodule.py:388] (1/8) About to create dev dataloader
2024-08-06 08:08:02,125 INFO [trainer.py:765] (1/8) Epoch 1, batch 100, train_loss[loss=4.313, ArTop10Accuracy=0.499, over 14373.00 frames. ], tot_loss[loss=5.051, ArTop10Accuracy=0.3736, over 4747.16 frames. ], batch size: 63, lr: 2.25e-02
2024-08-06 08:09:28,831 INFO [trainer.py:765] (1/8) Epoch 1, batch 200, train_loss[loss=4.082, ArTop10Accuracy=0.5339, over 13701.00 frames. ], tot_loss[loss=4.494, ArTop10Accuracy=0.4669, over 7740.47 frames. ], batch size: 34, lr: 3.00e-02
2024-08-06 08:10:52,432 INFO [trainer.py:765] (1/8) Epoch 1, batch 300, train_loss[loss=3.827, ArTop10Accuracy=0.5819, over 14076.00 frames. ], tot_loss[loss=4.214, ArTop10Accuracy=0.5136, over 9378.41 frames. ], batch size: 44, lr: 3.00e-02
2024-08-06 08:12:12,703 INFO [trainer.py:765] (1/8) Epoch 1, batch 400, train_loss[loss=3.646, ArTop10Accuracy=0.6151, over 10353.00 frames. ], tot_loss[loss=4.028, ArTop10Accuracy=0.5453, over 10284.75 frames. ], batch size: 14, lr: 3.00e-02
2024-08-06 08:13:40,054 INFO [trainer.py:765] (1/8) Epoch 1, batch 500, train_loss[loss=3.622, ArTop10Accuracy=0.6179, over 12669.00 frames. ], tot_loss[loss=3.883, ArTop10Accuracy=0.5706, over 10856.75 frames. ], batch size: 23, lr: 2.99e-02
2024-08-06 08:15:00,247 INFO [trainer.py:765] (1/8) Epoch 1, batch 600, train_loss[loss=3.602, ArTop10Accuracy=0.6197, over 11541.00 frames. ], tot_loss[loss=3.77, ArTop10Accuracy=0.5906, over 11363.67 frames. ], batch size: 18, lr: 2.99e-02
2024-08-06 08:16:26,429 INFO [trainer.py:765] (1/8) Epoch 1, batch 700, train_loss[loss=3.496, ArTop10Accuracy=0.6398, over 10332.00 frames. ], tot_loss[loss=3.689, ArTop10Accuracy=0.6051, over 11510.86 frames. ], batch size: 12, lr: 2.99e-02
2024-08-06 08:17:43,022 INFO [trainer.py:765] (1/8) Epoch 1, batch 800, train_loss[loss=3.526, ArTop10Accuracy=0.635, over 10014.00 frames. ], tot_loss[loss=3.625, ArTop10Accuracy=0.6167, over 11655.02 frames. ], batch size: 12, lr: 2.98e-02
2024-08-06 08:18:56,155 INFO [trainer.py:765] (1/8) Epoch 1, batch 900, train_loss[loss=3.49, ArTop10Accuracy=0.6426, over 12882.00 frames. ], tot_loss[loss=3.567, ArTop10Accuracy=0.6274, over 11695.58 frames. ], batch size: 27, lr: 2.98e-02
2024-08-06 08:20:12,867 INFO [trainer.py:765] (1/8) Epoch 1, batch 1000, train_loss[loss=3.434, ArTop10Accuracy=0.6517, over 13002.00 frames. ], tot_loss[loss=3.525, ArTop10Accuracy=0.635, over 11887.94 frames. ], batch size: 27, lr: 2.97e-02
2024-08-06 08:20:13,547 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 9.300e+01 1.871e+02 2.675e+02 4.030e+02 9.119e+03, threshold=5.351e+02, percent-clipped=0.0
2024-08-06 08:21:29,161 INFO [trainer.py:765] (1/8) Epoch 1, batch 1100, train_loss[loss=3.453, ArTop10Accuracy=0.6489, over 14007.00 frames. ], tot_loss[loss=3.488, ArTop10Accuracy=0.6419, over 11969.79 frames. ], batch size: 35, lr: 2.96e-02
2024-08-06 08:22:45,417 INFO [trainer.py:765] (1/8) Epoch 1, batch 1200, train_loss[loss=3.476, ArTop10Accuracy=0.644, over 12039.00 frames. ], tot_loss[loss=3.463, ArTop10Accuracy=0.6461, over 11878.23 frames. ], batch size: 101, lr: 2.96e-02
2024-08-06 08:23:45,310 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 08:25:36,244 INFO [trainer.py:765] (1/8) Epoch 2, batch 100, train_loss[loss=3.4, ArTop10Accuracy=0.6565, over 14508.00 frames. ], tot_loss[loss=3.423, ArTop10Accuracy=0.6528, over 4764.29 frames. ], batch size: 62, lr: 2.90e-02
2024-08-06 08:26:58,962 INFO [trainer.py:765] (1/8) Epoch 2, batch 200, train_loss[loss=3.323, ArTop10Accuracy=0.6679, over 13725.00 frames. ], tot_loss[loss=3.385, ArTop10Accuracy=0.6599, over 7749.22 frames. ], batch size: 34, lr: 2.89e-02
2024-08-06 08:28:25,539 INFO [trainer.py:765] (1/8) Epoch 2, batch 300, train_loss[loss=3.403, ArTop10Accuracy=0.6592, over 14277.00 frames. ], tot_loss[loss=3.366, ArTop10Accuracy=0.6635, over 9375.13 frames. ], batch size: 44, lr: 2.89e-02
2024-08-06 08:29:48,644 INFO [trainer.py:765] (1/8) Epoch 2, batch 400, train_loss[loss=3.391, ArTop10Accuracy=0.6543, over 11046.00 frames. ], tot_loss[loss=3.354, ArTop10Accuracy=0.666, over 10294.61 frames. ], batch size: 15, lr: 2.88e-02
2024-08-06 08:31:22,906 INFO [trainer.py:765] (1/8) Epoch 2, batch 500, train_loss[loss=3.266, ArTop10Accuracy=0.6837, over 12363.00 frames. ], tot_loss[loss=3.337, ArTop10Accuracy=0.6696, over 10867.29 frames. ], batch size: 22, lr: 2.87e-02
2024-08-06 08:32:45,694 INFO [trainer.py:765] (1/8) Epoch 2, batch 600, train_loss[loss=3.294, ArTop10Accuracy=0.6813, over 11376.00 frames. ], tot_loss[loss=3.327, ArTop10Accuracy=0.6714, over 11379.06 frames. ], batch size: 18, lr: 2.86e-02
2024-08-06 08:34:13,587 INFO [trainer.py:765] (1/8) Epoch 2, batch 700, train_loss[loss=3.278, ArTop10Accuracy=0.6789, over 10239.00 frames. ], tot_loss[loss=3.322, ArTop10Accuracy=0.6723, over 11515.89 frames. ], batch size: 12, lr: 2.85e-02
2024-08-06 08:34:31,179 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 08:34:40,887 INFO [trainer.py:811] (1/8) Epoch 2, validation: loss=3.277, ArTop10Accuracy=0.6803, over 1827537.00 frames. 
2024-08-06 08:34:40,888 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 08:34:41,706 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 7.953e+01 1.592e+02 2.200e+02 3.344e+02 2.949e+03, threshold=4.400e+02, percent-clipped=8.6
2024-08-06 08:35:39,883 INFO [trainer.py:765] (1/8) Epoch 2, batch 800, train_loss[loss=3.314, ArTop10Accuracy=0.6699, over 9348.00 frames. ], tot_loss[loss=3.319, ArTop10Accuracy=0.673, over 11636.96 frames. ], batch size: 11, lr: 2.84e-02
2024-08-06 08:36:56,377 INFO [trainer.py:765] (1/8) Epoch 2, batch 900, train_loss[loss=3.373, ArTop10Accuracy=0.6616, over 12846.00 frames. ], tot_loss[loss=3.305, ArTop10Accuracy=0.6758, over 11683.18 frames. ], batch size: 27, lr: 2.83e-02
2024-08-06 08:38:10,518 INFO [trainer.py:765] (1/8) Epoch 2, batch 1000, train_loss[loss=3.233, ArTop10Accuracy=0.6893, over 12870.00 frames. ], tot_loss[loss=3.296, ArTop10Accuracy=0.6774, over 11873.81 frames. ], batch size: 27, lr: 2.82e-02
2024-08-06 08:39:25,065 INFO [trainer.py:765] (1/8) Epoch 2, batch 1100, train_loss[loss=3.28, ArTop10Accuracy=0.6837, over 13569.00 frames. ], tot_loss[loss=3.291, ArTop10Accuracy=0.6783, over 11963.75 frames. ], batch size: 34, lr: 2.81e-02
2024-08-06 08:40:38,225 INFO [trainer.py:765] (1/8) Epoch 2, batch 1200, train_loss[loss=3.337, ArTop10Accuracy=0.6672, over 12903.00 frames. ], tot_loss[loss=3.281, ArTop10Accuracy=0.6802, over 11861.92 frames. ], batch size: 101, lr: 2.80e-02
2024-08-06 08:41:38,205 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 08:43:36,655 INFO [trainer.py:765] (1/8) Epoch 3, batch 100, train_loss[loss=3.334, ArTop10Accuracy=0.6681, over 14691.00 frames. ], tot_loss[loss=3.254, ArTop10Accuracy=0.6846, over 4778.14 frames. ], batch size: 62, lr: 2.67e-02
2024-08-06 08:45:10,505 INFO [trainer.py:765] (1/8) Epoch 3, batch 200, train_loss[loss=3.187, ArTop10Accuracy=0.6983, over 13692.00 frames. ], tot_loss[loss=3.223, ArTop10Accuracy=0.6906, over 7780.48 frames. ], batch size: 34, lr: 2.66e-02
2024-08-06 08:46:29,264 INFO [trainer.py:765] (1/8) Epoch 3, batch 300, train_loss[loss=3.197, ArTop10Accuracy=0.7005, over 14133.00 frames. ], tot_loss[loss=3.206, ArTop10Accuracy=0.6938, over 9395.78 frames. ], batch size: 44, lr: 2.64e-02
2024-08-06 08:48:04,223 INFO [trainer.py:765] (1/8) Epoch 3, batch 400, train_loss[loss=3.106, ArTop10Accuracy=0.7194, over 10431.00 frames. ], tot_loss[loss=3.191, ArTop10Accuracy=0.6968, over 10278.07 frames. ], batch size: 14, lr: 2.63e-02
2024-08-06 08:48:40,887 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 9.282e+01 1.561e+02 1.981e+02 2.686e+02 1.768e+03, threshold=3.962e+02, percent-clipped=7.6
2024-08-06 08:49:25,548 INFO [trainer.py:765] (1/8) Epoch 3, batch 500, train_loss[loss=3.143, ArTop10Accuracy=0.7066, over 12162.00 frames. ], tot_loss[loss=3.171, ArTop10Accuracy=0.7005, over 10856.70 frames. ], batch size: 22, lr: 2.62e-02
2024-08-06 08:51:00,483 INFO [trainer.py:765] (1/8) Epoch 3, batch 600, train_loss[loss=3.082, ArTop10Accuracy=0.723, over 11733.00 frames. ], tot_loss[loss=3.156, ArTop10Accuracy=0.7034, over 11384.37 frames. ], batch size: 18, lr: 2.61e-02
2024-08-06 08:52:31,624 INFO [trainer.py:765] (1/8) Epoch 3, batch 700, train_loss[loss=3.085, ArTop10Accuracy=0.721, over 10002.00 frames. ], tot_loss[loss=3.15, ArTop10Accuracy=0.7044, over 11517.63 frames. ], batch size: 12, lr: 2.60e-02
2024-08-06 08:53:57,395 INFO [trainer.py:765] (1/8) Epoch 3, batch 800, train_loss[loss=3.078, ArTop10Accuracy=0.7226, over 10086.00 frames. ], tot_loss[loss=3.142, ArTop10Accuracy=0.7064, over 11639.54 frames. ], batch size: 12, lr: 2.59e-02
2024-08-06 08:55:15,124 INFO [trainer.py:765] (1/8) Epoch 3, batch 900, train_loss[loss=3.066, ArTop10Accuracy=0.7258, over 12849.00 frames. ], tot_loss[loss=3.12, ArTop10Accuracy=0.7104, over 11684.21 frames. ], batch size: 27, lr: 2.57e-02
2024-08-06 08:56:31,564 INFO [trainer.py:765] (1/8) Epoch 3, batch 1000, train_loss[loss=3.051, ArTop10Accuracy=0.7245, over 12855.00 frames. ], tot_loss[loss=3.112, ArTop10Accuracy=0.7118, over 11895.25 frames. ], batch size: 27, lr: 2.56e-02
2024-08-06 08:57:46,510 INFO [trainer.py:765] (1/8) Epoch 3, batch 1100, train_loss[loss=3.066, ArTop10Accuracy=0.7182, over 13584.00 frames. ], tot_loss[loss=3.104, ArTop10Accuracy=0.7134, over 11977.87 frames. ], batch size: 34, lr: 2.55e-02
2024-08-06 08:59:01,403 INFO [trainer.py:765] (1/8) Epoch 3, batch 1200, train_loss[loss=3.159, ArTop10Accuracy=0.702, over 11196.00 frames. ], tot_loss[loss=3.095, ArTop10Accuracy=0.715, over 11868.07 frames. ], batch size: 103, lr: 2.54e-02
2024-08-06 09:00:01,941 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 09:01:50,745 INFO [trainer.py:765] (1/8) Epoch 4, batch 100, train_loss[loss=3.027, ArTop10Accuracy=0.7318, over 14526.00 frames. ], tot_loss[loss=3.07, ArTop10Accuracy=0.7198, over 4767.61 frames. ], batch size: 62, lr: 2.38e-02
2024-08-06 09:02:52,864 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 09:03:02,383 INFO [trainer.py:811] (1/8) Epoch 4, validation: loss=2.997, ArTop10Accuracy=0.7338, over 1827537.00 frames. 
2024-08-06 09:03:02,384 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 09:03:03,368 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.072e+02 1.499e+02 1.782e+02 2.273e+02 1.100e+03, threshold=3.565e+02, percent-clipped=4.7
2024-08-06 09:03:29,277 INFO [trainer.py:765] (1/8) Epoch 4, batch 200, train_loss[loss=2.974, ArTop10Accuracy=0.7396, over 13569.00 frames. ], tot_loss[loss=3.051, ArTop10Accuracy=0.7232, over 7747.56 frames. ], batch size: 34, lr: 2.37e-02
2024-08-06 09:05:01,738 INFO [trainer.py:765] (1/8) Epoch 4, batch 300, train_loss[loss=3.089, ArTop10Accuracy=0.7124, over 14157.00 frames. ], tot_loss[loss=3.041, ArTop10Accuracy=0.7252, over 9375.36 frames. ], batch size: 44, lr: 2.36e-02
2024-08-06 09:06:28,155 INFO [trainer.py:765] (1/8) Epoch 4, batch 400, train_loss[loss=2.98, ArTop10Accuracy=0.7353, over 10224.00 frames. ], tot_loss[loss=3.034, ArTop10Accuracy=0.7266, over 10275.30 frames. ], batch size: 14, lr: 2.34e-02
2024-08-06 09:08:01,929 INFO [trainer.py:765] (1/8) Epoch 4, batch 500, train_loss[loss=2.934, ArTop10Accuracy=0.7454, over 12306.00 frames. ], tot_loss[loss=3.025, ArTop10Accuracy=0.7283, over 10830.06 frames. ], batch size: 22, lr: 2.33e-02
2024-08-06 09:09:28,546 INFO [trainer.py:765] (1/8) Epoch 4, batch 600, train_loss[loss=3.117, ArTop10Accuracy=0.7081, over 11433.00 frames. ], tot_loss[loss=3.018, ArTop10Accuracy=0.7296, over 11350.44 frames. ], batch size: 18, lr: 2.32e-02
2024-08-06 09:10:59,871 INFO [trainer.py:765] (1/8) Epoch 4, batch 700, train_loss[loss=3.038, ArTop10Accuracy=0.7271, over 10323.00 frames. ], tot_loss[loss=3.024, ArTop10Accuracy=0.7283, over 11501.78 frames. ], batch size: 12, lr: 2.31e-02
2024-08-06 09:12:17,518 INFO [trainer.py:765] (1/8) Epoch 4, batch 800, train_loss[loss=3.003, ArTop10Accuracy=0.7298, over 9342.00 frames. ], tot_loss[loss=3.023, ArTop10Accuracy=0.7284, over 11621.48 frames. ], batch size: 11, lr: 2.30e-02
2024-08-06 09:13:33,218 INFO [trainer.py:765] (1/8) Epoch 4, batch 900, train_loss[loss=3.1, ArTop10Accuracy=0.7141, over 12897.00 frames. ], tot_loss[loss=3.016, ArTop10Accuracy=0.7296, over 11659.32 frames. ], batch size: 27, lr: 2.29e-02
2024-08-06 09:14:47,526 INFO [trainer.py:765] (1/8) Epoch 4, batch 1000, train_loss[loss=3.042, ArTop10Accuracy=0.7206, over 12765.00 frames. ], tot_loss[loss=3.013, ArTop10Accuracy=0.7304, over 11873.50 frames. ], batch size: 27, lr: 2.28e-02
2024-08-06 09:16:02,987 INFO [trainer.py:765] (1/8) Epoch 4, batch 1100, train_loss[loss=3.001, ArTop10Accuracy=0.7331, over 13710.00 frames. ], tot_loss[loss=3.014, ArTop10Accuracy=0.7303, over 11946.58 frames. ], batch size: 34, lr: 2.26e-02
2024-08-06 09:16:53,297 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.440e+02 1.636e+02 1.968e+02 7.702e+02, threshold=3.273e+02, percent-clipped=1.3
2024-08-06 09:17:18,350 INFO [trainer.py:765] (1/8) Epoch 4, batch 1200, train_loss[loss=3.076, ArTop10Accuracy=0.7165, over 12258.00 frames. ], tot_loss[loss=3.011, ArTop10Accuracy=0.7307, over 11834.91 frames. ], batch size: 101, lr: 2.25e-02
2024-08-06 09:18:17,461 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 09:20:17,177 INFO [trainer.py:765] (1/8) Epoch 5, batch 100, train_loss[loss=2.968, ArTop10Accuracy=0.7418, over 14499.00 frames. ], tot_loss[loss=2.991, ArTop10Accuracy=0.7345, over 4753.20 frames. ], batch size: 62, lr: 2.10e-02
2024-08-06 09:21:52,300 INFO [trainer.py:765] (1/8) Epoch 5, batch 200, train_loss[loss=3.047, ArTop10Accuracy=0.7255, over 13533.00 frames. ], tot_loss[loss=2.983, ArTop10Accuracy=0.7362, over 7733.64 frames. ], batch size: 34, lr: 2.09e-02
2024-08-06 09:23:19,245 INFO [trainer.py:765] (1/8) Epoch 5, batch 300, train_loss[loss=3, ArTop10Accuracy=0.7324, over 14067.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7382, over 9372.02 frames. ], batch size: 44, lr: 2.08e-02
2024-08-06 09:24:53,543 INFO [trainer.py:765] (1/8) Epoch 5, batch 400, train_loss[loss=2.851, ArTop10Accuracy=0.7643, over 10191.00 frames. ], tot_loss[loss=2.965, ArTop10Accuracy=0.7392, over 10286.55 frames. ], batch size: 14, lr: 2.07e-02
2024-08-06 09:26:19,424 INFO [trainer.py:765] (1/8) Epoch 5, batch 500, train_loss[loss=2.986, ArTop10Accuracy=0.7376, over 12162.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7397, over 10872.84 frames. ], batch size: 22, lr: 2.06e-02
2024-08-06 09:27:49,543 INFO [trainer.py:765] (1/8) Epoch 5, batch 600, train_loss[loss=2.903, ArTop10Accuracy=0.7532, over 11931.00 frames. ], tot_loss[loss=2.964, ArTop10Accuracy=0.7397, over 11399.96 frames. ], batch size: 19, lr: 2.05e-02
2024-08-06 09:29:21,676 INFO [trainer.py:765] (1/8) Epoch 5, batch 700, train_loss[loss=2.898, ArTop10Accuracy=0.7536, over 9321.00 frames. ], tot_loss[loss=2.972, ArTop10Accuracy=0.7382, over 11531.87 frames. ], batch size: 11, lr: 2.04e-02
2024-08-06 09:30:44,699 INFO [trainer.py:765] (1/8) Epoch 5, batch 800, train_loss[loss=3.077, ArTop10Accuracy=0.7202, over 9351.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.738, over 11643.11 frames. ], batch size: 11, lr: 2.03e-02
2024-08-06 09:31:51,245 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 09:32:00,761 INFO [trainer.py:811] (1/8) Epoch 5, validation: loss=2.926, ArTop10Accuracy=0.7466, over 1827537.00 frames. 
2024-08-06 09:32:00,761 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 09:32:01,712 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.060e+02 1.349e+02 1.525e+02 1.806e+02 1.007e+03, threshold=3.049e+02, percent-clipped=2.3
2024-08-06 09:32:10,557 INFO [trainer.py:765] (1/8) Epoch 5, batch 900, train_loss[loss=2.939, ArTop10Accuracy=0.7484, over 12774.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7404, over 11677.07 frames. ], batch size: 27, lr: 2.02e-02
2024-08-06 09:33:27,329 INFO [trainer.py:765] (1/8) Epoch 5, batch 1000, train_loss[loss=3, ArTop10Accuracy=0.7307, over 13125.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7405, over 11870.34 frames. ], batch size: 28, lr: 2.01e-02
2024-08-06 09:34:42,306 INFO [trainer.py:765] (1/8) Epoch 5, batch 1100, train_loss[loss=2.929, ArTop10Accuracy=0.7503, over 13596.00 frames. ], tot_loss[loss=2.964, ArTop10Accuracy=0.7399, over 11943.56 frames. ], batch size: 34, lr: 2.00e-02
2024-08-06 09:35:56,339 INFO [trainer.py:765] (1/8) Epoch 5, batch 1200, train_loss[loss=3.058, ArTop10Accuracy=0.7186, over 13242.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7399, over 11871.80 frames. ], batch size: 101, lr: 1.99e-02
2024-08-06 09:36:55,360 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 09:38:52,668 INFO [trainer.py:765] (1/8) Epoch 6, batch 100, train_loss[loss=2.962, ArTop10Accuracy=0.7441, over 14187.00 frames. ], tot_loss[loss=2.946, ArTop10Accuracy=0.7427, over 4761.42 frames. ], batch size: 62, lr: 1.85e-02
2024-08-06 09:40:19,840 INFO [trainer.py:765] (1/8) Epoch 6, batch 200, train_loss[loss=2.889, ArTop10Accuracy=0.7544, over 13533.00 frames. ], tot_loss[loss=2.934, ArTop10Accuracy=0.7452, over 7753.64 frames. ], batch size: 34, lr: 1.84e-02
2024-08-06 09:41:52,971 INFO [trainer.py:765] (1/8) Epoch 6, batch 300, train_loss[loss=2.977, ArTop10Accuracy=0.7367, over 14202.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7462, over 9400.74 frames. ], batch size: 44, lr: 1.83e-02
2024-08-06 09:43:17,833 INFO [trainer.py:765] (1/8) Epoch 6, batch 400, train_loss[loss=2.822, ArTop10Accuracy=0.7726, over 10521.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.747, over 10303.55 frames. ], batch size: 14, lr: 1.83e-02
2024-08-06 09:44:54,134 INFO [trainer.py:765] (1/8) Epoch 6, batch 500, train_loss[loss=2.937, ArTop10Accuracy=0.7418, over 12210.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.7481, over 10854.68 frames. ], batch size: 22, lr: 1.82e-02
2024-08-06 09:46:22,879 INFO [trainer.py:765] (1/8) Epoch 6, batch 600, train_loss[loss=2.901, ArTop10Accuracy=0.7563, over 11883.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7468, over 11386.09 frames. ], batch size: 19, lr: 1.81e-02
2024-08-06 09:46:37,225 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.012e+02 1.339e+02 1.480e+02 1.701e+02 7.506e+02, threshold=2.959e+02, percent-clipped=1.1
2024-08-06 09:47:57,875 INFO [trainer.py:765] (1/8) Epoch 6, batch 700, train_loss[loss=2.899, ArTop10Accuracy=0.7566, over 10038.00 frames. ], tot_loss[loss=2.929, ArTop10Accuracy=0.7462, over 11537.98 frames. ], batch size: 12, lr: 1.80e-02
2024-08-06 09:49:15,961 INFO [trainer.py:765] (1/8) Epoch 6, batch 800, train_loss[loss=2.845, ArTop10Accuracy=0.7569, over 9402.00 frames. ], tot_loss[loss=2.932, ArTop10Accuracy=0.7455, over 11662.00 frames. ], batch size: 11, lr: 1.79e-02
2024-08-06 09:50:32,141 INFO [trainer.py:765] (1/8) Epoch 6, batch 900, train_loss[loss=2.907, ArTop10Accuracy=0.748, over 12954.00 frames. ], tot_loss[loss=2.927, ArTop10Accuracy=0.7466, over 11709.57 frames. ], batch size: 27, lr: 1.78e-02
2024-08-06 09:51:47,303 INFO [trainer.py:765] (1/8) Epoch 6, batch 1000, train_loss[loss=2.976, ArTop10Accuracy=0.7375, over 12882.00 frames. ], tot_loss[loss=2.929, ArTop10Accuracy=0.7461, over 11880.37 frames. ], batch size: 27, lr: 1.77e-02
2024-08-06 09:53:00,927 INFO [trainer.py:765] (1/8) Epoch 6, batch 1100, train_loss[loss=2.896, ArTop10Accuracy=0.7569, over 13659.00 frames. ], tot_loss[loss=2.929, ArTop10Accuracy=0.7462, over 11936.22 frames. ], batch size: 34, lr: 1.77e-02
2024-08-06 09:54:14,343 INFO [trainer.py:765] (1/8) Epoch 6, batch 1200, train_loss[loss=3.025, ArTop10Accuracy=0.7304, over 12231.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7461, over 11865.74 frames. ], batch size: 101, lr: 1.76e-02
2024-08-06 09:55:13,177 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 09:57:06,705 INFO [trainer.py:765] (1/8) Epoch 7, batch 100, train_loss[loss=3.002, ArTop10Accuracy=0.7374, over 14385.00 frames. ], tot_loss[loss=2.913, ArTop10Accuracy=0.7486, over 4745.63 frames. ], batch size: 62, lr: 1.64e-02
2024-08-06 09:58:39,429 INFO [trainer.py:765] (1/8) Epoch 7, batch 200, train_loss[loss=2.902, ArTop10Accuracy=0.7502, over 13656.00 frames. ], tot_loss[loss=2.9, ArTop10Accuracy=0.7515, over 7751.20 frames. ], batch size: 34, lr: 1.64e-02
2024-08-06 10:00:06,090 INFO [trainer.py:765] (1/8) Epoch 7, batch 300, train_loss[loss=2.908, ArTop10Accuracy=0.755, over 14034.00 frames. ], tot_loss[loss=2.896, ArTop10Accuracy=0.7522, over 9344.26 frames. ], batch size: 44, lr: 1.63e-02
2024-08-06 10:00:40,514 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 10:00:50,245 INFO [trainer.py:811] (1/8) Epoch 7, validation: loss=2.88, ArTop10Accuracy=0.7554, over 1827537.00 frames. 
2024-08-06 10:00:50,246 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 10:00:50,983 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.002e+02 1.286e+02 1.429e+02 1.605e+02 1.020e+03, threshold=2.857e+02, percent-clipped=1.5
2024-08-06 10:01:49,123 INFO [trainer.py:765] (1/8) Epoch 7, batch 400, train_loss[loss=2.808, ArTop10Accuracy=0.769, over 10485.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7531, over 10278.27 frames. ], batch size: 14, lr: 1.62e-02
2024-08-06 10:03:21,463 INFO [trainer.py:765] (1/8) Epoch 7, batch 500, train_loss[loss=2.828, ArTop10Accuracy=0.7612, over 12168.00 frames. ], tot_loss[loss=2.889, ArTop10Accuracy=0.754, over 10832.15 frames. ], batch size: 22, lr: 1.61e-02
2024-08-06 10:04:51,889 INFO [trainer.py:765] (1/8) Epoch 7, batch 600, train_loss[loss=2.795, ArTop10Accuracy=0.7731, over 11178.00 frames. ], tot_loss[loss=2.888, ArTop10Accuracy=0.7539, over 11361.58 frames. ], batch size: 18, lr: 1.61e-02
2024-08-06 10:06:25,117 INFO [trainer.py:765] (1/8) Epoch 7, batch 700, train_loss[loss=2.795, ArTop10Accuracy=0.766, over 9279.00 frames. ], tot_loss[loss=2.896, ArTop10Accuracy=0.7525, over 11491.31 frames. ], batch size: 11, lr: 1.60e-02
2024-08-06 10:07:46,955 INFO [trainer.py:765] (1/8) Epoch 7, batch 800, train_loss[loss=2.803, ArTop10Accuracy=0.7747, over 10359.00 frames. ], tot_loss[loss=2.896, ArTop10Accuracy=0.7525, over 11632.01 frames. ], batch size: 12, lr: 1.59e-02
2024-08-06 10:09:02,828 INFO [trainer.py:765] (1/8) Epoch 7, batch 900, train_loss[loss=2.792, ArTop10Accuracy=0.7744, over 12621.00 frames. ], tot_loss[loss=2.892, ArTop10Accuracy=0.7532, over 11686.51 frames. ], batch size: 27, lr: 1.59e-02
2024-08-06 10:10:19,642 INFO [trainer.py:765] (1/8) Epoch 7, batch 1000, train_loss[loss=2.938, ArTop10Accuracy=0.7412, over 13341.00 frames. ], tot_loss[loss=2.894, ArTop10Accuracy=0.7527, over 11857.32 frames. ], batch size: 28, lr: 1.58e-02
2024-08-06 10:11:35,214 INFO [trainer.py:765] (1/8) Epoch 7, batch 1100, train_loss[loss=2.966, ArTop10Accuracy=0.7316, over 13683.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7512, over 11942.87 frames. ], batch size: 34, lr: 1.57e-02
2024-08-06 10:12:48,210 INFO [trainer.py:765] (1/8) Epoch 7, batch 1200, train_loss[loss=3.03, ArTop10Accuracy=0.7313, over 12375.00 frames. ], tot_loss[loss=2.901, ArTop10Accuracy=0.7515, over 11863.93 frames. ], batch size: 101, lr: 1.57e-02
2024-08-06 10:13:46,697 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 10:15:03,607 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.017e+02 1.283e+02 1.410e+02 1.601e+02 1.017e+03, threshold=2.820e+02, percent-clipped=0.9
2024-08-06 10:15:40,827 INFO [trainer.py:765] (1/8) Epoch 8, batch 100, train_loss[loss=2.922, ArTop10Accuracy=0.7479, over 14244.00 frames. ], tot_loss[loss=2.885, ArTop10Accuracy=0.7541, over 4746.16 frames. ], batch size: 62, lr: 1.47e-02
2024-08-06 10:17:12,868 INFO [trainer.py:765] (1/8) Epoch 8, batch 200, train_loss[loss=2.894, ArTop10Accuracy=0.7525, over 13881.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7565, over 7754.49 frames. ], batch size: 34, lr: 1.46e-02
2024-08-06 10:18:37,904 INFO [trainer.py:765] (1/8) Epoch 8, batch 300, train_loss[loss=2.901, ArTop10Accuracy=0.7532, over 14163.00 frames. ], tot_loss[loss=2.87, ArTop10Accuracy=0.7571, over 9382.60 frames. ], batch size: 44, lr: 1.46e-02
2024-08-06 10:20:06,348 INFO [trainer.py:765] (1/8) Epoch 8, batch 400, train_loss[loss=2.749, ArTop10Accuracy=0.7818, over 10323.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7585, over 10284.68 frames. ], batch size: 14, lr: 1.45e-02
2024-08-06 10:21:32,417 INFO [trainer.py:765] (1/8) Epoch 8, batch 500, train_loss[loss=2.842, ArTop10Accuracy=0.7659, over 12006.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7587, over 10864.33 frames. ], batch size: 22, lr: 1.45e-02
2024-08-06 10:23:00,980 INFO [trainer.py:765] (1/8) Epoch 8, batch 600, train_loss[loss=2.892, ArTop10Accuracy=0.7495, over 11358.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7579, over 11374.38 frames. ], batch size: 18, lr: 1.44e-02
2024-08-06 10:24:37,794 INFO [trainer.py:765] (1/8) Epoch 8, batch 700, train_loss[loss=2.749, ArTop10Accuracy=0.7816, over 10002.00 frames. ], tot_loss[loss=2.87, ArTop10Accuracy=0.7573, over 11528.07 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 10:25:56,091 INFO [trainer.py:765] (1/8) Epoch 8, batch 800, train_loss[loss=2.752, ArTop10Accuracy=0.775, over 9339.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7566, over 11620.96 frames. ], batch size: 11, lr: 1.43e-02
2024-08-06 10:27:12,249 INFO [trainer.py:765] (1/8) Epoch 8, batch 900, train_loss[loss=2.869, ArTop10Accuracy=0.7567, over 13035.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7579, over 11677.37 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 10:28:25,269 INFO [trainer.py:765] (1/8) Epoch 8, batch 1000, train_loss[loss=2.85, ArTop10Accuracy=0.7617, over 12864.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7572, over 11871.35 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 10:29:07,161 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 10:29:16,830 INFO [trainer.py:811] (1/8) Epoch 8, validation: loss=2.858, ArTop10Accuracy=0.7594, over 1827537.00 frames. 
2024-08-06 10:29:16,831 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 10:29:17,496 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.032e+02 1.275e+02 1.390e+02 1.547e+02 3.717e+02, threshold=2.781e+02, percent-clipped=0.7
2024-08-06 10:29:51,738 INFO [trainer.py:765] (1/8) Epoch 8, batch 1100, train_loss[loss=2.851, ArTop10Accuracy=0.7608, over 13545.00 frames. ], tot_loss[loss=2.88, ArTop10Accuracy=0.7554, over 11946.85 frames. ], batch size: 34, lr: 1.41e-02
2024-08-06 10:31:05,952 INFO [trainer.py:765] (1/8) Epoch 8, batch 1200, train_loss[loss=3.013, ArTop10Accuracy=0.7242, over 12375.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.756, over 11877.61 frames. ], batch size: 101, lr: 1.40e-02
2024-08-06 10:32:05,554 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 10:34:01,262 INFO [trainer.py:765] (1/8) Epoch 9, batch 100, train_loss[loss=2.974, ArTop10Accuracy=0.7336, over 14712.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7598, over 4756.21 frames. ], batch size: 62, lr: 1.32e-02
2024-08-06 10:35:31,778 INFO [trainer.py:765] (1/8) Epoch 9, batch 200, train_loss[loss=2.743, ArTop10Accuracy=0.782, over 13812.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7618, over 7742.43 frames. ], batch size: 34, lr: 1.32e-02
2024-08-06 10:36:57,933 INFO [trainer.py:765] (1/8) Epoch 9, batch 300, train_loss[loss=2.905, ArTop10Accuracy=0.7537, over 14358.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7627, over 9372.16 frames. ], batch size: 45, lr: 1.31e-02
2024-08-06 10:38:32,702 INFO [trainer.py:765] (1/8) Epoch 9, batch 400, train_loss[loss=2.77, ArTop10Accuracy=0.7783, over 10809.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7625, over 10272.39 frames. ], batch size: 15, lr: 1.31e-02
2024-08-06 10:39:59,262 INFO [trainer.py:765] (1/8) Epoch 9, batch 500, train_loss[loss=2.804, ArTop10Accuracy=0.7676, over 12750.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7636, over 10841.66 frames. ], batch size: 23, lr: 1.30e-02
2024-08-06 10:41:29,694 INFO [trainer.py:765] (1/8) Epoch 9, batch 600, train_loss[loss=2.872, ArTop10Accuracy=0.7589, over 11277.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7634, over 11363.34 frames. ], batch size: 18, lr: 1.30e-02
2024-08-06 10:42:58,446 INFO [trainer.py:765] (1/8) Epoch 9, batch 700, train_loss[loss=2.794, ArTop10Accuracy=0.7749, over 10224.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7615, over 11506.27 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 10:44:02,958 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.039e+02 1.253e+02 1.352e+02 1.493e+02 7.010e+02, threshold=2.704e+02, percent-clipped=0.6
2024-08-06 10:44:19,675 INFO [trainer.py:765] (1/8) Epoch 9, batch 800, train_loss[loss=2.758, ArTop10Accuracy=0.7845, over 9348.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7607, over 11633.24 frames. ], batch size: 11, lr: 1.29e-02
2024-08-06 10:45:35,725 INFO [trainer.py:765] (1/8) Epoch 9, batch 900, train_loss[loss=2.776, ArTop10Accuracy=0.7739, over 12975.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7616, over 11688.38 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 10:46:51,277 INFO [trainer.py:765] (1/8) Epoch 9, batch 1000, train_loss[loss=2.904, ArTop10Accuracy=0.7464, over 12858.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7605, over 11884.24 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 10:48:06,253 INFO [trainer.py:765] (1/8) Epoch 9, batch 1100, train_loss[loss=2.846, ArTop10Accuracy=0.7651, over 13599.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7598, over 11944.83 frames. ], batch size: 34, lr: 1.28e-02
2024-08-06 10:49:21,058 INFO [trainer.py:765] (1/8) Epoch 9, batch 1200, train_loss[loss=2.94, ArTop10Accuracy=0.7412, over 12285.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7606, over 11849.65 frames. ], batch size: 103, lr: 1.27e-02
2024-08-06 10:50:22,395 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 10:52:12,332 INFO [trainer.py:765] (1/8) Epoch 10, batch 100, train_loss[loss=2.84, ArTop10Accuracy=0.7599, over 14454.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7628, over 4762.71 frames. ], batch size: 62, lr: 1.20e-02
2024-08-06 10:53:44,591 INFO [trainer.py:765] (1/8) Epoch 10, batch 200, train_loss[loss=2.841, ArTop10Accuracy=0.7632, over 13776.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7643, over 7751.20 frames. ], batch size: 34, lr: 1.20e-02
2024-08-06 10:55:08,096 INFO [trainer.py:765] (1/8) Epoch 10, batch 300, train_loss[loss=2.908, ArTop10Accuracy=0.7502, over 13872.00 frames. ], tot_loss[loss=2.828, ArTop10Accuracy=0.765, over 9380.76 frames. ], batch size: 44, lr: 1.19e-02
2024-08-06 10:56:41,181 INFO [trainer.py:765] (1/8) Epoch 10, batch 400, train_loss[loss=2.775, ArTop10Accuracy=0.7748, over 10197.00 frames. ], tot_loss[loss=2.824, ArTop10Accuracy=0.7658, over 10286.84 frames. ], batch size: 14, lr: 1.19e-02
2024-08-06 10:58:04,944 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 10:58:14,557 INFO [trainer.py:811] (1/8) Epoch 10, validation: loss=2.842, ArTop10Accuracy=0.7624, over 1827537.00 frames. 
2024-08-06 10:58:14,557 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 10:58:15,576 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.228e+02 1.320e+02 1.458e+02 6.096e+02, threshold=2.641e+02, percent-clipped=0.6
2024-08-06 10:58:15,583 INFO [trainer.py:765] (1/8) Epoch 10, batch 500, train_loss[loss=2.86, ArTop10Accuracy=0.7601, over 12069.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7664, over 10851.85 frames. ], batch size: 22, lr: 1.19e-02
2024-08-06 10:59:42,821 INFO [trainer.py:765] (1/8) Epoch 10, batch 600, train_loss[loss=2.747, ArTop10Accuracy=0.7847, over 11316.00 frames. ], tot_loss[loss=2.823, ArTop10Accuracy=0.7663, over 11362.02 frames. ], batch size: 18, lr: 1.18e-02
2024-08-06 11:01:18,113 INFO [trainer.py:765] (1/8) Epoch 10, batch 700, train_loss[loss=2.815, ArTop10Accuracy=0.7639, over 10041.00 frames. ], tot_loss[loss=2.828, ArTop10Accuracy=0.7651, over 11512.01 frames. ], batch size: 12, lr: 1.18e-02
2024-08-06 11:02:36,923 INFO [trainer.py:765] (1/8) Epoch 10, batch 800, train_loss[loss=2.721, ArTop10Accuracy=0.7851, over 9291.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7648, over 11633.61 frames. ], batch size: 11, lr: 1.17e-02
2024-08-06 11:03:51,218 INFO [trainer.py:765] (1/8) Epoch 10, batch 900, train_loss[loss=2.86, ArTop10Accuracy=0.7579, over 12807.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7657, over 11681.50 frames. ], batch size: 27, lr: 1.17e-02
2024-08-06 11:05:06,358 INFO [trainer.py:765] (1/8) Epoch 10, batch 1000, train_loss[loss=2.904, ArTop10Accuracy=0.7478, over 12924.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7645, over 11872.55 frames. ], batch size: 27, lr: 1.17e-02
2024-08-06 11:06:21,725 INFO [trainer.py:765] (1/8) Epoch 10, batch 1100, train_loss[loss=2.817, ArTop10Accuracy=0.7649, over 13518.00 frames. ], tot_loss[loss=2.839, ArTop10Accuracy=0.763, over 11926.81 frames. ], batch size: 34, lr: 1.16e-02
2024-08-06 11:07:34,778 INFO [trainer.py:765] (1/8) Epoch 10, batch 1200, train_loss[loss=2.922, ArTop10Accuracy=0.7471, over 12081.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7634, over 11849.57 frames. ], batch size: 101, lr: 1.16e-02
2024-08-06 11:08:33,905 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 11:10:29,960 INFO [trainer.py:765] (1/8) Epoch 11, batch 100, train_loss[loss=2.921, ArTop10Accuracy=0.7486, over 14277.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.7666, over 4744.36 frames. ], batch size: 62, lr: 1.10e-02
2024-08-06 11:12:04,679 INFO [trainer.py:765] (1/8) Epoch 11, batch 200, train_loss[loss=2.759, ArTop10Accuracy=0.7787, over 13740.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.7673, over 7726.69 frames. ], batch size: 34, lr: 1.10e-02
2024-08-06 11:12:22,831 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 9.884e+01 1.240e+02 1.333e+02 1.457e+02 6.939e+02, threshold=2.667e+02, percent-clipped=0.1
2024-08-06 11:13:31,551 INFO [trainer.py:765] (1/8) Epoch 11, batch 300, train_loss[loss=2.949, ArTop10Accuracy=0.739, over 14226.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7683, over 9344.10 frames. ], batch size: 44, lr: 1.09e-02
2024-08-06 11:15:03,275 INFO [trainer.py:765] (1/8) Epoch 11, batch 400, train_loss[loss=2.758, ArTop10Accuracy=0.7821, over 11040.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7688, over 10269.43 frames. ], batch size: 15, lr: 1.09e-02
2024-08-06 11:16:29,642 INFO [trainer.py:765] (1/8) Epoch 11, batch 500, train_loss[loss=2.739, ArTop10Accuracy=0.7819, over 12315.00 frames. ], tot_loss[loss=2.8, ArTop10Accuracy=0.7704, over 10842.51 frames. ], batch size: 22, lr: 1.09e-02
2024-08-06 11:18:00,523 INFO [trainer.py:765] (1/8) Epoch 11, batch 600, train_loss[loss=2.793, ArTop10Accuracy=0.7793, over 11529.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7689, over 11373.43 frames. ], batch size: 18, lr: 1.08e-02
2024-08-06 11:19:34,516 INFO [trainer.py:765] (1/8) Epoch 11, batch 700, train_loss[loss=2.696, ArTop10Accuracy=0.7903, over 9396.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.768, over 11509.70 frames. ], batch size: 11, lr: 1.08e-02
2024-08-06 11:20:55,487 INFO [trainer.py:765] (1/8) Epoch 11, batch 800, train_loss[loss=2.761, ArTop10Accuracy=0.7746, over 10377.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.7673, over 11641.00 frames. ], batch size: 12, lr: 1.07e-02
2024-08-06 11:22:13,712 INFO [trainer.py:765] (1/8) Epoch 11, batch 900, train_loss[loss=2.919, ArTop10Accuracy=0.7417, over 13068.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7683, over 11682.77 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 11:23:31,805 INFO [trainer.py:765] (1/8) Epoch 11, batch 1000, train_loss[loss=2.768, ArTop10Accuracy=0.7785, over 12609.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.768, over 11877.30 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 11:24:46,908 INFO [trainer.py:765] (1/8) Epoch 11, batch 1100, train_loss[loss=2.88, ArTop10Accuracy=0.7552, over 13635.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.767, over 11958.09 frames. ], batch size: 34, lr: 1.06e-02
2024-08-06 11:26:00,739 INFO [trainer.py:765] (1/8) Epoch 11, batch 1200, train_loss[loss=2.993, ArTop10Accuracy=0.7277, over 12222.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7665, over 11887.81 frames. ], batch size: 101, lr: 1.06e-02
2024-08-06 11:26:15,853 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 11:26:25,556 INFO [trainer.py:811] (1/8) Epoch 11, validation: loss=2.831, ArTop10Accuracy=0.7643, over 1827537.00 frames. 
2024-08-06 11:26:25,557 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 31570MB
2024-08-06 11:26:26,191 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.251e+02 1.335e+02 1.441e+02 2.942e+02, threshold=2.669e+02, percent-clipped=0.1
2024-08-06 11:27:09,715 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 11:29:03,456 INFO [trainer.py:765] (1/8) Epoch 12, batch 100, train_loss[loss=2.881, ArTop10Accuracy=0.7566, over 14679.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.769, over 4772.54 frames. ], batch size: 62, lr: 1.01e-02
2024-08-06 11:30:30,679 INFO [trainer.py:765] (1/8) Epoch 12, batch 200, train_loss[loss=2.779, ArTop10Accuracy=0.7758, over 13632.00 frames. ], tot_loss[loss=2.798, ArTop10Accuracy=0.7707, over 7752.01 frames. ], batch size: 34, lr: 1.01e-02
2024-08-06 11:31:57,661 INFO [trainer.py:765] (1/8) Epoch 12, batch 300, train_loss[loss=2.833, ArTop10Accuracy=0.7634, over 14193.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.7732, over 9379.82 frames. ], batch size: 45, lr: 1.01e-02
2024-08-06 11:33:30,744 INFO [trainer.py:765] (1/8) Epoch 12, batch 400, train_loss[loss=2.727, ArTop10Accuracy=0.7805, over 10134.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.7731, over 10292.35 frames. ], batch size: 14, lr: 1.00e-02
2024-08-06 11:34:55,737 INFO [trainer.py:765] (1/8) Epoch 12, batch 500, train_loss[loss=2.804, ArTop10Accuracy=0.7718, over 11982.00 frames. ], tot_loss[loss=2.781, ArTop10Accuracy=0.7739, over 10839.09 frames. ], batch size: 22, lr: 1.00e-02
2024-08-06 11:36:29,367 INFO [trainer.py:765] (1/8) Epoch 12, batch 600, train_loss[loss=2.87, ArTop10Accuracy=0.7539, over 11367.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.7727, over 11335.89 frames. ], batch size: 18, lr: 9.97e-03
2024-08-06 11:38:00,349 INFO [trainer.py:765] (1/8) Epoch 12, batch 700, train_loss[loss=2.716, ArTop10Accuracy=0.7844, over 10320.00 frames. ], tot_loss[loss=2.794, ArTop10Accuracy=0.7717, over 11503.62 frames. ], batch size: 12, lr: 9.93e-03
2024-08-06 11:39:23,617 INFO [trainer.py:765] (1/8) Epoch 12, batch 800, train_loss[loss=2.846, ArTop10Accuracy=0.7584, over 10128.00 frames. ], tot_loss[loss=2.798, ArTop10Accuracy=0.7709, over 11629.35 frames. ], batch size: 12, lr: 9.90e-03
2024-08-06 11:40:39,895 INFO [trainer.py:765] (1/8) Epoch 12, batch 900, train_loss[loss=2.831, ArTop10Accuracy=0.7651, over 12711.00 frames. ], tot_loss[loss=2.793, ArTop10Accuracy=0.772, over 11677.66 frames. ], batch size: 27, lr: 9.87e-03
2024-08-06 11:41:13,999 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.041e+02 1.248e+02 1.348e+02 1.459e+02 5.540e+02, threshold=2.695e+02, percent-clipped=0.3
2024-08-06 11:41:56,195 INFO [trainer.py:765] (1/8) Epoch 12, batch 1000, train_loss[loss=2.809, ArTop10Accuracy=0.7676, over 12882.00 frames. ], tot_loss[loss=2.797, ArTop10Accuracy=0.7709, over 11898.86 frames. ], batch size: 27, lr: 9.85e-03
2024-08-06 11:43:14,326 INFO [trainer.py:765] (1/8) Epoch 12, batch 1100, train_loss[loss=2.782, ArTop10Accuracy=0.7774, over 13779.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7695, over 11949.96 frames. ], batch size: 34, lr: 9.82e-03
2024-08-06 11:44:26,162 INFO [trainer.py:765] (1/8) Epoch 12, batch 1200, train_loss[loss=2.944, ArTop10Accuracy=0.7393, over 12054.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7701, over 11864.81 frames. ], batch size: 101, lr: 9.79e-03
2024-08-06 11:45:26,869 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 11:47:26,604 INFO [trainer.py:765] (1/8) Epoch 13, batch 100, train_loss[loss=2.844, ArTop10Accuracy=0.7621, over 14415.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.7719, over 4772.92 frames. ], batch size: 62, lr: 9.37e-03
2024-08-06 11:48:54,785 INFO [trainer.py:765] (1/8) Epoch 13, batch 200, train_loss[loss=2.752, ArTop10Accuracy=0.7792, over 13590.00 frames. ], tot_loss[loss=2.782, ArTop10Accuracy=0.7732, over 7754.83 frames. ], batch size: 34, lr: 9.34e-03
2024-08-06 11:50:20,521 INFO [trainer.py:765] (1/8) Epoch 13, batch 300, train_loss[loss=2.766, ArTop10Accuracy=0.7763, over 14256.00 frames. ], tot_loss[loss=2.776, ArTop10Accuracy=0.7748, over 9373.21 frames. ], batch size: 44, lr: 9.31e-03
2024-08-06 11:51:48,771 INFO [trainer.py:765] (1/8) Epoch 13, batch 400, train_loss[loss=2.624, ArTop10Accuracy=0.8076, over 10278.00 frames. ], tot_loss[loss=2.773, ArTop10Accuracy=0.7757, over 10275.22 frames. ], batch size: 14, lr: 9.28e-03
2024-08-06 11:53:13,412 INFO [trainer.py:765] (1/8) Epoch 13, batch 500, train_loss[loss=2.676, ArTop10Accuracy=0.798, over 12201.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.7763, over 10857.58 frames. ], batch size: 22, lr: 9.26e-03
2024-08-06 11:54:52,229 INFO [trainer.py:765] (1/8) Epoch 13, batch 600, train_loss[loss=2.698, ArTop10Accuracy=0.7882, over 11418.00 frames. ], tot_loss[loss=2.776, ArTop10Accuracy=0.7749, over 11355.82 frames. ], batch size: 18, lr: 9.23e-03
2024-08-06 11:55:47,086 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 11:55:56,834 INFO [trainer.py:811] (1/8) Epoch 13, validation: loss=2.824, ArTop10Accuracy=0.7662, over 1827537.00 frames. 
2024-08-06 11:55:56,835 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 33972MB
2024-08-06 11:55:57,718 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.064e+02 1.255e+02 1.343e+02 1.452e+02 4.888e+02, threshold=2.687e+02, percent-clipped=0.1
2024-08-06 11:56:28,471 INFO [trainer.py:765] (1/8) Epoch 13, batch 700, train_loss[loss=2.772, ArTop10Accuracy=0.7774, over 9309.00 frames. ], tot_loss[loss=2.783, ArTop10Accuracy=0.7735, over 11488.70 frames. ], batch size: 11, lr: 9.20e-03
2024-08-06 11:57:46,690 INFO [trainer.py:765] (1/8) Epoch 13, batch 800, train_loss[loss=2.713, ArTop10Accuracy=0.7875, over 10161.00 frames. ], tot_loss[loss=2.786, ArTop10Accuracy=0.773, over 11619.65 frames. ], batch size: 12, lr: 9.18e-03
2024-08-06 11:59:03,290 INFO [trainer.py:765] (1/8) Epoch 13, batch 900, train_loss[loss=2.756, ArTop10Accuracy=0.7807, over 13041.00 frames. ], tot_loss[loss=2.782, ArTop10Accuracy=0.7737, over 11686.03 frames. ], batch size: 27, lr: 9.15e-03
2024-08-06 12:00:19,180 INFO [trainer.py:765] (1/8) Epoch 13, batch 1000, train_loss[loss=2.81, ArTop10Accuracy=0.7706, over 12786.00 frames. ], tot_loss[loss=2.786, ArTop10Accuracy=0.773, over 11890.74 frames. ], batch size: 27, lr: 9.13e-03
2024-08-06 12:01:34,885 INFO [trainer.py:765] (1/8) Epoch 13, batch 1100, train_loss[loss=2.775, ArTop10Accuracy=0.7717, over 13575.00 frames. ], tot_loss[loss=2.797, ArTop10Accuracy=0.7709, over 11954.47 frames. ], batch size: 34, lr: 9.10e-03
2024-08-06 12:02:48,669 INFO [trainer.py:765] (1/8) Epoch 13, batch 1200, train_loss[loss=2.981, ArTop10Accuracy=0.7357, over 12087.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.7714, over 11877.57 frames. ], batch size: 101, lr: 9.08e-03
2024-08-06 12:03:48,616 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 12:05:45,337 INFO [trainer.py:765] (1/8) Epoch 14, batch 100, train_loss[loss=2.83, ArTop10Accuracy=0.766, over 14226.00 frames. ], tot_loss[loss=2.772, ArTop10Accuracy=0.7751, over 4764.55 frames. ], batch size: 62, lr: 8.71e-03
2024-08-06 12:07:16,607 INFO [trainer.py:765] (1/8) Epoch 14, batch 200, train_loss[loss=2.838, ArTop10Accuracy=0.7595, over 13674.00 frames. ], tot_loss[loss=2.762, ArTop10Accuracy=0.7773, over 7749.81 frames. ], batch size: 34, lr: 8.69e-03
2024-08-06 12:08:44,317 INFO [trainer.py:765] (1/8) Epoch 14, batch 300, train_loss[loss=2.832, ArTop10Accuracy=0.7616, over 14316.00 frames. ], tot_loss[loss=2.762, ArTop10Accuracy=0.777, over 9372.28 frames. ], batch size: 44, lr: 8.66e-03
2024-08-06 12:10:01,135 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.072e+02 1.266e+02 1.374e+02 1.483e+02 6.480e+02, threshold=2.748e+02, percent-clipped=0.2
2024-08-06 12:10:10,232 INFO [trainer.py:765] (1/8) Epoch 14, batch 400, train_loss[loss=2.79, ArTop10Accuracy=0.7714, over 10845.00 frames. ], tot_loss[loss=2.761, ArTop10Accuracy=0.7776, over 10295.62 frames. ], batch size: 15, lr: 8.64e-03
2024-08-06 12:11:36,157 INFO [trainer.py:765] (1/8) Epoch 14, batch 500, train_loss[loss=2.754, ArTop10Accuracy=0.7758, over 11991.00 frames. ], tot_loss[loss=2.759, ArTop10Accuracy=0.778, over 10856.86 frames. ], batch size: 22, lr: 8.62e-03
2024-08-06 12:13:06,000 INFO [trainer.py:765] (1/8) Epoch 14, batch 600, train_loss[loss=2.779, ArTop10Accuracy=0.7749, over 11514.00 frames. ], tot_loss[loss=2.762, ArTop10Accuracy=0.7771, over 11376.97 frames. ], batch size: 18, lr: 8.59e-03
2024-08-06 12:14:38,559 INFO [trainer.py:765] (1/8) Epoch 14, batch 700, train_loss[loss=2.725, ArTop10Accuracy=0.7884, over 9402.00 frames. ], tot_loss[loss=2.768, ArTop10Accuracy=0.776, over 11521.69 frames. ], batch size: 11, lr: 8.57e-03
2024-08-06 12:15:58,076 INFO [trainer.py:765] (1/8) Epoch 14, batch 800, train_loss[loss=2.679, ArTop10Accuracy=0.795, over 9294.00 frames. ], tot_loss[loss=2.771, ArTop10Accuracy=0.7755, over 11635.91 frames. ], batch size: 11, lr: 8.55e-03
2024-08-06 12:17:12,872 INFO [trainer.py:765] (1/8) Epoch 14, batch 900, train_loss[loss=2.759, ArTop10Accuracy=0.7819, over 13014.00 frames. ], tot_loss[loss=2.768, ArTop10Accuracy=0.7764, over 11670.90 frames. ], batch size: 27, lr: 8.52e-03
2024-08-06 12:18:29,618 INFO [trainer.py:765] (1/8) Epoch 14, batch 1000, train_loss[loss=2.777, ArTop10Accuracy=0.7752, over 13023.00 frames. ], tot_loss[loss=2.771, ArTop10Accuracy=0.7758, over 11869.41 frames. ], batch size: 27, lr: 8.50e-03
2024-08-06 12:19:45,382 INFO [trainer.py:765] (1/8) Epoch 14, batch 1100, train_loss[loss=2.772, ArTop10Accuracy=0.7752, over 13365.00 frames. ], tot_loss[loss=2.781, ArTop10Accuracy=0.7738, over 11956.46 frames. ], batch size: 34, lr: 8.48e-03
2024-08-06 12:20:59,284 INFO [trainer.py:765] (1/8) Epoch 14, batch 1200, train_loss[loss=2.913, ArTop10Accuracy=0.749, over 11832.00 frames. ], tot_loss[loss=2.779, ArTop10Accuracy=0.7742, over 11850.00 frames. ], batch size: 101, lr: 8.46e-03
2024-08-06 12:21:57,643 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 12:23:51,968 INFO [trainer.py:765] (1/8) Epoch 15, batch 100, train_loss[loss=2.838, ArTop10Accuracy=0.7632, over 14451.00 frames. ], tot_loss[loss=2.759, ArTop10Accuracy=0.7777, over 4747.89 frames. ], batch size: 62, lr: 8.14e-03
2024-08-06 12:24:00,605 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 12:24:10,290 INFO [trainer.py:811] (1/8) Epoch 15, validation: loss=2.819, ArTop10Accuracy=0.7675, over 1827537.00 frames. 
2024-08-06 12:24:10,290 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 33972MB
2024-08-06 12:24:11,100 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.080e+02 1.284e+02 1.371e+02 1.488e+02 4.667e+02, threshold=2.743e+02, percent-clipped=0.2
2024-08-06 12:25:29,992 INFO [trainer.py:765] (1/8) Epoch 15, batch 200, train_loss[loss=2.771, ArTop10Accuracy=0.7773, over 13839.00 frames. ], tot_loss[loss=2.76, ArTop10Accuracy=0.7774, over 7753.74 frames. ], batch size: 35, lr: 8.12e-03
2024-08-06 12:26:58,701 INFO [trainer.py:765] (1/8) Epoch 15, batch 300, train_loss[loss=2.794, ArTop10Accuracy=0.7749, over 13956.00 frames. ], tot_loss[loss=2.756, ArTop10Accuracy=0.7783, over 9373.07 frames. ], batch size: 44, lr: 8.09e-03
2024-08-06 12:28:28,538 INFO [trainer.py:765] (1/8) Epoch 15, batch 400, train_loss[loss=2.758, ArTop10Accuracy=0.777, over 10365.00 frames. ], tot_loss[loss=2.752, ArTop10Accuracy=0.7793, over 10266.45 frames. ], batch size: 14, lr: 8.07e-03
2024-08-06 12:29:54,036 INFO [trainer.py:765] (1/8) Epoch 15, batch 500, train_loss[loss=2.73, ArTop10Accuracy=0.7861, over 12609.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7801, over 10847.41 frames. ], batch size: 23, lr: 8.05e-03
2024-08-06 12:31:23,297 INFO [trainer.py:765] (1/8) Epoch 15, batch 600, train_loss[loss=2.736, ArTop10Accuracy=0.7852, over 11271.00 frames. ], tot_loss[loss=2.746, ArTop10Accuracy=0.7805, over 11369.05 frames. ], batch size: 18, lr: 8.03e-03
2024-08-06 12:32:53,180 INFO [trainer.py:765] (1/8) Epoch 15, batch 700, train_loss[loss=2.627, ArTop10Accuracy=0.8074, over 10299.00 frames. ], tot_loss[loss=2.752, ArTop10Accuracy=0.7793, over 11529.74 frames. ], batch size: 12, lr: 8.01e-03
2024-08-06 12:34:18,260 INFO [trainer.py:765] (1/8) Epoch 15, batch 800, train_loss[loss=2.647, ArTop10Accuracy=0.8006, over 9471.00 frames. ], tot_loss[loss=2.759, ArTop10Accuracy=0.7781, over 11626.40 frames. ], batch size: 11, lr: 7.99e-03
2024-08-06 12:35:34,733 INFO [trainer.py:765] (1/8) Epoch 15, batch 900, train_loss[loss=2.757, ArTop10Accuracy=0.7784, over 12828.00 frames. ], tot_loss[loss=2.753, ArTop10Accuracy=0.7794, over 11678.28 frames. ], batch size: 27, lr: 7.97e-03
2024-08-06 12:36:50,547 INFO [trainer.py:765] (1/8) Epoch 15, batch 1000, train_loss[loss=2.76, ArTop10Accuracy=0.7799, over 12825.00 frames. ], tot_loss[loss=2.76, ArTop10Accuracy=0.7781, over 11871.63 frames. ], batch size: 27, lr: 7.95e-03
2024-08-06 12:38:05,183 INFO [trainer.py:765] (1/8) Epoch 15, batch 1100, train_loss[loss=2.717, ArTop10Accuracy=0.785, over 13494.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.7765, over 11950.15 frames. ], batch size: 34, lr: 7.93e-03
2024-08-06 12:38:12,847 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.080e+02 1.293e+02 1.379e+02 1.467e+02 2.824e+02, threshold=2.759e+02, percent-clipped=0.1
2024-08-06 12:39:18,795 INFO [trainer.py:765] (1/8) Epoch 15, batch 1200, train_loss[loss=2.892, ArTop10Accuracy=0.7571, over 12477.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.7762, over 11866.65 frames. ], batch size: 101, lr: 7.91e-03
2024-08-06 12:40:18,961 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 12:42:17,623 INFO [trainer.py:765] (1/8) Epoch 16, batch 100, train_loss[loss=2.815, ArTop10Accuracy=0.765, over 14853.00 frames. ], tot_loss[loss=2.743, ArTop10Accuracy=0.781, over 4763.41 frames. ], batch size: 62, lr: 7.63e-03
2024-08-06 12:43:49,569 INFO [trainer.py:765] (1/8) Epoch 16, batch 200, train_loss[loss=2.731, ArTop10Accuracy=0.7824, over 13905.00 frames. ], tot_loss[loss=2.739, ArTop10Accuracy=0.7817, over 7738.07 frames. ], batch size: 35, lr: 7.61e-03
2024-08-06 12:45:18,507 INFO [trainer.py:765] (1/8) Epoch 16, batch 300, train_loss[loss=2.766, ArTop10Accuracy=0.7766, over 14256.00 frames. ], tot_loss[loss=2.735, ArTop10Accuracy=0.7825, over 9366.92 frames. ], batch size: 44, lr: 7.59e-03
2024-08-06 12:46:45,212 INFO [trainer.py:765] (1/8) Epoch 16, batch 400, train_loss[loss=2.789, ArTop10Accuracy=0.7706, over 10821.00 frames. ], tot_loss[loss=2.733, ArTop10Accuracy=0.7829, over 10278.07 frames. ], batch size: 15, lr: 7.58e-03
2024-08-06 12:48:16,316 INFO [trainer.py:765] (1/8) Epoch 16, batch 500, train_loss[loss=2.804, ArTop10Accuracy=0.771, over 12237.00 frames. ], tot_loss[loss=2.731, ArTop10Accuracy=0.7833, over 10829.44 frames. ], batch size: 22, lr: 7.56e-03
2024-08-06 12:49:46,648 INFO [trainer.py:765] (1/8) Epoch 16, batch 600, train_loss[loss=2.69, ArTop10Accuracy=0.794, over 11436.00 frames. ], tot_loss[loss=2.736, ArTop10Accuracy=0.7824, over 11359.23 frames. ], batch size: 18, lr: 7.54e-03
2024-08-06 12:51:23,687 INFO [trainer.py:765] (1/8) Epoch 16, batch 700, train_loss[loss=2.575, ArTop10Accuracy=0.8169, over 10113.00 frames. ], tot_loss[loss=2.737, ArTop10Accuracy=0.7824, over 11499.13 frames. ], batch size: 12, lr: 7.52e-03
2024-08-06 12:52:43,507 INFO [trainer.py:765] (1/8) Epoch 16, batch 800, train_loss[loss=2.739, ArTop10Accuracy=0.7811, over 9315.00 frames. ], tot_loss[loss=2.743, ArTop10Accuracy=0.7812, over 11618.42 frames. ], batch size: 11, lr: 7.51e-03
2024-08-06 12:53:06,020 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 12:53:15,496 INFO [trainer.py:811] (1/8) Epoch 16, validation: loss=2.816, ArTop10Accuracy=0.7678, over 1827537.00 frames. 
2024-08-06 12:53:15,496 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 33972MB
2024-08-06 12:53:16,191 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.112e+02 1.291e+02 1.391e+02 1.487e+02 3.459e+02, threshold=2.783e+02, percent-clipped=0.1
2024-08-06 12:54:06,487 INFO [trainer.py:765] (1/8) Epoch 16, batch 900, train_loss[loss=2.656, ArTop10Accuracy=0.7962, over 12948.00 frames. ], tot_loss[loss=2.738, ArTop10Accuracy=0.7822, over 11658.62 frames. ], batch size: 27, lr: 7.49e-03
2024-08-06 12:55:19,797 INFO [trainer.py:765] (1/8) Epoch 16, batch 1000, train_loss[loss=2.713, ArTop10Accuracy=0.7918, over 12969.00 frames. ], tot_loss[loss=2.746, ArTop10Accuracy=0.7808, over 11870.63 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 12:56:33,168 INFO [trainer.py:765] (1/8) Epoch 16, batch 1100, train_loss[loss=2.718, ArTop10Accuracy=0.7869, over 13368.00 frames. ], tot_loss[loss=2.757, ArTop10Accuracy=0.7786, over 11929.12 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 12:57:48,491 INFO [trainer.py:765] (1/8) Epoch 16, batch 1200, train_loss[loss=2.885, ArTop10Accuracy=0.7546, over 12060.00 frames. ], tot_loss[loss=2.755, ArTop10Accuracy=0.7789, over 11851.65 frames. ], batch size: 101, lr: 7.44e-03
2024-08-06 12:58:48,499 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 13:00:47,904 INFO [trainer.py:765] (1/8) Epoch 17, batch 100, train_loss[loss=2.817, ArTop10Accuracy=0.7671, over 14796.00 frames. ], tot_loss[loss=2.734, ArTop10Accuracy=0.7833, over 4759.58 frames. ], batch size: 62, lr: 7.18e-03
2024-08-06 13:02:19,308 INFO [trainer.py:765] (1/8) Epoch 17, batch 200, train_loss[loss=2.825, ArTop10Accuracy=0.7651, over 13587.00 frames. ], tot_loss[loss=2.733, ArTop10Accuracy=0.7833, over 7743.12 frames. ], batch size: 34, lr: 7.17e-03
2024-08-06 13:03:45,523 INFO [trainer.py:765] (1/8) Epoch 17, batch 300, train_loss[loss=2.757, ArTop10Accuracy=0.7768, over 14088.00 frames. ], tot_loss[loss=2.726, ArTop10Accuracy=0.7844, over 9362.71 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 13:05:21,767 INFO [trainer.py:765] (1/8) Epoch 17, batch 400, train_loss[loss=2.723, ArTop10Accuracy=0.7856, over 10410.00 frames. ], tot_loss[loss=2.725, ArTop10Accuracy=0.7844, over 10281.41 frames. ], batch size: 14, lr: 7.14e-03
2024-08-06 13:06:47,027 INFO [trainer.py:765] (1/8) Epoch 17, batch 500, train_loss[loss=2.711, ArTop10Accuracy=0.7892, over 12222.00 frames. ], tot_loss[loss=2.724, ArTop10Accuracy=0.7845, over 10840.39 frames. ], batch size: 22, lr: 7.12e-03
2024-08-06 13:07:39,882 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.140e+02 1.293e+02 1.386e+02 1.488e+02 3.253e+02, threshold=2.772e+02, percent-clipped=0.1
2024-08-06 13:08:22,694 INFO [trainer.py:765] (1/8) Epoch 17, batch 600, train_loss[loss=2.759, ArTop10Accuracy=0.779, over 11331.00 frames. ], tot_loss[loss=2.728, ArTop10Accuracy=0.7837, over 11352.01 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 13:09:54,842 INFO [trainer.py:765] (1/8) Epoch 17, batch 700, train_loss[loss=2.614, ArTop10Accuracy=0.8014, over 10098.00 frames. ], tot_loss[loss=2.734, ArTop10Accuracy=0.7825, over 11507.09 frames. ], batch size: 12, lr: 7.09e-03
2024-08-06 13:11:19,487 INFO [trainer.py:765] (1/8) Epoch 17, batch 800, train_loss[loss=2.756, ArTop10Accuracy=0.7715, over 10443.00 frames. ], tot_loss[loss=2.739, ArTop10Accuracy=0.7816, over 11651.32 frames. ], batch size: 12, lr: 7.07e-03
2024-08-06 13:12:35,676 INFO [trainer.py:765] (1/8) Epoch 17, batch 900, train_loss[loss=2.768, ArTop10Accuracy=0.7783, over 12885.00 frames. ], tot_loss[loss=2.733, ArTop10Accuracy=0.7827, over 11681.22 frames. ], batch size: 27, lr: 7.06e-03
2024-08-06 13:13:53,068 INFO [trainer.py:765] (1/8) Epoch 17, batch 1000, train_loss[loss=2.674, ArTop10Accuracy=0.7989, over 12906.00 frames. ], tot_loss[loss=2.738, ArTop10Accuracy=0.782, over 11879.24 frames. ], batch size: 27, lr: 7.04e-03
2024-08-06 13:15:08,490 INFO [trainer.py:765] (1/8) Epoch 17, batch 1100, train_loss[loss=2.717, ArTop10Accuracy=0.7841, over 13662.00 frames. ], tot_loss[loss=2.745, ArTop10Accuracy=0.7807, over 11971.73 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 13:16:22,394 INFO [trainer.py:765] (1/8) Epoch 17, batch 1200, train_loss[loss=2.911, ArTop10Accuracy=0.7513, over 11856.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7802, over 11884.24 frames. ], batch size: 103, lr: 7.01e-03
2024-08-06 13:17:21,749 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 13:19:16,000 INFO [trainer.py:765] (1/8) Epoch 18, batch 100, train_loss[loss=2.769, ArTop10Accuracy=0.7795, over 14613.00 frames. ], tot_loss[loss=2.722, ArTop10Accuracy=0.7849, over 4769.80 frames. ], batch size: 63, lr: 6.78e-03
2024-08-06 13:20:46,604 INFO [trainer.py:765] (1/8) Epoch 18, batch 200, train_loss[loss=2.692, ArTop10Accuracy=0.7902, over 13776.00 frames. ], tot_loss[loss=2.72, ArTop10Accuracy=0.7853, over 7773.05 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 13:21:55,111 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 13:22:04,751 INFO [trainer.py:811] (1/8) Epoch 18, validation: loss=2.817, ArTop10Accuracy=0.768, over 1827537.00 frames. 
2024-08-06 13:22:04,752 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 33972MB
2024-08-06 13:22:05,479 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.131e+02 1.323e+02 1.409e+02 1.514e+02 3.209e+02, threshold=2.818e+02, percent-clipped=0.1
2024-08-06 13:22:26,587 INFO [trainer.py:765] (1/8) Epoch 18, batch 300, train_loss[loss=2.765, ArTop10Accuracy=0.7781, over 14328.00 frames. ], tot_loss[loss=2.717, ArTop10Accuracy=0.7859, over 9385.66 frames. ], batch size: 44, lr: 6.76e-03
2024-08-06 13:23:57,935 INFO [trainer.py:765] (1/8) Epoch 18, batch 400, train_loss[loss=2.545, ArTop10Accuracy=0.8139, over 10287.00 frames. ], tot_loss[loss=2.717, ArTop10Accuracy=0.7857, over 10291.69 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 13:25:34,019 INFO [trainer.py:765] (1/8) Epoch 18, batch 500, train_loss[loss=2.681, ArTop10Accuracy=0.7954, over 12234.00 frames. ], tot_loss[loss=2.713, ArTop10Accuracy=0.7866, over 10834.26 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 13:27:00,640 INFO [trainer.py:765] (1/8) Epoch 18, batch 600, train_loss[loss=2.674, ArTop10Accuracy=0.794, over 11292.00 frames. ], tot_loss[loss=2.717, ArTop10Accuracy=0.7857, over 11355.11 frames. ], batch size: 18, lr: 6.71e-03
2024-08-06 13:28:33,588 INFO [trainer.py:765] (1/8) Epoch 18, batch 700, train_loss[loss=2.683, ArTop10Accuracy=0.7953, over 10089.00 frames. ], tot_loss[loss=2.721, ArTop10Accuracy=0.7849, over 11511.68 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 13:29:54,989 INFO [trainer.py:765] (1/8) Epoch 18, batch 800, train_loss[loss=2.677, ArTop10Accuracy=0.7978, over 10206.00 frames. ], tot_loss[loss=2.725, ArTop10Accuracy=0.7842, over 11652.90 frames. ], batch size: 12, lr: 6.68e-03
2024-08-06 13:31:12,525 INFO [trainer.py:765] (1/8) Epoch 18, batch 900, train_loss[loss=2.734, ArTop10Accuracy=0.7881, over 12990.00 frames. ], tot_loss[loss=2.721, ArTop10Accuracy=0.7853, over 11691.99 frames. ], batch size: 27, lr: 6.67e-03
2024-08-06 13:32:26,557 INFO [trainer.py:765] (1/8) Epoch 18, batch 1000, train_loss[loss=2.804, ArTop10Accuracy=0.77, over 12660.00 frames. ], tot_loss[loss=2.727, ArTop10Accuracy=0.7841, over 11895.92 frames. ], batch size: 27, lr: 6.66e-03
2024-08-06 13:33:41,503 INFO [trainer.py:765] (1/8) Epoch 18, batch 1100, train_loss[loss=2.752, ArTop10Accuracy=0.7808, over 13716.00 frames. ], tot_loss[loss=2.736, ArTop10Accuracy=0.7824, over 11960.66 frames. ], batch size: 35, lr: 6.64e-03
2024-08-06 13:34:54,680 INFO [trainer.py:765] (1/8) Epoch 18, batch 1200, train_loss[loss=2.896, ArTop10Accuracy=0.753, over 12399.00 frames. ], tot_loss[loss=2.737, ArTop10Accuracy=0.7821, over 11856.69 frames. ], batch size: 101, lr: 6.63e-03
2024-08-06 13:35:51,070 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.124e+02 1.340e+02 1.433e+02 1.533e+02 2.444e+02, threshold=2.867e+02, percent-clipped=0.0
2024-08-06 13:35:54,948 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 13:37:48,630 INFO [trainer.py:765] (1/8) Epoch 19, batch 100, train_loss[loss=2.699, ArTop10Accuracy=0.7911, over 14262.00 frames. ], tot_loss[loss=2.715, ArTop10Accuracy=0.7858, over 4758.72 frames. ], batch size: 63, lr: 6.43e-03
2024-08-06 13:39:23,263 INFO [trainer.py:765] (1/8) Epoch 19, batch 200, train_loss[loss=2.776, ArTop10Accuracy=0.7724, over 13521.00 frames. ], tot_loss[loss=2.709, ArTop10Accuracy=0.7868, over 7762.49 frames. ], batch size: 34, lr: 6.41e-03
2024-08-06 13:40:48,365 INFO [trainer.py:765] (1/8) Epoch 19, batch 300, train_loss[loss=2.801, ArTop10Accuracy=0.768, over 13980.00 frames. ], tot_loss[loss=2.712, ArTop10Accuracy=0.7867, over 9377.16 frames. ], batch size: 44, lr: 6.40e-03
2024-08-06 13:42:21,073 INFO [trainer.py:765] (1/8) Epoch 19, batch 400, train_loss[loss=2.678, ArTop10Accuracy=0.7925, over 10728.00 frames. ], tot_loss[loss=2.705, ArTop10Accuracy=0.7881, over 10287.82 frames. ], batch size: 15, lr: 6.39e-03
2024-08-06 13:43:44,961 INFO [trainer.py:765] (1/8) Epoch 19, batch 500, train_loss[loss=2.629, ArTop10Accuracy=0.8012, over 12252.00 frames. ], tot_loss[loss=2.701, ArTop10Accuracy=0.7885, over 10839.43 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 13:45:16,687 INFO [trainer.py:765] (1/8) Epoch 19, batch 600, train_loss[loss=2.795, ArTop10Accuracy=0.7709, over 11283.00 frames. ], tot_loss[loss=2.704, ArTop10Accuracy=0.7882, over 11371.27 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 13:46:48,328 INFO [trainer.py:765] (1/8) Epoch 19, batch 700, train_loss[loss=2.647, ArTop10Accuracy=0.7986, over 10188.00 frames. ], tot_loss[loss=2.712, ArTop10Accuracy=0.7867, over 11507.32 frames. ], batch size: 12, lr: 6.35e-03
2024-08-06 13:48:11,890 INFO [trainer.py:765] (1/8) Epoch 19, batch 800, train_loss[loss=2.643, ArTop10Accuracy=0.8003, over 10218.00 frames. ], tot_loss[loss=2.715, ArTop10Accuracy=0.7864, over 11619.30 frames. ], batch size: 12, lr: 6.34e-03
2024-08-06 13:49:27,263 INFO [trainer.py:765] (1/8) Epoch 19, batch 900, train_loss[loss=2.758, ArTop10Accuracy=0.777, over 13005.00 frames. ], tot_loss[loss=2.712, ArTop10Accuracy=0.7869, over 11663.49 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 13:50:40,658 INFO [trainer.py:803] (1/8) Computing validation loss
2024-08-06 13:50:50,535 INFO [trainer.py:811] (1/8) Epoch 19, validation: loss=2.818, ArTop10Accuracy=0.7679, over 1827537.00 frames. 
2024-08-06 13:50:50,536 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 33972MB
2024-08-06 13:50:51,493 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.161e+02 1.371e+02 1.455e+02 1.550e+02 3.697e+02, threshold=2.909e+02, percent-clipped=0.2
2024-08-06 13:50:52,923 INFO [trainer.py:765] (1/8) Epoch 19, batch 1000, train_loss[loss=2.812, ArTop10Accuracy=0.7639, over 12567.00 frames. ], tot_loss[loss=2.718, ArTop10Accuracy=0.7857, over 11873.45 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 13:52:08,273 INFO [trainer.py:765] (1/8) Epoch 19, batch 1100, train_loss[loss=2.733, ArTop10Accuracy=0.7851, over 13866.00 frames. ], tot_loss[loss=2.727, ArTop10Accuracy=0.7839, over 11972.86 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 13:53:22,319 INFO [trainer.py:765] (1/8) Epoch 19, batch 1200, train_loss[loss=2.845, ArTop10Accuracy=0.7572, over 12054.00 frames. ], tot_loss[loss=2.728, ArTop10Accuracy=0.7836, over 11871.82 frames. ], batch size: 101, lr: 6.28e-03
2024-08-06 13:54:21,985 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 13:56:12,911 INFO [trainer.py:765] (1/8) Epoch 20, batch 100, train_loss[loss=2.773, ArTop10Accuracy=0.7772, over 14652.00 frames. ], tot_loss[loss=2.711, ArTop10Accuracy=0.7862, over 4761.24 frames. ], batch size: 62, lr: 6.10e-03
2024-08-06 13:57:42,499 INFO [trainer.py:765] (1/8) Epoch 20, batch 200, train_loss[loss=2.695, ArTop10Accuracy=0.7874, over 13536.00 frames. ], tot_loss[loss=2.704, ArTop10Accuracy=0.7882, over 7748.16 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 13:59:15,436 INFO [trainer.py:765] (1/8) Epoch 20, batch 300, train_loss[loss=2.737, ArTop10Accuracy=0.779, over 14100.00 frames. ], tot_loss[loss=2.7, ArTop10Accuracy=0.7891, over 9381.75 frames. ], batch size: 44, lr: 6.08e-03
2024-08-06 14:00:44,362 INFO [trainer.py:765] (1/8) Epoch 20, batch 400, train_loss[loss=2.589, ArTop10Accuracy=0.8097, over 10383.00 frames. ], tot_loss[loss=2.698, ArTop10Accuracy=0.7892, over 10302.26 frames. ], batch size: 14, lr: 6.07e-03
2024-08-06 14:02:14,860 INFO [trainer.py:765] (1/8) Epoch 20, batch 500, train_loss[loss=2.766, ArTop10Accuracy=0.7745, over 12717.00 frames. ], tot_loss[loss=2.697, ArTop10Accuracy=0.7896, over 10858.59 frames. ], batch size: 23, lr: 6.06e-03
2024-08-06 14:03:40,860 INFO [trainer.py:765] (1/8) Epoch 20, batch 600, train_loss[loss=2.662, ArTop10Accuracy=0.7981, over 11499.00 frames. ], tot_loss[loss=2.699, ArTop10Accuracy=0.7891, over 11367.87 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 14:05:13,872 INFO [trainer.py:765] (1/8) Epoch 20, batch 700, train_loss[loss=2.667, ArTop10Accuracy=0.7949, over 10287.00 frames. ], tot_loss[loss=2.706, ArTop10Accuracy=0.7879, over 11530.59 frames. ], batch size: 12, lr: 6.03e-03
2024-08-06 14:05:30,795 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.180e+02 1.365e+02 1.456e+02 1.550e+02 3.525e+02, threshold=2.913e+02, percent-clipped=0.1
2024-08-06 14:06:34,515 INFO [trainer.py:765] (1/8) Epoch 20, batch 800, train_loss[loss=2.604, ArTop10Accuracy=0.8108, over 9273.00 frames. ], tot_loss[loss=2.713, ArTop10Accuracy=0.7865, over 11640.77 frames. ], batch size: 11, lr: 6.02e-03
2024-08-06 14:07:50,950 INFO [trainer.py:765] (1/8) Epoch 20, batch 900, train_loss[loss=2.748, ArTop10Accuracy=0.7846, over 12843.00 frames. ], tot_loss[loss=2.704, ArTop10Accuracy=0.7883, over 11683.45 frames. ], batch size: 27, lr: 6.01e-03
2024-08-06 14:09:07,180 INFO [trainer.py:765] (1/8) Epoch 20, batch 1000, train_loss[loss=2.718, ArTop10Accuracy=0.7892, over 12714.00 frames. ], tot_loss[loss=2.706, ArTop10Accuracy=0.7881, over 11882.84 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 14:10:21,216 INFO [trainer.py:765] (1/8) Epoch 20, batch 1100, train_loss[loss=2.752, ArTop10Accuracy=0.7788, over 14235.00 frames. ], tot_loss[loss=2.712, ArTop10Accuracy=0.7871, over 11959.70 frames. ], batch size: 35, lr: 5.99e-03
2024-08-06 14:11:37,820 INFO [trainer.py:765] (1/8) Epoch 20, batch 1200, train_loss[loss=2.856, ArTop10Accuracy=0.758, over 12105.00 frames. ], tot_loss[loss=2.715, ArTop10Accuracy=0.7866, over 11876.99 frames. ], batch size: 101, lr: 5.98e-03
2024-08-06 14:12:36,847 INFO [trainer.py:650] (1/8) Reaches end of dataloader.
2024-08-06 14:12:36,850 INFO [trainer.py:1069] (1/8) Done!