2024-08-06 03:39:40,318 INFO [trainer.py:870] (2/8) Training started 2024-08-06 03:39:40,319 INFO [trainer.py:889] (2/8) Device: cuda:2 2024-08-06 03:39:40,319 INFO [trainer.py:890] (2/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': 'main', 'icefall-git-sha1': '7d2e5f4-dirty', 'icefall-git-date': 'Tue Aug 6 02:59:12 2024', 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6865771', 'IP address': '0.104.195.107'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 1000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000} 2024-08-06 03:39:40,319 INFO [trainer.py:892] (2/8) About to create model 2024-08-06 03:39:41,079 INFO [trainer.py:899] (2/8) Number of model parameters: 367386628 2024-08-06 03:39:41,905 INFO [trainer.py:914] (2/8) Using DDP 2024-08-06 03:39:43,993 INFO [datamodule.py:427] (2/8) About to get train cuts 2024-08-06 03:39:43,995 INFO [datamodule.py:434] (2/8) About to get dev cuts 2024-08-06 03:39:43,997 INFO [datamodule.py:292] (2/8) Disable SpecAugment 2024-08-06 03:39:43,997 INFO [datamodule.py:294] (2/8) About to create train dataset 2024-08-06 03:39:43,998 INFO [datamodule.py:323] (2/8) Using DynamicBucketingSampler 2024-08-06 03:39:44,608 INFO [datamodule.py:344] (2/8) About to create train dataloader 2024-08-06 03:39:44,608 INFO [datamodule.py:367] (2/8) About to create dev dataset 2024-08-06 03:39:44,934 INFO [datamodule.py:388] (2/8) About to create dev dataloader 2024-08-06 03:40:39,570 INFO [trainer.py:765] (2/8) Epoch 1, batch 100, train_loss[loss=4.238, ArTop10Accuracy=0.4902, over 14599.00 frames. ], tot_loss[loss=4.784, ArTop10Accuracy=0.3953, over 4788.28 frames. ], batch size: 61, lr: 2.25e-02 2024-08-06 03:41:16,922 INFO [trainer.py:765] (2/8) Epoch 1, batch 200, train_loss[loss=3.97, ArTop10Accuracy=0.5311, over 13617.00 frames. ], tot_loss[loss=4.305, ArTop10Accuracy=0.4754, over 7786.16 frames. ], batch size: 34, lr: 3.00e-02 2024-08-06 03:41:57,950 INFO [trainer.py:765] (2/8) Epoch 1, batch 300, train_loss[loss=3.828, ArTop10Accuracy=0.5518, over 14061.00 frames. ], tot_loss[loss=4.093, ArTop10Accuracy=0.5099, over 9425.13 frames. ], batch size: 44, lr: 3.00e-02 2024-08-06 03:42:33,080 INFO [trainer.py:765] (2/8) Epoch 1, batch 400, train_loss[loss=3.717, ArTop10Accuracy=0.5717, over 11056.00 frames. ], tot_loss[loss=3.942, ArTop10Accuracy=0.5348, over 10340.47 frames. ], batch size: 15, lr: 3.00e-02 2024-08-06 03:43:11,271 INFO [trainer.py:765] (2/8) Epoch 1, batch 500, train_loss[loss=3.542, ArTop10Accuracy=0.6033, over 12246.00 frames. ], tot_loss[loss=3.831, ArTop10Accuracy=0.5535, over 10902.49 frames. ], batch size: 22, lr: 2.99e-02 2024-08-06 03:43:46,593 INFO [trainer.py:765] (2/8) Epoch 1, batch 600, train_loss[loss=3.484, ArTop10Accuracy=0.6182, over 11327.00 frames. ], tot_loss[loss=3.746, ArTop10Accuracy=0.568, over 11421.62 frames. ], batch size: 18, lr: 2.99e-02 2024-08-06 03:44:27,899 INFO [trainer.py:765] (2/8) Epoch 1, batch 700, train_loss[loss=3.646, ArTop10Accuracy=0.5823, over 10104.00 frames. ], tot_loss[loss=3.685, ArTop10Accuracy=0.579, over 11567.62 frames. ], batch size: 12, lr: 2.99e-02 2024-08-06 03:45:01,514 INFO [trainer.py:765] (2/8) Epoch 1, batch 800, train_loss[loss=3.438, ArTop10Accuracy=0.6215, over 10209.00 frames. ], tot_loss[loss=3.637, ArTop10Accuracy=0.5874, over 11679.20 frames. ], batch size: 12, lr: 2.98e-02 2024-08-06 03:45:32,558 INFO [trainer.py:765] (2/8) Epoch 1, batch 900, train_loss[loss=3.594, ArTop10Accuracy=0.5968, over 12892.00 frames. ], tot_loss[loss=3.583, ArTop10Accuracy=0.5973, over 11727.14 frames. ], batch size: 27, lr: 2.98e-02 2024-08-06 03:46:03,649 INFO [trainer.py:765] (2/8) Epoch 1, batch 1000, train_loss[loss=3.459, ArTop10Accuracy=0.62, over 12809.00 frames. ], tot_loss[loss=3.55, ArTop10Accuracy=0.6035, over 11929.18 frames. ], batch size: 27, lr: 2.97e-02 2024-08-06 03:46:07,988 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 8.169e+01 1.565e+02 2.239e+02 3.485e+02 9.105e+03, threshold=4.478e+02, percent-clipped=0.0 2024-08-06 03:46:38,612 INFO [trainer.py:765] (2/8) Epoch 1, batch 1100, train_loss[loss=3.54, ArTop10Accuracy=0.6066, over 13569.00 frames. ], tot_loss[loss=3.526, ArTop10Accuracy=0.6082, over 11994.07 frames. ], batch size: 34, lr: 2.96e-02 2024-08-06 03:47:08,745 INFO [trainer.py:765] (2/8) Epoch 1, batch 1200, train_loss[loss=3.542, ArTop10Accuracy=0.606, over 11052.00 frames. ], tot_loss[loss=3.496, ArTop10Accuracy=0.6136, over 11929.77 frames. ], batch size: 99, lr: 2.96e-02 2024-08-06 03:47:33,759 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 03:48:38,677 INFO [trainer.py:765] (2/8) Epoch 2, batch 100, train_loss[loss=3.423, ArTop10Accuracy=0.6282, over 14455.00 frames. ], tot_loss[loss=3.441, ArTop10Accuracy=0.6243, over 4794.60 frames. ], batch size: 61, lr: 2.90e-02 2024-08-06 03:49:14,597 INFO [trainer.py:765] (2/8) Epoch 2, batch 200, train_loss[loss=3.527, ArTop10Accuracy=0.6072, over 13942.00 frames. ], tot_loss[loss=3.438, ArTop10Accuracy=0.6251, over 7807.13 frames. ], batch size: 34, lr: 2.89e-02 2024-08-06 03:49:56,520 INFO [trainer.py:765] (2/8) Epoch 2, batch 300, train_loss[loss=3.462, ArTop10Accuracy=0.6197, over 14367.00 frames. ], tot_loss[loss=3.426, ArTop10Accuracy=0.6271, over 9433.56 frames. ], batch size: 44, lr: 2.89e-02 2024-08-06 03:50:32,000 INFO [trainer.py:765] (2/8) Epoch 2, batch 400, train_loss[loss=3.259, ArTop10Accuracy=0.6613, over 10127.00 frames. ], tot_loss[loss=3.414, ArTop10Accuracy=0.6297, over 10340.39 frames. ], batch size: 14, lr: 2.88e-02 2024-08-06 03:51:17,110 INFO [trainer.py:765] (2/8) Epoch 2, batch 500, train_loss[loss=3.289, ArTop10Accuracy=0.6452, over 12078.00 frames. ], tot_loss[loss=3.407, ArTop10Accuracy=0.631, over 10899.71 frames. ], batch size: 22, lr: 2.87e-02 2024-08-06 03:51:53,206 INFO [trainer.py:765] (2/8) Epoch 2, batch 600, train_loss[loss=3.37, ArTop10Accuracy=0.6379, over 11569.00 frames. ], tot_loss[loss=3.397, ArTop10Accuracy=0.633, over 11439.27 frames. ], batch size: 18, lr: 2.86e-02 2024-08-06 03:52:38,994 INFO [trainer.py:765] (2/8) Epoch 2, batch 700, train_loss[loss=3.362, ArTop10Accuracy=0.6345, over 10167.00 frames. ], tot_loss[loss=3.396, ArTop10Accuracy=0.6331, over 11591.15 frames. ], batch size: 12, lr: 2.85e-02 2024-08-06 03:52:47,092 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 03:52:56,023 INFO [trainer.py:811] (2/8) Epoch 2, validation: loss=3.327, ArTop10Accuracy=0.6492, over 1829298.00 frames. 2024-08-06 03:52:56,024 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 28804MB 2024-08-06 03:52:56,541 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 8.181e+01 1.431e+02 1.849e+02 2.730e+02 2.344e+03, threshold=3.697e+02, percent-clipped=7.2 2024-08-06 03:53:21,881 INFO [trainer.py:765] (2/8) Epoch 2, batch 800, train_loss[loss=3.249, ArTop10Accuracy=0.6537, over 10208.00 frames. ], tot_loss[loss=3.386, ArTop10Accuracy=0.6347, over 11706.94 frames. ], batch size: 12, lr: 2.84e-02 2024-08-06 03:53:53,299 INFO [trainer.py:765] (2/8) Epoch 2, batch 900, train_loss[loss=3.354, ArTop10Accuracy=0.6334, over 12977.00 frames. ], tot_loss[loss=3.37, ArTop10Accuracy=0.638, over 11741.79 frames. ], batch size: 27, lr: 2.83e-02 2024-08-06 03:54:24,809 INFO [trainer.py:765] (2/8) Epoch 2, batch 1000, train_loss[loss=3.322, ArTop10Accuracy=0.6474, over 13032.00 frames. ], tot_loss[loss=3.363, ArTop10Accuracy=0.6394, over 11951.43 frames. ], batch size: 27, lr: 2.82e-02 2024-08-06 03:54:56,006 INFO [trainer.py:765] (2/8) Epoch 2, batch 1100, train_loss[loss=3.301, ArTop10Accuracy=0.6477, over 13736.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.6397, over 11995.67 frames. ], batch size: 34, lr: 2.81e-02 2024-08-06 03:55:26,228 INFO [trainer.py:765] (2/8) Epoch 2, batch 1200, train_loss[loss=3.427, ArTop10Accuracy=0.632, over 12590.00 frames. ], tot_loss[loss=3.354, ArTop10Accuracy=0.6413, over 11940.59 frames. ], batch size: 99, lr: 2.80e-02 2024-08-06 03:55:51,132 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 03:57:04,102 INFO [trainer.py:765] (2/8) Epoch 3, batch 100, train_loss[loss=3.315, ArTop10Accuracy=0.6471, over 14358.00 frames. ], tot_loss[loss=3.315, ArTop10Accuracy=0.6487, over 4780.00 frames. ], batch size: 61, lr: 2.67e-02 2024-08-06 03:57:50,979 INFO [trainer.py:765] (2/8) Epoch 3, batch 200, train_loss[loss=3.234, ArTop10Accuracy=0.666, over 13654.00 frames. ], tot_loss[loss=3.293, ArTop10Accuracy=0.6534, over 7796.59 frames. ], batch size: 34, lr: 2.66e-02 2024-08-06 03:58:26,074 INFO [trainer.py:765] (2/8) Epoch 3, batch 300, train_loss[loss=3.247, ArTop10Accuracy=0.6639, over 14242.00 frames. ], tot_loss[loss=3.279, ArTop10Accuracy=0.6563, over 9417.80 frames. ], batch size: 44, lr: 2.64e-02 2024-08-06 03:59:11,253 INFO [trainer.py:765] (2/8) Epoch 3, batch 400, train_loss[loss=3.091, ArTop10Accuracy=0.6961, over 10417.00 frames. ], tot_loss[loss=3.26, ArTop10Accuracy=0.6596, over 10337.38 frames. ], batch size: 14, lr: 2.63e-02 2024-08-06 03:59:29,674 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 8.720e+01 1.461e+02 1.775e+02 2.344e+02 9.150e+02, threshold=3.550e+02, percent-clipped=5.2 2024-08-06 03:59:49,303 INFO [trainer.py:765] (2/8) Epoch 3, batch 500, train_loss[loss=3.186, ArTop10Accuracy=0.6673, over 12323.00 frames. ], tot_loss[loss=3.249, ArTop10Accuracy=0.6617, over 10910.34 frames. ], batch size: 22, lr: 2.62e-02 2024-08-06 04:00:35,095 INFO [trainer.py:765] (2/8) Epoch 3, batch 600, train_loss[loss=3.144, ArTop10Accuracy=0.6824, over 11789.00 frames. ], tot_loss[loss=3.23, ArTop10Accuracy=0.6655, over 11415.53 frames. ], batch size: 18, lr: 2.61e-02 2024-08-06 04:01:22,060 INFO [trainer.py:765] (2/8) Epoch 3, batch 700, train_loss[loss=3.153, ArTop10Accuracy=0.6802, over 10131.00 frames. ], tot_loss[loss=3.226, ArTop10Accuracy=0.6665, over 11576.97 frames. ], batch size: 12, lr: 2.60e-02 2024-08-06 04:01:56,270 INFO [trainer.py:765] (2/8) Epoch 3, batch 800, train_loss[loss=2.947, ArTop10Accuracy=0.7092, over 10105.00 frames. ], tot_loss[loss=3.216, ArTop10Accuracy=0.6688, over 11691.18 frames. ], batch size: 12, lr: 2.59e-02 2024-08-06 04:02:27,741 INFO [trainer.py:765] (2/8) Epoch 3, batch 900, train_loss[loss=3.278, ArTop10Accuracy=0.6582, over 12969.00 frames. ], tot_loss[loss=3.199, ArTop10Accuracy=0.6721, over 11735.44 frames. ], batch size: 27, lr: 2.57e-02 2024-08-06 04:02:59,284 INFO [trainer.py:765] (2/8) Epoch 3, batch 1000, train_loss[loss=3.106, ArTop10Accuracy=0.6993, over 13160.00 frames. ], tot_loss[loss=3.195, ArTop10Accuracy=0.6727, over 11942.81 frames. ], batch size: 27, lr: 2.56e-02 2024-08-06 04:03:30,942 INFO [trainer.py:765] (2/8) Epoch 3, batch 1100, train_loss[loss=3.136, ArTop10Accuracy=0.6876, over 13802.00 frames. ], tot_loss[loss=3.193, ArTop10Accuracy=0.6733, over 11997.13 frames. ], batch size: 34, lr: 2.55e-02 2024-08-06 04:04:01,313 INFO [trainer.py:765] (2/8) Epoch 3, batch 1200, train_loss[loss=3.282, ArTop10Accuracy=0.6577, over 12318.00 frames. ], tot_loss[loss=3.183, ArTop10Accuracy=0.6751, over 11945.69 frames. ], batch size: 99, lr: 2.54e-02 2024-08-06 04:04:26,694 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:05:43,369 INFO [trainer.py:765] (2/8) Epoch 4, batch 100, train_loss[loss=3.177, ArTop10Accuracy=0.6706, over 14302.00 frames. ], tot_loss[loss=3.14, ArTop10Accuracy=0.6842, over 4789.47 frames. ], batch size: 61, lr: 2.38e-02 2024-08-06 04:06:07,077 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 04:06:16,404 INFO [trainer.py:811] (2/8) Epoch 4, validation: loss=3.063, ArTop10Accuracy=0.7031, over 1829298.00 frames. 2024-08-06 04:06:16,404 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 04:06:16,746 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.091e+02 1.493e+02 1.709e+02 2.068e+02 7.969e+02, threshold=3.418e+02, percent-clipped=2.9 2024-08-06 04:06:31,826 INFO [trainer.py:765] (2/8) Epoch 4, batch 200, train_loss[loss=3.072, ArTop10Accuracy=0.7041, over 13627.00 frames. ], tot_loss[loss=3.122, ArTop10Accuracy=0.6886, over 7782.21 frames. ], batch size: 34, lr: 2.37e-02 2024-08-06 04:07:18,544 INFO [trainer.py:765] (2/8) Epoch 4, batch 300, train_loss[loss=3.159, ArTop10Accuracy=0.6799, over 14517.00 frames. ], tot_loss[loss=3.116, ArTop10Accuracy=0.6894, over 9421.67 frames. ], batch size: 44, lr: 2.36e-02 2024-08-06 04:08:01,910 INFO [trainer.py:765] (2/8) Epoch 4, batch 400, train_loss[loss=3.091, ArTop10Accuracy=0.6954, over 10874.00 frames. ], tot_loss[loss=3.109, ArTop10Accuracy=0.69, over 10347.99 frames. ], batch size: 15, lr: 2.34e-02 2024-08-06 04:08:45,344 INFO [trainer.py:765] (2/8) Epoch 4, batch 500, train_loss[loss=3.166, ArTop10Accuracy=0.6733, over 12378.00 frames. ], tot_loss[loss=3.105, ArTop10Accuracy=0.6905, over 10907.40 frames. ], batch size: 22, lr: 2.33e-02 2024-08-06 04:09:37,071 INFO [trainer.py:765] (2/8) Epoch 4, batch 600, train_loss[loss=3.147, ArTop10Accuracy=0.6757, over 11511.00 frames. ], tot_loss[loss=3.11, ArTop10Accuracy=0.6894, over 11418.96 frames. ], batch size: 18, lr: 2.32e-02 2024-08-06 04:10:13,501 INFO [trainer.py:765] (2/8) Epoch 4, batch 700, train_loss[loss=3.185, ArTop10Accuracy=0.6861, over 10217.00 frames. ], tot_loss[loss=3.117, ArTop10Accuracy=0.6885, over 11566.09 frames. ], batch size: 12, lr: 2.31e-02 2024-08-06 04:10:51,959 INFO [trainer.py:765] (2/8) Epoch 4, batch 800, train_loss[loss=3.132, ArTop10Accuracy=0.689, over 10153.00 frames. ], tot_loss[loss=3.116, ArTop10Accuracy=0.6886, over 11678.64 frames. ], batch size: 12, lr: 2.30e-02 2024-08-06 04:11:23,331 INFO [trainer.py:765] (2/8) Epoch 4, batch 900, train_loss[loss=3.056, ArTop10Accuracy=0.701, over 13147.00 frames. ], tot_loss[loss=3.106, ArTop10Accuracy=0.6906, over 11728.67 frames. ], batch size: 28, lr: 2.29e-02 2024-08-06 04:11:54,826 INFO [trainer.py:765] (2/8) Epoch 4, batch 1000, train_loss[loss=2.987, ArTop10Accuracy=0.7195, over 12903.00 frames. ], tot_loss[loss=3.102, ArTop10Accuracy=0.691, over 11945.47 frames. ], batch size: 27, lr: 2.28e-02 2024-08-06 04:12:25,959 INFO [trainer.py:765] (2/8) Epoch 4, batch 1100, train_loss[loss=3.082, ArTop10Accuracy=0.693, over 13680.00 frames. ], tot_loss[loss=3.107, ArTop10Accuracy=0.6902, over 11976.90 frames. ], batch size: 34, lr: 2.26e-02 2024-08-06 04:12:48,545 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.106e+02 1.440e+02 1.608e+02 1.893e+02 7.925e+02, threshold=3.216e+02, percent-clipped=2.0 2024-08-06 04:12:58,827 INFO [trainer.py:765] (2/8) Epoch 4, batch 1200, train_loss[loss=3.199, ArTop10Accuracy=0.675, over 11681.00 frames. ], tot_loss[loss=3.103, ArTop10Accuracy=0.6913, over 11926.91 frames. ], batch size: 98, lr: 2.25e-02 2024-08-06 04:13:24,340 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:14:38,685 INFO [trainer.py:765] (2/8) Epoch 5, batch 100, train_loss[loss=3.067, ArTop10Accuracy=0.6987, over 14655.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7007, over 4800.60 frames. ], batch size: 61, lr: 2.10e-02 2024-08-06 04:15:26,826 INFO [trainer.py:765] (2/8) Epoch 5, batch 200, train_loss[loss=3.074, ArTop10Accuracy=0.7022, over 13615.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7018, over 7801.70 frames. ], batch size: 34, lr: 2.09e-02 2024-08-06 04:16:08,011 INFO [trainer.py:765] (2/8) Epoch 5, batch 300, train_loss[loss=3.081, ArTop10Accuracy=0.6943, over 14161.00 frames. ], tot_loss[loss=3.047, ArTop10Accuracy=0.7033, over 9412.98 frames. ], batch size: 44, lr: 2.08e-02 2024-08-06 04:16:53,134 INFO [trainer.py:765] (2/8) Epoch 5, batch 400, train_loss[loss=3.186, ArTop10Accuracy=0.6745, over 10281.00 frames. ], tot_loss[loss=3.051, ArTop10Accuracy=0.7019, over 10329.99 frames. ], batch size: 14, lr: 2.07e-02 2024-08-06 04:17:36,638 INFO [trainer.py:765] (2/8) Epoch 5, batch 500, train_loss[loss=3.071, ArTop10Accuracy=0.6972, over 12362.00 frames. ], tot_loss[loss=3.048, ArTop10Accuracy=0.7025, over 10896.69 frames. ], batch size: 22, lr: 2.06e-02 2024-08-06 04:18:22,114 INFO [trainer.py:765] (2/8) Epoch 5, batch 600, train_loss[loss=3.095, ArTop10Accuracy=0.6862, over 11747.00 frames. ], tot_loss[loss=3.048, ArTop10Accuracy=0.7022, over 11433.37 frames. ], batch size: 18, lr: 2.05e-02 2024-08-06 04:19:17,033 INFO [trainer.py:765] (2/8) Epoch 5, batch 700, train_loss[loss=2.958, ArTop10Accuracy=0.7146, over 10286.00 frames. ], tot_loss[loss=3.058, ArTop10Accuracy=0.7003, over 11570.08 frames. ], batch size: 12, lr: 2.04e-02 2024-08-06 04:19:51,066 INFO [trainer.py:765] (2/8) Epoch 5, batch 800, train_loss[loss=3.108, ArTop10Accuracy=0.6899, over 10094.00 frames. ], tot_loss[loss=3.062, ArTop10Accuracy=0.6995, over 11685.00 frames. ], batch size: 12, lr: 2.03e-02 2024-08-06 04:20:18,214 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 04:20:27,476 INFO [trainer.py:811] (2/8) Epoch 5, validation: loss=2.998, ArTop10Accuracy=0.7157, over 1829298.00 frames. 2024-08-06 04:20:27,476 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 04:20:27,781 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.057e+02 1.385e+02 1.542e+02 1.759e+02 7.741e+02, threshold=3.083e+02, percent-clipped=0.7 2024-08-06 04:20:31,767 INFO [trainer.py:765] (2/8) Epoch 5, batch 900, train_loss[loss=3.109, ArTop10Accuracy=0.6952, over 12959.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.7008, over 11728.45 frames. ], batch size: 27, lr: 2.02e-02 2024-08-06 04:21:03,306 INFO [trainer.py:765] (2/8) Epoch 5, batch 1000, train_loss[loss=3.045, ArTop10Accuracy=0.7074, over 12927.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.7014, over 11930.08 frames. ], batch size: 27, lr: 2.01e-02 2024-08-06 04:21:34,451 INFO [trainer.py:765] (2/8) Epoch 5, batch 1100, train_loss[loss=3.074, ArTop10Accuracy=0.6958, over 13779.00 frames. ], tot_loss[loss=3.058, ArTop10Accuracy=0.7003, over 11984.06 frames. ], batch size: 34, lr: 2.00e-02 2024-08-06 04:22:04,752 INFO [trainer.py:765] (2/8) Epoch 5, batch 1200, train_loss[loss=3.174, ArTop10Accuracy=0.6778, over 12273.00 frames. ], tot_loss[loss=3.057, ArTop10Accuracy=0.7005, over 11929.53 frames. ], batch size: 99, lr: 1.99e-02 2024-08-06 04:22:30,397 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:23:46,282 INFO [trainer.py:765] (2/8) Epoch 6, batch 100, train_loss[loss=3.099, ArTop10Accuracy=0.6947, over 14271.00 frames. ], tot_loss[loss=3.027, ArTop10Accuracy=0.7069, over 4796.55 frames. ], batch size: 61, lr: 1.85e-02 2024-08-06 04:24:35,256 INFO [trainer.py:765] (2/8) Epoch 6, batch 200, train_loss[loss=3.011, ArTop10Accuracy=0.7076, over 13806.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7086, over 7806.97 frames. ], batch size: 34, lr: 1.84e-02 2024-08-06 04:25:16,676 INFO [trainer.py:765] (2/8) Epoch 6, batch 300, train_loss[loss=3.069, ArTop10Accuracy=0.6979, over 14163.00 frames. ], tot_loss[loss=3.018, ArTop10Accuracy=0.7089, over 9428.27 frames. ], batch size: 44, lr: 1.83e-02 2024-08-06 04:26:08,924 INFO [trainer.py:765] (2/8) Epoch 6, batch 400, train_loss[loss=2.913, ArTop10Accuracy=0.7321, over 10243.00 frames. ], tot_loss[loss=3.012, ArTop10Accuracy=0.7096, over 10322.01 frames. ], batch size: 14, lr: 1.83e-02 2024-08-06 04:26:51,486 INFO [trainer.py:765] (2/8) Epoch 6, batch 500, train_loss[loss=3.093, ArTop10Accuracy=0.6874, over 12358.00 frames. ], tot_loss[loss=3.009, ArTop10Accuracy=0.71, over 10891.35 frames. ], batch size: 22, lr: 1.82e-02 2024-08-06 04:27:39,298 INFO [trainer.py:765] (2/8) Epoch 6, batch 600, train_loss[loss=2.981, ArTop10Accuracy=0.7169, over 11535.00 frames. ], tot_loss[loss=3.014, ArTop10Accuracy=0.7088, over 11414.81 frames. ], batch size: 18, lr: 1.81e-02 2024-08-06 04:27:46,370 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.054e+02 1.343e+02 1.474e+02 1.660e+02 8.574e+02, threshold=2.947e+02, percent-clipped=0.6 2024-08-06 04:28:33,239 INFO [trainer.py:765] (2/8) Epoch 6, batch 700, train_loss[loss=2.864, ArTop10Accuracy=0.7439, over 10079.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7075, over 11550.48 frames. ], batch size: 12, lr: 1.80e-02 2024-08-06 04:29:11,216 INFO [trainer.py:765] (2/8) Epoch 6, batch 800, train_loss[loss=3.082, ArTop10Accuracy=0.6921, over 10204.00 frames. ], tot_loss[loss=3.026, ArTop10Accuracy=0.7065, over 11675.56 frames. ], batch size: 12, lr: 1.79e-02 2024-08-06 04:29:42,752 INFO [trainer.py:765] (2/8) Epoch 6, batch 900, train_loss[loss=3.021, ArTop10Accuracy=0.7036, over 12896.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7078, over 11744.03 frames. ], batch size: 27, lr: 1.78e-02 2024-08-06 04:30:14,306 INFO [trainer.py:765] (2/8) Epoch 6, batch 1000, train_loss[loss=3.099, ArTop10Accuracy=0.6958, over 12892.00 frames. ], tot_loss[loss=3.025, ArTop10Accuracy=0.7068, over 11954.61 frames. ], batch size: 27, lr: 1.77e-02 2024-08-06 04:30:45,384 INFO [trainer.py:765] (2/8) Epoch 6, batch 1100, train_loss[loss=3.054, ArTop10Accuracy=0.7074, over 13657.00 frames. ], tot_loss[loss=3.031, ArTop10Accuracy=0.7058, over 11989.60 frames. ], batch size: 34, lr: 1.77e-02 2024-08-06 04:31:15,673 INFO [trainer.py:765] (2/8) Epoch 6, batch 1200, train_loss[loss=3.197, ArTop10Accuracy=0.6765, over 11611.00 frames. ], tot_loss[loss=3.026, ArTop10Accuracy=0.7065, over 11928.69 frames. ], batch size: 98, lr: 1.76e-02 2024-08-06 04:31:40,624 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:32:52,405 INFO [trainer.py:765] (2/8) Epoch 7, batch 100, train_loss[loss=3.098, ArTop10Accuracy=0.6977, over 14447.00 frames. ], tot_loss[loss=2.988, ArTop10Accuracy=0.7144, over 4767.78 frames. ], batch size: 61, lr: 1.64e-02 2024-08-06 04:33:38,224 INFO [trainer.py:765] (2/8) Epoch 7, batch 200, train_loss[loss=3.006, ArTop10Accuracy=0.7124, over 13782.00 frames. ], tot_loss[loss=2.984, ArTop10Accuracy=0.7154, over 7764.47 frames. ], batch size: 34, lr: 1.64e-02 2024-08-06 04:34:22,609 INFO [trainer.py:765] (2/8) Epoch 7, batch 300, train_loss[loss=3.026, ArTop10Accuracy=0.7082, over 14388.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7148, over 9405.42 frames. ], batch size: 44, lr: 1.63e-02 2024-08-06 04:34:36,848 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 04:34:45,809 INFO [trainer.py:811] (2/8) Epoch 7, validation: loss=2.963, ArTop10Accuracy=0.7233, over 1829298.00 frames. 2024-08-06 04:34:45,810 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 04:34:46,125 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.009e+02 1.306e+02 1.435e+02 1.599e+02 8.689e+02, threshold=2.871e+02, percent-clipped=0.9 2024-08-06 04:35:17,147 INFO [trainer.py:765] (2/8) Epoch 7, batch 400, train_loss[loss=3.029, ArTop10Accuracy=0.7126, over 10839.00 frames. ], tot_loss[loss=2.988, ArTop10Accuracy=0.7148, over 10324.61 frames. ], batch size: 15, lr: 1.62e-02 2024-08-06 04:36:01,711 INFO [trainer.py:765] (2/8) Epoch 7, batch 500, train_loss[loss=2.873, ArTop10Accuracy=0.7396, over 12169.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7146, over 10888.61 frames. ], batch size: 22, lr: 1.61e-02 2024-08-06 04:36:48,812 INFO [trainer.py:765] (2/8) Epoch 7, batch 600, train_loss[loss=2.969, ArTop10Accuracy=0.7169, over 11577.00 frames. ], tot_loss[loss=2.989, ArTop10Accuracy=0.7139, over 11406.76 frames. ], batch size: 18, lr: 1.61e-02 2024-08-06 04:37:34,800 INFO [trainer.py:765] (2/8) Epoch 7, batch 700, train_loss[loss=2.885, ArTop10Accuracy=0.7319, over 10136.00 frames. ], tot_loss[loss=2.996, ArTop10Accuracy=0.7127, over 11568.04 frames. ], batch size: 12, lr: 1.60e-02 2024-08-06 04:38:13,614 INFO [trainer.py:765] (2/8) Epoch 7, batch 800, train_loss[loss=2.811, ArTop10Accuracy=0.7477, over 10693.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.712, over 11681.76 frames. ], batch size: 13, lr: 1.59e-02 2024-08-06 04:38:45,110 INFO [trainer.py:765] (2/8) Epoch 7, batch 900, train_loss[loss=3.053, ArTop10Accuracy=0.703, over 13248.00 frames. ], tot_loss[loss=2.991, ArTop10Accuracy=0.7138, over 11741.89 frames. ], batch size: 27, lr: 1.59e-02 2024-08-06 04:39:16,575 INFO [trainer.py:765] (2/8) Epoch 7, batch 1000, train_loss[loss=2.957, ArTop10Accuracy=0.7173, over 12975.00 frames. ], tot_loss[loss=2.991, ArTop10Accuracy=0.7137, over 11959.74 frames. ], batch size: 27, lr: 1.58e-02 2024-08-06 04:39:47,571 INFO [trainer.py:765] (2/8) Epoch 7, batch 1100, train_loss[loss=3.027, ArTop10Accuracy=0.7077, over 13800.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.712, over 12002.90 frames. ], batch size: 34, lr: 1.57e-02 2024-08-06 04:40:17,989 INFO [trainer.py:765] (2/8) Epoch 7, batch 1200, train_loss[loss=3.136, ArTop10Accuracy=0.6861, over 12887.00 frames. ], tot_loss[loss=2.996, ArTop10Accuracy=0.7127, over 11943.73 frames. ], batch size: 99, lr: 1.57e-02 2024-08-06 04:40:43,363 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:41:37,492 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 9.816e+01 1.295e+02 1.411e+02 1.574e+02 4.953e+02, threshold=2.821e+02, percent-clipped=1.1 2024-08-06 04:41:58,371 INFO [trainer.py:765] (2/8) Epoch 8, batch 100, train_loss[loss=3.046, ArTop10Accuracy=0.7047, over 14552.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7182, over 4792.34 frames. ], batch size: 61, lr: 1.47e-02 2024-08-06 04:42:44,986 INFO [trainer.py:765] (2/8) Epoch 8, batch 200, train_loss[loss=2.974, ArTop10Accuracy=0.7225, over 13235.00 frames. ], tot_loss[loss=2.95, ArTop10Accuracy=0.7225, over 7789.63 frames. ], batch size: 33, lr: 1.46e-02 2024-08-06 04:43:28,045 INFO [trainer.py:765] (2/8) Epoch 8, batch 300, train_loss[loss=3.041, ArTop10Accuracy=0.7051, over 14321.00 frames. ], tot_loss[loss=2.949, ArTop10Accuracy=0.7227, over 9418.52 frames. ], batch size: 44, lr: 1.46e-02 2024-08-06 04:44:14,461 INFO [trainer.py:765] (2/8) Epoch 8, batch 400, train_loss[loss=3.094, ArTop10Accuracy=0.7061, over 10312.00 frames. ], tot_loss[loss=2.954, ArTop10Accuracy=0.7218, over 10342.89 frames. ], batch size: 14, lr: 1.45e-02 2024-08-06 04:45:00,692 INFO [trainer.py:765] (2/8) Epoch 8, batch 500, train_loss[loss=2.953, ArTop10Accuracy=0.7192, over 12418.00 frames. ], tot_loss[loss=2.951, ArTop10Accuracy=0.722, over 10914.11 frames. ], batch size: 22, lr: 1.45e-02 2024-08-06 04:45:45,393 INFO [trainer.py:765] (2/8) Epoch 8, batch 600, train_loss[loss=2.951, ArTop10Accuracy=0.7179, over 11643.00 frames. ], tot_loss[loss=2.959, ArTop10Accuracy=0.7201, over 11430.71 frames. ], batch size: 18, lr: 1.44e-02 2024-08-06 04:46:34,037 INFO [trainer.py:765] (2/8) Epoch 8, batch 700, train_loss[loss=2.88, ArTop10Accuracy=0.7407, over 10089.00 frames. ], tot_loss[loss=2.97, ArTop10Accuracy=0.7179, over 11557.64 frames. ], batch size: 12, lr: 1.43e-02 2024-08-06 04:47:10,207 INFO [trainer.py:765] (2/8) Epoch 8, batch 800, train_loss[loss=2.942, ArTop10Accuracy=0.7197, over 9999.00 frames. ], tot_loss[loss=2.973, ArTop10Accuracy=0.7171, over 11684.33 frames. ], batch size: 12, lr: 1.43e-02 2024-08-06 04:47:41,605 INFO [trainer.py:765] (2/8) Epoch 8, batch 900, train_loss[loss=2.918, ArTop10Accuracy=0.7281, over 13035.00 frames. ], tot_loss[loss=2.969, ArTop10Accuracy=0.7179, over 11727.15 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 04:48:13,033 INFO [trainer.py:765] (2/8) Epoch 8, batch 1000, train_loss[loss=2.956, ArTop10Accuracy=0.7221, over 12840.00 frames. ], tot_loss[loss=2.97, ArTop10Accuracy=0.7178, over 11936.16 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 04:48:28,828 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 04:48:37,663 INFO [trainer.py:811] (2/8) Epoch 8, validation: loss=2.946, ArTop10Accuracy=0.7266, over 1829298.00 frames. 2024-08-06 04:48:37,664 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 04:48:37,951 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.289e+02 1.393e+02 1.532e+02 3.557e+02, threshold=2.786e+02, percent-clipped=0.2 2024-08-06 04:48:52,931 INFO [trainer.py:765] (2/8) Epoch 8, batch 1100, train_loss[loss=2.905, ArTop10Accuracy=0.7247, over 13471.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.7167, over 11991.47 frames. ], batch size: 34, lr: 1.41e-02 2024-08-06 04:49:23,202 INFO [trainer.py:765] (2/8) Epoch 8, batch 1200, train_loss[loss=3.151, ArTop10Accuracy=0.6828, over 12836.00 frames. ], tot_loss[loss=2.978, ArTop10Accuracy=0.7161, over 11952.83 frames. ], batch size: 97, lr: 1.40e-02 2024-08-06 04:49:48,393 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:51:01,547 INFO [trainer.py:765] (2/8) Epoch 9, batch 100, train_loss[loss=3.072, ArTop10Accuracy=0.6984, over 14479.00 frames. ], tot_loss[loss=2.952, ArTop10Accuracy=0.7219, over 4767.15 frames. ], batch size: 61, lr: 1.32e-02 2024-08-06 04:51:45,414 INFO [trainer.py:765] (2/8) Epoch 9, batch 200, train_loss[loss=3.028, ArTop10Accuracy=0.7102, over 13817.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.7243, over 7779.31 frames. ], batch size: 34, lr: 1.32e-02 2024-08-06 04:52:29,082 INFO [trainer.py:765] (2/8) Epoch 9, batch 300, train_loss[loss=2.961, ArTop10Accuracy=0.7219, over 14264.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.7249, over 9398.27 frames. ], batch size: 44, lr: 1.31e-02 2024-08-06 04:53:16,431 INFO [trainer.py:765] (2/8) Epoch 9, batch 400, train_loss[loss=2.907, ArTop10Accuracy=0.7297, over 10737.00 frames. ], tot_loss[loss=2.936, ArTop10Accuracy=0.725, over 10337.25 frames. ], batch size: 15, lr: 1.31e-02 2024-08-06 04:53:58,143 INFO [trainer.py:765] (2/8) Epoch 9, batch 500, train_loss[loss=2.943, ArTop10Accuracy=0.7245, over 12110.00 frames. ], tot_loss[loss=2.932, ArTop10Accuracy=0.7254, over 10890.50 frames. ], batch size: 22, lr: 1.30e-02 2024-08-06 04:54:51,077 INFO [trainer.py:765] (2/8) Epoch 9, batch 600, train_loss[loss=2.921, ArTop10Accuracy=0.7325, over 11728.00 frames. ], tot_loss[loss=2.941, ArTop10Accuracy=0.7234, over 11423.61 frames. ], batch size: 18, lr: 1.30e-02 2024-08-06 04:55:34,399 INFO [trainer.py:765] (2/8) Epoch 9, batch 700, train_loss[loss=2.84, ArTop10Accuracy=0.7416, over 10190.00 frames. ], tot_loss[loss=2.948, ArTop10Accuracy=0.7222, over 11578.40 frames. ], batch size: 12, lr: 1.29e-02 2024-08-06 04:56:04,575 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.257e+02 1.367e+02 1.507e+02 8.820e+02, threshold=2.735e+02, percent-clipped=0.5 2024-08-06 04:56:13,598 INFO [trainer.py:765] (2/8) Epoch 9, batch 800, train_loss[loss=2.872, ArTop10Accuracy=0.7408, over 10110.00 frames. ], tot_loss[loss=2.953, ArTop10Accuracy=0.7213, over 11698.31 frames. ], batch size: 12, lr: 1.29e-02 2024-08-06 04:56:44,975 INFO [trainer.py:765] (2/8) Epoch 9, batch 900, train_loss[loss=2.936, ArTop10Accuracy=0.7295, over 13064.00 frames. ], tot_loss[loss=2.949, ArTop10Accuracy=0.7222, over 11748.18 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 04:57:16,491 INFO [trainer.py:765] (2/8) Epoch 9, batch 1000, train_loss[loss=2.958, ArTop10Accuracy=0.7276, over 13016.00 frames. ], tot_loss[loss=2.955, ArTop10Accuracy=0.7213, over 11944.72 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 04:57:47,657 INFO [trainer.py:765] (2/8) Epoch 9, batch 1100, train_loss[loss=3, ArTop10Accuracy=0.7154, over 13793.00 frames. ], tot_loss[loss=2.965, ArTop10Accuracy=0.7189, over 12005.74 frames. ], batch size: 34, lr: 1.27e-02 2024-08-06 04:58:18,093 INFO [trainer.py:765] (2/8) Epoch 9, batch 1200, train_loss[loss=3.099, ArTop10Accuracy=0.6887, over 12366.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7196, over 11971.89 frames. ], batch size: 98, lr: 1.27e-02 2024-08-06 04:58:43,379 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 04:59:52,750 INFO [trainer.py:765] (2/8) Epoch 10, batch 100, train_loss[loss=2.978, ArTop10Accuracy=0.7164, over 14878.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7286, over 4780.23 frames. ], batch size: 63, lr: 1.20e-02 2024-08-06 05:00:43,730 INFO [trainer.py:765] (2/8) Epoch 10, batch 200, train_loss[loss=2.88, ArTop10Accuracy=0.7405, over 13887.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7293, over 7803.44 frames. ], batch size: 34, lr: 1.20e-02 2024-08-06 05:01:20,591 INFO [trainer.py:765] (2/8) Epoch 10, batch 300, train_loss[loss=2.964, ArTop10Accuracy=0.7188, over 14173.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7301, over 9425.87 frames. ], batch size: 44, lr: 1.19e-02 2024-08-06 05:02:10,048 INFO [trainer.py:765] (2/8) Epoch 10, batch 400, train_loss[loss=2.955, ArTop10Accuracy=0.7291, over 10429.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7295, over 10335.80 frames. ], batch size: 14, lr: 1.19e-02 2024-08-06 05:02:46,488 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 05:02:55,377 INFO [trainer.py:811] (2/8) Epoch 10, validation: loss=2.927, ArTop10Accuracy=0.7304, over 1829298.00 frames. 2024-08-06 05:02:55,378 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 05:02:55,728 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.023e+02 1.269e+02 1.367e+02 1.518e+02 4.405e+02, threshold=2.733e+02, percent-clipped=0.4 2024-08-06 05:02:58,361 INFO [trainer.py:765] (2/8) Epoch 10, batch 500, train_loss[loss=2.875, ArTop10Accuracy=0.7371, over 12161.00 frames. ], tot_loss[loss=2.915, ArTop10Accuracy=0.7288, over 10893.26 frames. ], batch size: 22, lr: 1.19e-02 2024-08-06 05:03:48,229 INFO [trainer.py:765] (2/8) Epoch 10, batch 600, train_loss[loss=2.871, ArTop10Accuracy=0.7365, over 11569.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.7282, over 11428.81 frames. ], batch size: 18, lr: 1.18e-02 2024-08-06 05:04:36,715 INFO [trainer.py:765] (2/8) Epoch 10, batch 700, train_loss[loss=2.612, ArTop10Accuracy=0.7818, over 10102.00 frames. ], tot_loss[loss=2.931, ArTop10Accuracy=0.7254, over 11580.42 frames. ], batch size: 12, lr: 1.18e-02 2024-08-06 05:05:10,726 INFO [trainer.py:765] (2/8) Epoch 10, batch 800, train_loss[loss=2.908, ArTop10Accuracy=0.7233, over 10210.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7238, over 11684.13 frames. ], batch size: 12, lr: 1.17e-02 2024-08-06 05:05:42,245 INFO [trainer.py:765] (2/8) Epoch 10, batch 900, train_loss[loss=2.944, ArTop10Accuracy=0.7194, over 12966.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7258, over 11742.25 frames. ], batch size: 27, lr: 1.17e-02 2024-08-06 05:06:13,844 INFO [trainer.py:765] (2/8) Epoch 10, batch 1000, train_loss[loss=2.983, ArTop10Accuracy=0.7219, over 12821.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7266, over 11939.45 frames. ], batch size: 27, lr: 1.16e-02 2024-08-06 05:06:45,055 INFO [trainer.py:765] (2/8) Epoch 10, batch 1100, train_loss[loss=3.068, ArTop10Accuracy=0.7031, over 13737.00 frames. ], tot_loss[loss=2.941, ArTop10Accuracy=0.7238, over 12009.40 frames. ], batch size: 34, lr: 1.16e-02 2024-08-06 05:07:15,484 INFO [trainer.py:765] (2/8) Epoch 10, batch 1200, train_loss[loss=3.01, ArTop10Accuracy=0.7078, over 12358.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.724, over 11950.99 frames. ], batch size: 97, lr: 1.16e-02 2024-08-06 05:07:40,243 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 05:08:52,966 INFO [trainer.py:765] (2/8) Epoch 11, batch 100, train_loss[loss=2.925, ArTop10Accuracy=0.7288, over 14495.00 frames. ], tot_loss[loss=2.906, ArTop10Accuracy=0.7314, over 4770.40 frames. ], batch size: 61, lr: 1.10e-02 2024-08-06 05:09:41,277 INFO [trainer.py:765] (2/8) Epoch 11, batch 200, train_loss[loss=2.929, ArTop10Accuracy=0.7319, over 13841.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.7314, over 7772.06 frames. ], batch size: 34, lr: 1.10e-02 2024-08-06 05:09:51,176 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.001e+02 1.278e+02 1.371e+02 1.502e+02 3.785e+02, threshold=2.743e+02, percent-clipped=0.3 2024-08-06 05:10:24,720 INFO [trainer.py:765] (2/8) Epoch 11, batch 300, train_loss[loss=2.998, ArTop10Accuracy=0.7166, over 14187.00 frames. ], tot_loss[loss=2.901, ArTop10Accuracy=0.7321, over 9389.76 frames. ], batch size: 44, lr: 1.09e-02 2024-08-06 05:11:11,784 INFO [trainer.py:765] (2/8) Epoch 11, batch 400, train_loss[loss=2.766, ArTop10Accuracy=0.7571, over 10857.00 frames. ], tot_loss[loss=2.9, ArTop10Accuracy=0.7322, over 10329.19 frames. ], batch size: 15, lr: 1.09e-02 2024-08-06 05:11:52,692 INFO [trainer.py:765] (2/8) Epoch 11, batch 500, train_loss[loss=2.9, ArTop10Accuracy=0.7279, over 12227.00 frames. ], tot_loss[loss=2.904, ArTop10Accuracy=0.7313, over 10891.57 frames. ], batch size: 22, lr: 1.09e-02 2024-08-06 05:12:40,287 INFO [trainer.py:765] (2/8) Epoch 11, batch 600, train_loss[loss=2.773, ArTop10Accuracy=0.7548, over 11681.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7303, over 11432.08 frames. ], batch size: 18, lr: 1.08e-02 2024-08-06 05:13:25,708 INFO [trainer.py:765] (2/8) Epoch 11, batch 700, train_loss[loss=2.781, ArTop10Accuracy=0.7469, over 10182.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.7287, over 11569.39 frames. ], batch size: 12, lr: 1.08e-02 2024-08-06 05:14:04,206 INFO [trainer.py:765] (2/8) Epoch 11, batch 800, train_loss[loss=2.793, ArTop10Accuracy=0.7451, over 10069.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.728, over 11681.47 frames. ], batch size: 12, lr: 1.07e-02 2024-08-06 05:14:35,666 INFO [trainer.py:765] (2/8) Epoch 11, batch 900, train_loss[loss=2.867, ArTop10Accuracy=0.7368, over 12934.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7294, over 11728.60 frames. ], batch size: 27, lr: 1.07e-02 2024-08-06 05:15:07,263 INFO [trainer.py:765] (2/8) Epoch 11, batch 1000, train_loss[loss=2.997, ArTop10Accuracy=0.7159, over 13352.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7291, over 11927.60 frames. ], batch size: 28, lr: 1.07e-02 2024-08-06 05:15:38,259 INFO [trainer.py:765] (2/8) Epoch 11, batch 1100, train_loss[loss=2.882, ArTop10Accuracy=0.732, over 13622.00 frames. ], tot_loss[loss=2.92, ArTop10Accuracy=0.7277, over 11984.60 frames. ], batch size: 34, lr: 1.06e-02 2024-08-06 05:16:08,497 INFO [trainer.py:765] (2/8) Epoch 11, batch 1200, train_loss[loss=3.096, ArTop10Accuracy=0.693, over 12237.00 frames. ], tot_loss[loss=2.923, ArTop10Accuracy=0.7274, over 11936.43 frames. ], batch size: 98, lr: 1.06e-02 2024-08-06 05:16:12,697 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 05:16:21,622 INFO [trainer.py:811] (2/8) Epoch 11, validation: loss=2.923, ArTop10Accuracy=0.7318, over 1829298.00 frames. 2024-08-06 05:16:21,623 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 05:16:21,949 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.076e+02 1.268e+02 1.368e+02 1.481e+02 4.790e+02, threshold=2.736e+02, percent-clipped=0.6 2024-08-06 05:16:42,805 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 05:18:03,006 INFO [trainer.py:765] (2/8) Epoch 12, batch 100, train_loss[loss=2.985, ArTop10Accuracy=0.7173, over 14457.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7351, over 4751.58 frames. ], batch size: 62, lr: 1.01e-02 2024-08-06 05:18:46,005 INFO [trainer.py:765] (2/8) Epoch 12, batch 200, train_loss[loss=2.876, ArTop10Accuracy=0.7382, over 13936.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7367, over 7778.74 frames. ], batch size: 34, lr: 1.01e-02 2024-08-06 05:19:31,947 INFO [trainer.py:765] (2/8) Epoch 12, batch 300, train_loss[loss=2.918, ArTop10Accuracy=0.7335, over 13852.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7374, over 9395.43 frames. ], batch size: 43, lr: 1.01e-02 2024-08-06 05:20:12,431 INFO [trainer.py:765] (2/8) Epoch 12, batch 400, train_loss[loss=2.814, ArTop10Accuracy=0.7557, over 10314.00 frames. ], tot_loss[loss=2.882, ArTop10Accuracy=0.736, over 10309.33 frames. ], batch size: 14, lr: 1.00e-02 2024-08-06 05:21:00,640 INFO [trainer.py:765] (2/8) Epoch 12, batch 500, train_loss[loss=2.952, ArTop10Accuracy=0.7228, over 12138.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7356, over 10884.38 frames. ], batch size: 22, lr: 9.99e-03 2024-08-06 05:21:43,916 INFO [trainer.py:765] (2/8) Epoch 12, batch 600, train_loss[loss=2.802, ArTop10Accuracy=0.7451, over 11621.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.734, over 11423.38 frames. ], batch size: 18, lr: 9.96e-03 2024-08-06 05:22:32,206 INFO [trainer.py:765] (2/8) Epoch 12, batch 700, train_loss[loss=2.767, ArTop10Accuracy=0.7568, over 9980.00 frames. ], tot_loss[loss=2.897, ArTop10Accuracy=0.7326, over 11570.51 frames. ], batch size: 12, lr: 9.93e-03 2024-08-06 05:23:08,912 INFO [trainer.py:765] (2/8) Epoch 12, batch 800, train_loss[loss=2.875, ArTop10Accuracy=0.7385, over 10064.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.731, over 11669.44 frames. ], batch size: 12, lr: 9.90e-03 2024-08-06 05:23:40,460 INFO [trainer.py:765] (2/8) Epoch 12, batch 900, train_loss[loss=2.835, ArTop10Accuracy=0.7463, over 13139.00 frames. ], tot_loss[loss=2.897, ArTop10Accuracy=0.7327, over 11730.86 frames. ], batch size: 27, lr: 9.87e-03 2024-08-06 05:23:54,576 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.067e+02 1.273e+02 1.376e+02 1.503e+02 4.050e+02, threshold=2.752e+02, percent-clipped=0.4 2024-08-06 05:24:14,346 INFO [trainer.py:765] (2/8) Epoch 12, batch 1000, train_loss[loss=2.817, ArTop10Accuracy=0.7436, over 13009.00 frames. ], tot_loss[loss=2.899, ArTop10Accuracy=0.7322, over 11941.24 frames. ], batch size: 27, lr: 9.84e-03 2024-08-06 05:24:45,502 INFO [trainer.py:765] (2/8) Epoch 12, batch 1100, train_loss[loss=2.936, ArTop10Accuracy=0.7239, over 13711.00 frames. ], tot_loss[loss=2.906, ArTop10Accuracy=0.7308, over 12005.16 frames. ], batch size: 34, lr: 9.81e-03 2024-08-06 05:25:15,882 INFO [trainer.py:765] (2/8) Epoch 12, batch 1200, train_loss[loss=3.01, ArTop10Accuracy=0.7104, over 12131.00 frames. ], tot_loss[loss=2.907, ArTop10Accuracy=0.7307, over 11944.94 frames. ], batch size: 98, lr: 9.78e-03 2024-08-06 05:25:41,436 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 05:26:46,787 INFO [trainer.py:765] (2/8) Epoch 13, batch 100, train_loss[loss=2.932, ArTop10Accuracy=0.7276, over 14692.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7377, over 4774.15 frames. ], batch size: 61, lr: 9.36e-03 2024-08-06 05:27:32,552 INFO [trainer.py:765] (2/8) Epoch 13, batch 200, train_loss[loss=2.878, ArTop10Accuracy=0.7415, over 13615.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7378, over 7798.34 frames. ], batch size: 34, lr: 9.34e-03 2024-08-06 05:28:16,036 INFO [trainer.py:765] (2/8) Epoch 13, batch 300, train_loss[loss=2.861, ArTop10Accuracy=0.7421, over 14256.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7385, over 9419.42 frames. ], batch size: 44, lr: 9.31e-03 2024-08-06 05:29:00,149 INFO [trainer.py:765] (2/8) Epoch 13, batch 400, train_loss[loss=2.646, ArTop10Accuracy=0.7744, over 10261.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7378, over 10316.57 frames. ], batch size: 14, lr: 9.28e-03 2024-08-06 05:29:43,967 INFO [trainer.py:765] (2/8) Epoch 13, batch 500, train_loss[loss=2.773, ArTop10Accuracy=0.7575, over 12363.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.738, over 10885.74 frames. ], batch size: 22, lr: 9.26e-03 2024-08-06 05:30:24,247 INFO [trainer.py:765] (2/8) Epoch 13, batch 600, train_loss[loss=2.885, ArTop10Accuracy=0.7434, over 11525.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7367, over 11421.59 frames. ], batch size: 18, lr: 9.23e-03 2024-08-06 05:30:58,110 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 05:31:07,054 INFO [trainer.py:811] (2/8) Epoch 13, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 2024-08-06 05:31:07,054 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33194MB 2024-08-06 05:31:07,351 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.049e+02 1.283e+02 1.389e+02 1.496e+02 2.729e+02, threshold=2.779e+02, percent-clipped=0.0 2024-08-06 05:31:24,043 INFO [trainer.py:765] (2/8) Epoch 13, batch 700, train_loss[loss=2.894, ArTop10Accuracy=0.7257, over 9882.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7352, over 11544.89 frames. ], batch size: 12, lr: 9.20e-03 2024-08-06 05:32:00,147 INFO [trainer.py:765] (2/8) Epoch 13, batch 800, train_loss[loss=2.705, ArTop10Accuracy=0.7712, over 10220.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7348, over 11668.46 frames. ], batch size: 12, lr: 9.18e-03 2024-08-06 05:32:31,521 INFO [trainer.py:765] (2/8) Epoch 13, batch 900, train_loss[loss=2.834, ArTop10Accuracy=0.7448, over 12923.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7353, over 11731.17 frames. ], batch size: 27, lr: 9.15e-03 2024-08-06 05:33:03,043 INFO [trainer.py:765] (2/8) Epoch 13, batch 1000, train_loss[loss=2.729, ArTop10Accuracy=0.7598, over 12946.00 frames. ], tot_loss[loss=2.883, ArTop10Accuracy=0.7352, over 11939.27 frames. ], batch size: 27, lr: 9.13e-03 2024-08-06 05:33:34,233 INFO [trainer.py:765] (2/8) Epoch 13, batch 1100, train_loss[loss=2.955, ArTop10Accuracy=0.7218, over 13772.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7333, over 12016.87 frames. ], batch size: 34, lr: 9.10e-03 2024-08-06 05:34:04,519 INFO [trainer.py:765] (2/8) Epoch 13, batch 1200, train_loss[loss=3.045, ArTop10Accuracy=0.7098, over 11902.00 frames. ], tot_loss[loss=2.892, ArTop10Accuracy=0.7331, over 11935.20 frames. ], batch size: 97, lr: 9.07e-03 2024-08-06 05:34:29,356 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 05:35:39,198 INFO [trainer.py:765] (2/8) Epoch 14, batch 100, train_loss[loss=2.872, ArTop10Accuracy=0.7365, over 14687.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7403, over 4792.39 frames. ], batch size: 61, lr: 8.71e-03 2024-08-06 05:36:23,063 INFO [trainer.py:765] (2/8) Epoch 14, batch 200, train_loss[loss=2.867, ArTop10Accuracy=0.7411, over 13728.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7417, over 7798.85 frames. ], batch size: 34, lr: 8.68e-03 2024-08-06 05:37:09,309 INFO [trainer.py:765] (2/8) Epoch 14, batch 300, train_loss[loss=2.913, ArTop10Accuracy=0.7299, over 14314.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7418, over 9424.44 frames. ], batch size: 44, lr: 8.66e-03 2024-08-06 05:37:46,030 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.097e+02 1.304e+02 1.410e+02 1.531e+02 2.912e+02, threshold=2.820e+02, percent-clipped=0.2 2024-08-06 05:37:55,139 INFO [trainer.py:765] (2/8) Epoch 14, batch 400, train_loss[loss=2.777, ArTop10Accuracy=0.7459, over 10801.00 frames. ], tot_loss[loss=2.853, ArTop10Accuracy=0.7419, over 10315.88 frames. ], batch size: 15, lr: 8.64e-03 2024-08-06 05:38:42,025 INFO [trainer.py:765] (2/8) Epoch 14, batch 500, train_loss[loss=2.76, ArTop10Accuracy=0.7576, over 12396.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7416, over 10890.81 frames. ], batch size: 22, lr: 8.61e-03 2024-08-06 05:39:22,374 INFO [trainer.py:765] (2/8) Epoch 14, batch 600, train_loss[loss=2.873, ArTop10Accuracy=0.7326, over 11490.00 frames. ], tot_loss[loss=2.858, ArTop10Accuracy=0.7404, over 11418.29 frames. ], batch size: 18, lr: 8.59e-03 2024-08-06 05:40:15,143 INFO [trainer.py:765] (2/8) Epoch 14, batch 700, train_loss[loss=2.907, ArTop10Accuracy=0.7269, over 10082.00 frames. ], tot_loss[loss=2.866, ArTop10Accuracy=0.7385, over 11551.12 frames. ], batch size: 12, lr: 8.57e-03 2024-08-06 05:40:49,136 INFO [trainer.py:765] (2/8) Epoch 14, batch 800, train_loss[loss=2.667, ArTop10Accuracy=0.7775, over 9850.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7373, over 11668.77 frames. ], batch size: 12, lr: 8.55e-03 2024-08-06 05:41:20,466 INFO [trainer.py:765] (2/8) Epoch 14, batch 900, train_loss[loss=2.876, ArTop10Accuracy=0.7369, over 12986.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7386, over 11723.01 frames. ], batch size: 27, lr: 8.52e-03 2024-08-06 05:41:51,995 INFO [trainer.py:765] (2/8) Epoch 14, batch 1000, train_loss[loss=2.83, ArTop10Accuracy=0.7426, over 13044.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7377, over 11933.16 frames. ], batch size: 27, lr: 8.50e-03 2024-08-06 05:42:23,217 INFO [trainer.py:765] (2/8) Epoch 14, batch 1100, train_loss[loss=2.842, ArTop10Accuracy=0.7439, over 13773.00 frames. ], tot_loss[loss=2.879, ArTop10Accuracy=0.7363, over 11999.74 frames. ], batch size: 34, lr: 8.48e-03 2024-08-06 05:42:53,549 INFO [trainer.py:765] (2/8) Epoch 14, batch 1200, train_loss[loss=3.028, ArTop10Accuracy=0.7076, over 12237.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7364, over 11949.06 frames. ], batch size: 98, lr: 8.46e-03 2024-08-06 05:43:19,085 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 05:44:28,572 INFO [trainer.py:765] (2/8) Epoch 15, batch 100, train_loss[loss=2.963, ArTop10Accuracy=0.7197, over 14826.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7422, over 4788.38 frames. ], batch size: 62, lr: 8.14e-03 2024-08-06 05:44:29,214 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 05:44:38,023 INFO [trainer.py:811] (2/8) Epoch 15, validation: loss=2.913, ArTop10Accuracy=0.7339, over 1829298.00 frames. 2024-08-06 05:44:38,024 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33246MB 2024-08-06 05:44:38,413 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.307e+02 1.417e+02 1.528e+02 2.981e+02, threshold=2.833e+02, percent-clipped=0.1 2024-08-06 05:45:20,185 INFO [trainer.py:765] (2/8) Epoch 15, batch 200, train_loss[loss=2.807, ArTop10Accuracy=0.7537, over 13710.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7439, over 7805.06 frames. ], batch size: 34, lr: 8.11e-03 2024-08-06 05:46:04,647 INFO [trainer.py:765] (2/8) Epoch 15, batch 300, train_loss[loss=2.942, ArTop10Accuracy=0.7258, over 14508.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7439, over 9429.96 frames. ], batch size: 44, lr: 8.09e-03 2024-08-06 05:46:51,902 INFO [trainer.py:765] (2/8) Epoch 15, batch 400, train_loss[loss=2.655, ArTop10Accuracy=0.7743, over 10217.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7438, over 10322.02 frames. ], batch size: 14, lr: 8.07e-03 2024-08-06 05:47:36,911 INFO [trainer.py:765] (2/8) Epoch 15, batch 500, train_loss[loss=2.736, ArTop10Accuracy=0.7632, over 12303.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7434, over 10881.57 frames. ], batch size: 22, lr: 8.05e-03 2024-08-06 05:48:24,723 INFO [trainer.py:765] (2/8) Epoch 15, batch 600, train_loss[loss=2.761, ArTop10Accuracy=0.7587, over 11526.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7425, over 11419.26 frames. ], batch size: 18, lr: 8.03e-03 2024-08-06 05:49:11,856 INFO [trainer.py:765] (2/8) Epoch 15, batch 700, train_loss[loss=2.846, ArTop10Accuracy=0.7335, over 10062.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.7405, over 11564.98 frames. ], batch size: 12, lr: 8.01e-03 2024-08-06 05:49:45,779 INFO [trainer.py:765] (2/8) Epoch 15, batch 800, train_loss[loss=2.925, ArTop10Accuracy=0.7188, over 9224.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7391, over 11660.66 frames. ], batch size: 11, lr: 7.99e-03 2024-08-06 05:50:17,210 INFO [trainer.py:765] (2/8) Epoch 15, batch 900, train_loss[loss=3.033, ArTop10Accuracy=0.7154, over 13057.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7409, over 11730.28 frames. ], batch size: 27, lr: 7.97e-03 2024-08-06 05:50:48,830 INFO [trainer.py:765] (2/8) Epoch 15, batch 1000, train_loss[loss=2.834, ArTop10Accuracy=0.7484, over 12972.00 frames. ], tot_loss[loss=2.858, ArTop10Accuracy=0.7404, over 11937.53 frames. ], batch size: 27, lr: 7.95e-03 2024-08-06 05:51:20,070 INFO [trainer.py:765] (2/8) Epoch 15, batch 1100, train_loss[loss=2.899, ArTop10Accuracy=0.7337, over 13832.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7386, over 11992.96 frames. ], batch size: 34, lr: 7.93e-03 2024-08-06 05:51:23,515 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.123e+02 1.337e+02 1.431e+02 1.541e+02 2.784e+02, threshold=2.862e+02, percent-clipped=0.0 2024-08-06 05:51:53,082 INFO [trainer.py:765] (2/8) Epoch 15, batch 1200, train_loss[loss=2.971, ArTop10Accuracy=0.7191, over 12900.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.738, over 11941.95 frames. ], batch size: 99, lr: 7.91e-03 2024-08-06 05:52:18,086 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 05:53:29,263 INFO [trainer.py:765] (2/8) Epoch 16, batch 100, train_loss[loss=2.894, ArTop10Accuracy=0.7377, over 14427.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7468, over 4785.00 frames. ], batch size: 61, lr: 7.63e-03 2024-08-06 05:54:12,877 INFO [trainer.py:765] (2/8) Epoch 16, batch 200, train_loss[loss=2.831, ArTop10Accuracy=0.7462, over 13780.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7467, over 7781.14 frames. ], batch size: 34, lr: 7.61e-03 2024-08-06 05:54:59,737 INFO [trainer.py:765] (2/8) Epoch 16, batch 300, train_loss[loss=2.906, ArTop10Accuracy=0.7335, over 14155.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7468, over 9420.48 frames. ], batch size: 44, lr: 7.59e-03 2024-08-06 05:55:41,930 INFO [trainer.py:765] (2/8) Epoch 16, batch 400, train_loss[loss=2.686, ArTop10Accuracy=0.7733, over 10961.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7466, over 10323.32 frames. ], batch size: 15, lr: 7.58e-03 2024-08-06 05:56:27,680 INFO [trainer.py:765] (2/8) Epoch 16, batch 500, train_loss[loss=2.809, ArTop10Accuracy=0.7594, over 12164.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7463, over 10905.04 frames. ], batch size: 22, lr: 7.56e-03 2024-08-06 05:57:12,439 INFO [trainer.py:765] (2/8) Epoch 16, batch 600, train_loss[loss=2.71, ArTop10Accuracy=0.7724, over 11487.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7452, over 11438.14 frames. ], batch size: 18, lr: 7.54e-03 2024-08-06 05:58:00,040 INFO [trainer.py:765] (2/8) Epoch 16, batch 700, train_loss[loss=2.843, ArTop10Accuracy=0.7465, over 9292.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7441, over 11558.69 frames. ], batch size: 11, lr: 7.52e-03 2024-08-06 05:58:34,024 INFO [trainer.py:765] (2/8) Epoch 16, batch 800, train_loss[loss=2.79, ArTop10Accuracy=0.7507, over 10225.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7423, over 11679.46 frames. ], batch size: 12, lr: 7.50e-03 2024-08-06 05:58:41,569 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 05:58:50,426 INFO [trainer.py:811] (2/8) Epoch 16, validation: loss=2.915, ArTop10Accuracy=0.7338, over 1829298.00 frames. 2024-08-06 05:58:50,427 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33340MB 2024-08-06 05:58:50,730 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.121e+02 1.335e+02 1.445e+02 1.570e+02 3.252e+02, threshold=2.890e+02, percent-clipped=0.1 2024-08-06 05:59:14,321 INFO [trainer.py:765] (2/8) Epoch 16, batch 900, train_loss[loss=2.791, ArTop10Accuracy=0.7515, over 12798.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7429, over 11723.23 frames. ], batch size: 27, lr: 7.49e-03 2024-08-06 05:59:45,915 INFO [trainer.py:765] (2/8) Epoch 16, batch 1000, train_loss[loss=2.776, ArTop10Accuracy=0.7557, over 12832.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7421, over 11918.58 frames. ], batch size: 27, lr: 7.47e-03 2024-08-06 06:00:17,091 INFO [trainer.py:765] (2/8) Epoch 16, batch 1100, train_loss[loss=2.883, ArTop10Accuracy=0.7371, over 13757.00 frames. ], tot_loss[loss=2.86, ArTop10Accuracy=0.7403, over 11985.14 frames. ], batch size: 34, lr: 7.45e-03 2024-08-06 06:00:47,464 INFO [trainer.py:765] (2/8) Epoch 16, batch 1200, train_loss[loss=3.013, ArTop10Accuracy=0.7111, over 12027.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7408, over 11942.83 frames. ], batch size: 97, lr: 7.43e-03 2024-08-06 06:01:12,361 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 06:02:27,261 INFO [trainer.py:765] (2/8) Epoch 17, batch 100, train_loss[loss=2.949, ArTop10Accuracy=0.7286, over 14418.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.7461, over 4798.04 frames. ], batch size: 61, lr: 7.18e-03 2024-08-06 06:03:11,850 INFO [trainer.py:765] (2/8) Epoch 17, batch 200, train_loss[loss=2.77, ArTop10Accuracy=0.7507, over 13471.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.746, over 7806.93 frames. ], batch size: 34, lr: 7.17e-03 2024-08-06 06:03:57,502 INFO [trainer.py:765] (2/8) Epoch 17, batch 300, train_loss[loss=2.929, ArTop10Accuracy=0.7311, over 14126.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.748, over 9433.07 frames. ], batch size: 44, lr: 7.15e-03 2024-08-06 06:04:42,838 INFO [trainer.py:765] (2/8) Epoch 17, batch 400, train_loss[loss=2.643, ArTop10Accuracy=0.7722, over 10347.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7477, over 10349.35 frames. ], batch size: 14, lr: 7.13e-03 2024-08-06 06:05:29,004 INFO [trainer.py:765] (2/8) Epoch 17, batch 500, train_loss[loss=2.906, ArTop10Accuracy=0.7349, over 12205.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7484, over 10923.35 frames. ], batch size: 22, lr: 7.12e-03 2024-08-06 06:05:49,551 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.142e+02 1.359e+02 1.445e+02 1.551e+02 2.741e+02, threshold=2.891e+02, percent-clipped=0.0 2024-08-06 06:06:20,723 INFO [trainer.py:765] (2/8) Epoch 17, batch 600, train_loss[loss=2.804, ArTop10Accuracy=0.7508, over 11660.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.7461, over 11452.61 frames. ], batch size: 18, lr: 7.10e-03 2024-08-06 06:07:04,695 INFO [trainer.py:765] (2/8) Epoch 17, batch 700, train_loss[loss=2.703, ArTop10Accuracy=0.7668, over 10055.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7458, over 11584.29 frames. ], batch size: 12, lr: 7.09e-03 2024-08-06 06:07:44,896 INFO [trainer.py:765] (2/8) Epoch 17, batch 800, train_loss[loss=2.682, ArTop10Accuracy=0.7718, over 9902.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7453, over 11687.60 frames. ], batch size: 12, lr: 7.07e-03 2024-08-06 06:08:16,384 INFO [trainer.py:765] (2/8) Epoch 17, batch 900, train_loss[loss=2.868, ArTop10Accuracy=0.7369, over 13122.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7456, over 11742.05 frames. ], batch size: 27, lr: 7.05e-03 2024-08-06 06:08:47,995 INFO [trainer.py:765] (2/8) Epoch 17, batch 1000, train_loss[loss=2.71, ArTop10Accuracy=0.7683, over 12741.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7443, over 11942.29 frames. ], batch size: 27, lr: 7.04e-03 2024-08-06 06:09:19,134 INFO [trainer.py:765] (2/8) Epoch 17, batch 1100, train_loss[loss=2.813, ArTop10Accuracy=0.7484, over 13705.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7428, over 11995.99 frames. ], batch size: 34, lr: 7.02e-03 2024-08-06 06:09:49,446 INFO [trainer.py:765] (2/8) Epoch 17, batch 1200, train_loss[loss=2.972, ArTop10Accuracy=0.7206, over 12667.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7434, over 11937.41 frames. ], batch size: 100, lr: 7.01e-03 2024-08-06 06:10:15,269 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 06:11:23,101 INFO [trainer.py:765] (2/8) Epoch 18, batch 100, train_loss[loss=2.985, ArTop10Accuracy=0.7228, over 14784.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7484, over 4797.01 frames. ], batch size: 62, lr: 6.78e-03 2024-08-06 06:12:16,259 INFO [trainer.py:765] (2/8) Epoch 18, batch 200, train_loss[loss=2.809, ArTop10Accuracy=0.7482, over 13618.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7491, over 7816.57 frames. ], batch size: 34, lr: 6.77e-03 2024-08-06 06:12:40,317 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 06:12:48,991 INFO [trainer.py:811] (2/8) Epoch 18, validation: loss=2.916, ArTop10Accuracy=0.7343, over 1829298.00 frames. 2024-08-06 06:12:48,992 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33340MB 2024-08-06 06:12:49,335 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.163e+02 1.377e+02 1.476e+02 1.588e+02 2.450e+02, threshold=2.952e+02, percent-clipped=0.0 2024-08-06 06:13:07,116 INFO [trainer.py:765] (2/8) Epoch 18, batch 300, train_loss[loss=2.862, ArTop10Accuracy=0.747, over 14437.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7507, over 9441.11 frames. ], batch size: 44, lr: 6.75e-03 2024-08-06 06:13:54,098 INFO [trainer.py:765] (2/8) Epoch 18, batch 400, train_loss[loss=2.705, ArTop10Accuracy=0.7705, over 10374.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7506, over 10324.79 frames. ], batch size: 14, lr: 6.74e-03 2024-08-06 06:14:38,488 INFO [trainer.py:765] (2/8) Epoch 18, batch 500, train_loss[loss=2.831, ArTop10Accuracy=0.7445, over 12176.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7513, over 10888.06 frames. ], batch size: 22, lr: 6.73e-03 2024-08-06 06:15:23,628 INFO [trainer.py:765] (2/8) Epoch 18, batch 600, train_loss[loss=2.787, ArTop10Accuracy=0.7518, over 12008.00 frames. ], tot_loss[loss=2.814, ArTop10Accuracy=0.7496, over 11428.27 frames. ], batch size: 19, lr: 6.71e-03 2024-08-06 06:16:17,342 INFO [trainer.py:765] (2/8) Epoch 18, batch 700, train_loss[loss=2.535, ArTop10Accuracy=0.7898, over 10077.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7477, over 11595.39 frames. ], batch size: 12, lr: 6.70e-03 2024-08-06 06:16:51,428 INFO [trainer.py:765] (2/8) Epoch 18, batch 800, train_loss[loss=2.81, ArTop10Accuracy=0.7513, over 10196.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7458, over 11703.66 frames. ], batch size: 12, lr: 6.68e-03 2024-08-06 06:17:22,913 INFO [trainer.py:765] (2/8) Epoch 18, batch 900, train_loss[loss=2.845, ArTop10Accuracy=0.7459, over 12829.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7472, over 11742.87 frames. ], batch size: 27, lr: 6.67e-03 2024-08-06 06:17:54,529 INFO [trainer.py:765] (2/8) Epoch 18, batch 1000, train_loss[loss=2.83, ArTop10Accuracy=0.7482, over 12975.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7459, over 11949.79 frames. ], batch size: 27, lr: 6.65e-03 2024-08-06 06:18:25,664 INFO [trainer.py:765] (2/8) Epoch 18, batch 1100, train_loss[loss=2.837, ArTop10Accuracy=0.7422, over 13645.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.744, over 12011.69 frames. ], batch size: 34, lr: 6.64e-03 2024-08-06 06:18:55,972 INFO [trainer.py:765] (2/8) Epoch 18, batch 1200, train_loss[loss=3.045, ArTop10Accuracy=0.6986, over 11883.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7439, over 11955.71 frames. ], batch size: 97, lr: 6.63e-03 2024-08-06 06:19:19,163 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.178e+02 1.387e+02 1.492e+02 1.607e+02 2.982e+02, threshold=2.983e+02, percent-clipped=0.1 2024-08-06 06:19:23,732 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 06:20:29,727 INFO [trainer.py:765] (2/8) Epoch 19, batch 100, train_loss[loss=2.794, ArTop10Accuracy=0.7559, over 14034.00 frames. ], tot_loss[loss=2.8, ArTop10Accuracy=0.7529, over 4769.06 frames. ], batch size: 61, lr: 6.43e-03 2024-08-06 06:21:11,274 INFO [trainer.py:765] (2/8) Epoch 19, batch 200, train_loss[loss=2.774, ArTop10Accuracy=0.7534, over 13748.00 frames. ], tot_loss[loss=2.806, ArTop10Accuracy=0.752, over 7787.60 frames. ], batch size: 34, lr: 6.41e-03 2024-08-06 06:21:56,077 INFO [trainer.py:765] (2/8) Epoch 19, batch 300, train_loss[loss=2.935, ArTop10Accuracy=0.7293, over 14360.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7514, over 9416.01 frames. ], batch size: 44, lr: 6.40e-03 2024-08-06 06:22:36,014 INFO [trainer.py:765] (2/8) Epoch 19, batch 400, train_loss[loss=2.635, ArTop10Accuracy=0.7794, over 10369.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.752, over 10332.28 frames. ], batch size: 14, lr: 6.39e-03 2024-08-06 06:23:18,997 INFO [trainer.py:765] (2/8) Epoch 19, batch 500, train_loss[loss=2.699, ArTop10Accuracy=0.774, over 12306.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7526, over 10926.80 frames. ], batch size: 22, lr: 6.37e-03 2024-08-06 06:24:03,685 INFO [trainer.py:765] (2/8) Epoch 19, batch 600, train_loss[loss=2.775, ArTop10Accuracy=0.7611, over 11535.00 frames. ], tot_loss[loss=2.806, ArTop10Accuracy=0.7513, over 11443.41 frames. ], batch size: 18, lr: 6.36e-03 2024-08-06 06:24:46,184 INFO [trainer.py:765] (2/8) Epoch 19, batch 700, train_loss[loss=2.588, ArTop10Accuracy=0.7913, over 10357.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7502, over 11593.46 frames. ], batch size: 12, lr: 6.35e-03 2024-08-06 06:25:22,355 INFO [trainer.py:765] (2/8) Epoch 19, batch 800, train_loss[loss=2.72, ArTop10Accuracy=0.7633, over 10066.00 frames. ], tot_loss[loss=2.816, ArTop10Accuracy=0.749, over 11715.86 frames. ], batch size: 12, lr: 6.33e-03 2024-08-06 06:25:53,624 INFO [trainer.py:765] (2/8) Epoch 19, batch 900, train_loss[loss=2.68, ArTop10Accuracy=0.7632, over 13113.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7503, over 11748.63 frames. ], batch size: 27, lr: 6.32e-03 2024-08-06 06:26:21,772 INFO [trainer.py:803] (2/8) Computing validation loss 2024-08-06 06:26:30,765 INFO [trainer.py:811] (2/8) Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 2024-08-06 06:26:30,766 INFO [trainer.py:814] (2/8) Maximum memory allocated so far is 33340MB 2024-08-06 06:26:31,053 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.198e+02 1.416e+02 1.525e+02 1.662e+02 2.849e+02, threshold=3.050e+02, percent-clipped=0.0 2024-08-06 06:26:34,032 INFO [trainer.py:765] (2/8) Epoch 19, batch 1000, train_loss[loss=2.78, ArTop10Accuracy=0.7612, over 12888.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7487, over 11942.50 frames. ], batch size: 27, lr: 6.31e-03 2024-08-06 06:27:05,190 INFO [trainer.py:765] (2/8) Epoch 19, batch 1100, train_loss[loss=2.828, ArTop10Accuracy=0.7477, over 13837.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7468, over 11994.26 frames. ], batch size: 34, lr: 6.30e-03 2024-08-06 06:27:35,454 INFO [trainer.py:765] (2/8) Epoch 19, batch 1200, train_loss[loss=2.986, ArTop10Accuracy=0.7209, over 12040.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7469, over 11918.07 frames. ], batch size: 101, lr: 6.28e-03 2024-08-06 06:28:00,542 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 06:29:08,984 INFO [trainer.py:765] (2/8) Epoch 20, batch 100, train_loss[loss=2.887, ArTop10Accuracy=0.7307, over 14720.00 frames. ], tot_loss[loss=2.791, ArTop10Accuracy=0.7544, over 4788.15 frames. ], batch size: 61, lr: 6.10e-03 2024-08-06 06:29:50,318 INFO [trainer.py:765] (2/8) Epoch 20, batch 200, train_loss[loss=2.798, ArTop10Accuracy=0.7507, over 13550.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.755, over 7785.68 frames. ], batch size: 34, lr: 6.09e-03 2024-08-06 06:30:37,105 INFO [trainer.py:765] (2/8) Epoch 20, batch 300, train_loss[loss=2.85, ArTop10Accuracy=0.746, over 14568.00 frames. ], tot_loss[loss=2.787, ArTop10Accuracy=0.7551, over 9422.94 frames. ], batch size: 44, lr: 6.08e-03 2024-08-06 06:31:16,354 INFO [trainer.py:765] (2/8) Epoch 20, batch 400, train_loss[loss=2.774, ArTop10Accuracy=0.7555, over 10346.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.7553, over 10315.08 frames. ], batch size: 14, lr: 6.07e-03 2024-08-06 06:32:03,758 INFO [trainer.py:765] (2/8) Epoch 20, batch 500, train_loss[loss=2.859, ArTop10Accuracy=0.7423, over 12227.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.755, over 10888.33 frames. ], batch size: 22, lr: 6.05e-03 2024-08-06 06:32:43,356 INFO [trainer.py:765] (2/8) Epoch 20, batch 600, train_loss[loss=2.668, ArTop10Accuracy=0.7744, over 11616.00 frames. ], tot_loss[loss=2.793, ArTop10Accuracy=0.7532, over 11413.38 frames. ], batch size: 18, lr: 6.04e-03 2024-08-06 06:33:36,751 INFO [trainer.py:765] (2/8) Epoch 20, batch 700, train_loss[loss=2.82, ArTop10Accuracy=0.7457, over 10272.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.751, over 11570.07 frames. ], batch size: 12, lr: 6.03e-03 2024-08-06 06:33:43,830 INFO [optim.py:386] (2/8) Clipping_scale=2.0, grad-norm quartiles 1.196e+02 1.417e+02 1.526e+02 1.639e+02 3.791e+02, threshold=3.052e+02, percent-clipped=0.1 2024-08-06 06:34:13,304 INFO [trainer.py:765] (2/8) Epoch 20, batch 800, train_loss[loss=2.598, ArTop10Accuracy=0.7936, over 10115.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7505, over 11678.62 frames. ], batch size: 12, lr: 6.02e-03 2024-08-06 06:34:44,580 INFO [trainer.py:765] (2/8) Epoch 20, batch 900, train_loss[loss=2.811, ArTop10Accuracy=0.7549, over 13034.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7514, over 11749.17 frames. ], batch size: 27, lr: 6.01e-03 2024-08-06 06:35:16,139 INFO [trainer.py:765] (2/8) Epoch 20, batch 1000, train_loss[loss=2.783, ArTop10Accuracy=0.7539, over 12951.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7507, over 11955.33 frames. ], batch size: 27, lr: 6.00e-03 2024-08-06 06:35:47,214 INFO [trainer.py:765] (2/8) Epoch 20, batch 1100, train_loss[loss=2.809, ArTop10Accuracy=0.7529, over 13720.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7487, over 12010.35 frames. ], batch size: 34, lr: 5.99e-03 2024-08-06 06:36:17,439 INFO [trainer.py:765] (2/8) Epoch 20, batch 1200, train_loss[loss=2.914, ArTop10Accuracy=0.7276, over 13237.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7483, over 11953.30 frames. ], batch size: 100, lr: 5.97e-03 2024-08-06 06:36:42,597 INFO [trainer.py:650] (2/8) Reaches end of dataloader. 2024-08-06 06:36:42,600 INFO [trainer.py:1069] (2/8) Done!