2024-08-06 03:39:40,361 INFO [trainer.py:870] (1/8) Training started
2024-08-06 03:39:40,362 INFO [trainer.py:889] (1/8) Device: cuda:1
2024-08-06 03:39:40,362 INFO [trainer.py:890] (1/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': 'main', 'icefall-git-sha1': '7d2e5f4-dirty', 'icefall-git-date': 'Tue Aug 6 02:59:12 2024', 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6865771', 'IP address': '0.104.195.107'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 1000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 03:39:40,362 INFO [trainer.py:892] (1/8) About to create model
2024-08-06 03:39:41,123 INFO [trainer.py:899] (1/8) Number of model parameters: 367386628
2024-08-06 03:39:41,939 INFO [trainer.py:914] (1/8) Using DDP
2024-08-06 03:39:44,000 INFO [datamodule.py:427] (1/8) About to get train cuts
2024-08-06 03:39:44,002 INFO [datamodule.py:434] (1/8) About to get dev cuts
2024-08-06 03:39:44,003 INFO [datamodule.py:292] (1/8) Disable SpecAugment
2024-08-06 03:39:44,003 INFO [datamodule.py:294] (1/8) About to create train dataset
2024-08-06 03:39:44,003 INFO [datamodule.py:323] (1/8) Using DynamicBucketingSampler
2024-08-06 03:39:44,618 INFO [datamodule.py:344] (1/8) About to create train dataloader
2024-08-06 03:39:44,618 INFO [datamodule.py:367] (1/8) About to create dev dataset
2024-08-06 03:39:44,948 INFO [datamodule.py:388] (1/8) About to create dev dataloader
2024-08-06 03:40:39,569 INFO [trainer.py:765] (1/8) Epoch 1, batch 100, train_loss[loss=4.211, ArTop10Accuracy=0.4941, over 14685.00 frames. ], tot_loss[loss=4.797, ArTop10Accuracy=0.3935, over 4777.19 frames. 
], batch size: 61, lr: 2.25e-02 2024-08-06 03:41:16,921 INFO [trainer.py:765] (1/8) Epoch 1, batch 200, train_loss[loss=3.868, ArTop10Accuracy=0.5479, over 13852.00 frames. ], tot_loss[loss=4.306, ArTop10Accuracy=0.4751, over 7786.72 frames. ], batch size: 34, lr: 3.00e-02 2024-08-06 03:41:57,950 INFO [trainer.py:765] (1/8) Epoch 1, batch 300, train_loss[loss=3.971, ArTop10Accuracy=0.5225, over 14357.00 frames. ], tot_loss[loss=4.088, ArTop10Accuracy=0.5102, over 9419.77 frames. ], batch size: 44, lr: 3.00e-02 2024-08-06 03:42:33,079 INFO [trainer.py:765] (1/8) Epoch 1, batch 400, train_loss[loss=3.741, ArTop10Accuracy=0.568, over 10105.00 frames. ], tot_loss[loss=3.933, ArTop10Accuracy=0.5359, over 10340.30 frames. ], batch size: 14, lr: 3.00e-02 2024-08-06 03:43:11,270 INFO [trainer.py:765] (1/8) Epoch 1, batch 500, train_loss[loss=3.502, ArTop10Accuracy=0.5975, over 12171.00 frames. ], tot_loss[loss=3.821, ArTop10Accuracy=0.5547, over 10900.33 frames. ], batch size: 22, lr: 2.99e-02 2024-08-06 03:43:46,592 INFO [trainer.py:765] (1/8) Epoch 1, batch 600, train_loss[loss=3.568, ArTop10Accuracy=0.5968, over 11385.00 frames. ], tot_loss[loss=3.742, ArTop10Accuracy=0.5685, over 11432.22 frames. ], batch size: 18, lr: 2.99e-02 2024-08-06 03:44:27,898 INFO [trainer.py:765] (1/8) Epoch 1, batch 700, train_loss[loss=3.5, ArTop10Accuracy=0.6223, over 10264.00 frames. ], tot_loss[loss=3.682, ArTop10Accuracy=0.5792, over 11565.69 frames. ], batch size: 12, lr: 2.99e-02 2024-08-06 03:45:01,514 INFO [trainer.py:765] (1/8) Epoch 1, batch 800, train_loss[loss=3.455, ArTop10Accuracy=0.6257, over 10082.00 frames. ], tot_loss[loss=3.635, ArTop10Accuracy=0.5877, over 11680.27 frames. ], batch size: 12, lr: 2.98e-02 2024-08-06 03:45:32,557 INFO [trainer.py:765] (1/8) Epoch 1, batch 900, train_loss[loss=3.517, ArTop10Accuracy=0.6102, over 12948.00 frames. ], tot_loss[loss=3.581, ArTop10Accuracy=0.5978, over 11721.27 frames. ], batch size: 27, lr: 2.98e-02 2024-08-06 03:46:03,649 INFO [trainer.py:765] (1/8) Epoch 1, batch 1000, train_loss[loss=3.477, ArTop10Accuracy=0.619, over 12944.00 frames. ], tot_loss[loss=3.553, ArTop10Accuracy=0.603, over 11943.35 frames. ], batch size: 27, lr: 2.97e-02 2024-08-06 03:46:07,989 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 8.169e+01 1.565e+02 2.239e+02 3.485e+02 9.105e+03, threshold=4.478e+02, percent-clipped=0.0 2024-08-06 03:46:38,611 INFO [trainer.py:765] (1/8) Epoch 1, batch 1100, train_loss[loss=3.417, ArTop10Accuracy=0.6279, over 13577.00 frames. ], tot_loss[loss=3.53, ArTop10Accuracy=0.607, over 11979.20 frames. ], batch size: 34, lr: 2.96e-02 2024-08-06 03:47:08,745 INFO [trainer.py:765] (1/8) Epoch 1, batch 1200, train_loss[loss=3.569, ArTop10Accuracy=0.6013, over 11880.00 frames. ], tot_loss[loss=3.502, ArTop10Accuracy=0.6125, over 11919.95 frames. ], batch size: 97, lr: 2.96e-02 2024-08-06 03:47:33,462 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 03:48:38,676 INFO [trainer.py:765] (1/8) Epoch 2, batch 100, train_loss[loss=3.514, ArTop10Accuracy=0.6095, over 14636.00 frames. ], tot_loss[loss=3.453, ArTop10Accuracy=0.6225, over 4803.63 frames. ], batch size: 61, lr: 2.90e-02 2024-08-06 03:49:14,596 INFO [trainer.py:765] (1/8) Epoch 2, batch 200, train_loss[loss=3.284, ArTop10Accuracy=0.6556, over 13788.00 frames. ], tot_loss[loss=3.433, ArTop10Accuracy=0.6259, over 7786.74 frames. 
], batch size: 34, lr: 2.89e-02 2024-08-06 03:49:56,519 INFO [trainer.py:765] (1/8) Epoch 2, batch 300, train_loss[loss=3.493, ArTop10Accuracy=0.6161, over 14222.00 frames. ], tot_loss[loss=3.42, ArTop10Accuracy=0.6286, over 9427.07 frames. ], batch size: 44, lr: 2.89e-02 2024-08-06 03:50:31,999 INFO [trainer.py:765] (1/8) Epoch 2, batch 400, train_loss[loss=3.347, ArTop10Accuracy=0.6448, over 10418.00 frames. ], tot_loss[loss=3.409, ArTop10Accuracy=0.631, over 10350.14 frames. ], batch size: 14, lr: 2.88e-02 2024-08-06 03:51:17,109 INFO [trainer.py:765] (1/8) Epoch 2, batch 500, train_loss[loss=3.394, ArTop10Accuracy=0.6314, over 12365.00 frames. ], tot_loss[loss=3.396, ArTop10Accuracy=0.6334, over 10903.72 frames. ], batch size: 22, lr: 2.87e-02 2024-08-06 03:51:53,203 INFO [trainer.py:765] (1/8) Epoch 2, batch 600, train_loss[loss=3.394, ArTop10Accuracy=0.6319, over 11559.00 frames. ], tot_loss[loss=3.387, ArTop10Accuracy=0.6348, over 11422.82 frames. ], batch size: 18, lr: 2.86e-02 2024-08-06 03:52:38,993 INFO [trainer.py:765] (1/8) Epoch 2, batch 700, train_loss[loss=3.332, ArTop10Accuracy=0.6517, over 10076.00 frames. ], tot_loss[loss=3.388, ArTop10Accuracy=0.6347, over 11562.03 frames. ], batch size: 12, lr: 2.85e-02 2024-08-06 03:52:47,090 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 03:52:56,023 INFO [trainer.py:811] (1/8) Epoch 2, validation: loss=3.327, ArTop10Accuracy=0.6492, over 1829298.00 frames. 2024-08-06 03:52:56,024 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 28727MB 2024-08-06 03:52:56,541 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 8.181e+01 1.431e+02 1.849e+02 2.730e+02 2.344e+03, threshold=3.697e+02, percent-clipped=7.2 2024-08-06 03:53:21,881 INFO [trainer.py:765] (1/8) Epoch 2, batch 800, train_loss[loss=3.398, ArTop10Accuracy=0.6388, over 9367.00 frames. ], tot_loss[loss=3.383, ArTop10Accuracy=0.6358, over 11655.20 frames. ], batch size: 11, lr: 2.84e-02 2024-08-06 03:53:53,299 INFO [trainer.py:765] (1/8) Epoch 2, batch 900, train_loss[loss=3.391, ArTop10Accuracy=0.6392, over 12796.00 frames. ], tot_loss[loss=3.366, ArTop10Accuracy=0.639, over 11709.70 frames. ], batch size: 27, lr: 2.83e-02 2024-08-06 03:54:24,808 INFO [trainer.py:765] (1/8) Epoch 2, batch 1000, train_loss[loss=3.409, ArTop10Accuracy=0.6317, over 13004.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.64, over 11923.32 frames. ], batch size: 27, lr: 2.82e-02 2024-08-06 03:54:56,006 INFO [trainer.py:765] (1/8) Epoch 2, batch 1100, train_loss[loss=3.398, ArTop10Accuracy=0.6314, over 13842.00 frames. ], tot_loss[loss=3.36, ArTop10Accuracy=0.6402, over 11993.47 frames. ], batch size: 34, lr: 2.81e-02 2024-08-06 03:55:26,228 INFO [trainer.py:765] (1/8) Epoch 2, batch 1200, train_loss[loss=3.483, ArTop10Accuracy=0.6179, over 12353.00 frames. ], tot_loss[loss=3.348, ArTop10Accuracy=0.6423, over 11942.25 frames. ], batch size: 97, lr: 2.80e-02 2024-08-06 03:55:51,272 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 03:57:04,101 INFO [trainer.py:765] (1/8) Epoch 3, batch 100, train_loss[loss=3.333, ArTop10Accuracy=0.6494, over 14623.00 frames. ], tot_loss[loss=3.311, ArTop10Accuracy=0.651, over 4773.76 frames. ], batch size: 61, lr: 2.67e-02 2024-08-06 03:57:50,979 INFO [trainer.py:765] (1/8) Epoch 3, batch 200, train_loss[loss=3.328, ArTop10Accuracy=0.65, over 13756.00 frames. ], tot_loss[loss=3.288, ArTop10Accuracy=0.6547, over 7779.65 frames. 
], batch size: 34, lr: 2.66e-02 2024-08-06 03:58:26,073 INFO [trainer.py:765] (1/8) Epoch 3, batch 300, train_loss[loss=3.245, ArTop10Accuracy=0.6634, over 14083.00 frames. ], tot_loss[loss=3.274, ArTop10Accuracy=0.6572, over 9407.73 frames. ], batch size: 44, lr: 2.64e-02 2024-08-06 03:59:11,252 INFO [trainer.py:765] (1/8) Epoch 3, batch 400, train_loss[loss=3.083, ArTop10Accuracy=0.6934, over 10398.00 frames. ], tot_loss[loss=3.265, ArTop10Accuracy=0.6589, over 10326.49 frames. ], batch size: 14, lr: 2.63e-02 2024-08-06 03:59:29,675 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 8.720e+01 1.461e+02 1.775e+02 2.344e+02 9.150e+02, threshold=3.550e+02, percent-clipped=5.2 2024-08-06 03:59:49,302 INFO [trainer.py:765] (1/8) Epoch 3, batch 500, train_loss[loss=3.104, ArTop10Accuracy=0.6816, over 12161.00 frames. ], tot_loss[loss=3.253, ArTop10Accuracy=0.6614, over 10898.01 frames. ], batch size: 22, lr: 2.62e-02 2024-08-06 04:00:35,094 INFO [trainer.py:765] (1/8) Epoch 3, batch 600, train_loss[loss=3.198, ArTop10Accuracy=0.6762, over 11433.00 frames. ], tot_loss[loss=3.237, ArTop10Accuracy=0.6643, over 11425.47 frames. ], batch size: 18, lr: 2.61e-02 2024-08-06 04:01:22,057 INFO [trainer.py:765] (1/8) Epoch 3, batch 700, train_loss[loss=3.226, ArTop10Accuracy=0.6768, over 10103.00 frames. ], tot_loss[loss=3.232, ArTop10Accuracy=0.6655, over 11560.08 frames. ], batch size: 12, lr: 2.60e-02 2024-08-06 04:01:56,268 INFO [trainer.py:765] (1/8) Epoch 3, batch 800, train_loss[loss=2.973, ArTop10Accuracy=0.7149, over 10194.00 frames. ], tot_loss[loss=3.225, ArTop10Accuracy=0.6668, over 11687.74 frames. ], batch size: 12, lr: 2.59e-02 2024-08-06 04:02:27,739 INFO [trainer.py:765] (1/8) Epoch 3, batch 900, train_loss[loss=3.063, ArTop10Accuracy=0.6936, over 12768.00 frames. ], tot_loss[loss=3.206, ArTop10Accuracy=0.6706, over 11752.15 frames. ], batch size: 27, lr: 2.57e-02 2024-08-06 04:02:59,282 INFO [trainer.py:765] (1/8) Epoch 3, batch 1000, train_loss[loss=3.204, ArTop10Accuracy=0.6693, over 12887.00 frames. ], tot_loss[loss=3.196, ArTop10Accuracy=0.6724, over 11927.65 frames. ], batch size: 27, lr: 2.56e-02 2024-08-06 04:03:30,941 INFO [trainer.py:765] (1/8) Epoch 3, batch 1100, train_loss[loss=3.202, ArTop10Accuracy=0.6681, over 13940.00 frames. ], tot_loss[loss=3.191, ArTop10Accuracy=0.6736, over 12000.62 frames. ], batch size: 34, lr: 2.55e-02 2024-08-06 04:04:01,311 INFO [trainer.py:765] (1/8) Epoch 3, batch 1200, train_loss[loss=3.286, ArTop10Accuracy=0.6576, over 12261.00 frames. ], tot_loss[loss=3.183, ArTop10Accuracy=0.6754, over 11938.78 frames. ], batch size: 98, lr: 2.54e-02 2024-08-06 04:04:26,857 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:05:43,369 INFO [trainer.py:765] (1/8) Epoch 4, batch 100, train_loss[loss=3.22, ArTop10Accuracy=0.6675, over 14405.00 frames. ], tot_loss[loss=3.141, ArTop10Accuracy=0.6839, over 4771.98 frames. ], batch size: 61, lr: 2.38e-02 2024-08-06 04:06:07,077 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 04:06:16,404 INFO [trainer.py:811] (1/8) Epoch 4, validation: loss=3.063, ArTop10Accuracy=0.7031, over 1829298.00 frames. 
2024-08-06 04:06:16,404 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29537MB 2024-08-06 04:06:16,746 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.091e+02 1.493e+02 1.709e+02 2.068e+02 7.969e+02, threshold=3.418e+02, percent-clipped=2.9 2024-08-06 04:06:31,825 INFO [trainer.py:765] (1/8) Epoch 4, batch 200, train_loss[loss=3.197, ArTop10Accuracy=0.682, over 13803.00 frames. ], tot_loss[loss=3.124, ArTop10Accuracy=0.6873, over 7792.86 frames. ], batch size: 34, lr: 2.37e-02 2024-08-06 04:07:18,544 INFO [trainer.py:765] (1/8) Epoch 4, batch 300, train_loss[loss=3.211, ArTop10Accuracy=0.6781, over 14502.00 frames. ], tot_loss[loss=3.117, ArTop10Accuracy=0.6887, over 9418.04 frames. ], batch size: 45, lr: 2.36e-02 2024-08-06 04:08:01,910 INFO [trainer.py:765] (1/8) Epoch 4, batch 400, train_loss[loss=3.034, ArTop10Accuracy=0.7056, over 11117.00 frames. ], tot_loss[loss=3.116, ArTop10Accuracy=0.6889, over 10328.79 frames. ], batch size: 15, lr: 2.34e-02 2024-08-06 04:08:45,344 INFO [trainer.py:765] (1/8) Epoch 4, batch 500, train_loss[loss=3.084, ArTop10Accuracy=0.6913, over 12164.00 frames. ], tot_loss[loss=3.109, ArTop10Accuracy=0.6898, over 10898.32 frames. ], batch size: 22, lr: 2.33e-02 2024-08-06 04:09:37,072 INFO [trainer.py:765] (1/8) Epoch 4, batch 600, train_loss[loss=3.048, ArTop10Accuracy=0.6966, over 11971.00 frames. ], tot_loss[loss=3.111, ArTop10Accuracy=0.6894, over 11420.59 frames. ], batch size: 19, lr: 2.32e-02 2024-08-06 04:10:13,501 INFO [trainer.py:765] (1/8) Epoch 4, batch 700, train_loss[loss=3.067, ArTop10Accuracy=0.7017, over 10131.00 frames. ], tot_loss[loss=3.112, ArTop10Accuracy=0.6892, over 11573.58 frames. ], batch size: 12, lr: 2.31e-02 2024-08-06 04:10:51,959 INFO [trainer.py:765] (1/8) Epoch 4, batch 800, train_loss[loss=2.975, ArTop10Accuracy=0.7175, over 10435.00 frames. ], tot_loss[loss=3.111, ArTop10Accuracy=0.6893, over 11665.25 frames. ], batch size: 12, lr: 2.30e-02 2024-08-06 04:11:23,330 INFO [trainer.py:765] (1/8) Epoch 4, batch 900, train_loss[loss=3.09, ArTop10Accuracy=0.6959, over 12982.00 frames. ], tot_loss[loss=3.101, ArTop10Accuracy=0.6913, over 11731.81 frames. ], batch size: 27, lr: 2.29e-02 2024-08-06 04:11:54,826 INFO [trainer.py:765] (1/8) Epoch 4, batch 1000, train_loss[loss=3.044, ArTop10Accuracy=0.7061, over 12973.00 frames. ], tot_loss[loss=3.099, ArTop10Accuracy=0.692, over 11933.40 frames. ], batch size: 27, lr: 2.28e-02 2024-08-06 04:12:25,960 INFO [trainer.py:765] (1/8) Epoch 4, batch 1100, train_loss[loss=3.108, ArTop10Accuracy=0.6931, over 13630.00 frames. ], tot_loss[loss=3.106, ArTop10Accuracy=0.6905, over 12001.05 frames. ], batch size: 34, lr: 2.26e-02 2024-08-06 04:12:48,545 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.106e+02 1.440e+02 1.608e+02 1.893e+02 7.925e+02, threshold=3.216e+02, percent-clipped=2.0 2024-08-06 04:12:58,827 INFO [trainer.py:765] (1/8) Epoch 4, batch 1200, train_loss[loss=3.158, ArTop10Accuracy=0.678, over 12305.00 frames. ], tot_loss[loss=3.108, ArTop10Accuracy=0.6903, over 11931.65 frames. ], batch size: 98, lr: 2.25e-02 2024-08-06 04:13:24,356 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:14:38,685 INFO [trainer.py:765] (1/8) Epoch 5, batch 100, train_loss[loss=3.184, ArTop10Accuracy=0.6818, over 14605.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7008, over 4778.12 frames. 
], batch size: 61, lr: 2.10e-02 2024-08-06 04:15:26,826 INFO [trainer.py:765] (1/8) Epoch 5, batch 200, train_loss[loss=3.144, ArTop10Accuracy=0.6805, over 13768.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7013, over 7779.65 frames. ], batch size: 34, lr: 2.09e-02 2024-08-06 04:16:08,011 INFO [trainer.py:765] (1/8) Epoch 5, batch 300, train_loss[loss=2.999, ArTop10Accuracy=0.7113, over 14653.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7016, over 9413.83 frames. ], batch size: 45, lr: 2.08e-02 2024-08-06 04:16:53,134 INFO [trainer.py:765] (1/8) Epoch 5, batch 400, train_loss[loss=3.053, ArTop10Accuracy=0.7039, over 10298.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7016, over 10330.35 frames. ], batch size: 14, lr: 2.07e-02 2024-08-06 04:17:36,638 INFO [trainer.py:765] (1/8) Epoch 5, batch 500, train_loss[loss=3.051, ArTop10Accuracy=0.6982, over 12439.00 frames. ], tot_loss[loss=3.049, ArTop10Accuracy=0.702, over 10905.78 frames. ], batch size: 22, lr: 2.06e-02 2024-08-06 04:18:22,114 INFO [trainer.py:765] (1/8) Epoch 5, batch 600, train_loss[loss=3.094, ArTop10Accuracy=0.6999, over 11559.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7011, over 11426.15 frames. ], batch size: 18, lr: 2.05e-02 2024-08-06 04:19:17,033 INFO [trainer.py:765] (1/8) Epoch 5, batch 700, train_loss[loss=2.871, ArTop10Accuracy=0.7291, over 10005.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7004, over 11568.53 frames. ], batch size: 12, lr: 2.04e-02 2024-08-06 04:19:51,066 INFO [trainer.py:765] (1/8) Epoch 5, batch 800, train_loss[loss=3, ArTop10Accuracy=0.7126, over 10694.00 frames. ], tot_loss[loss=3.062, ArTop10Accuracy=0.6992, over 11693.41 frames. ], batch size: 13, lr: 2.03e-02 2024-08-06 04:20:18,214 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 04:20:27,476 INFO [trainer.py:811] (1/8) Epoch 5, validation: loss=2.998, ArTop10Accuracy=0.7157, over 1829298.00 frames. 2024-08-06 04:20:27,476 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29914MB 2024-08-06 04:20:27,781 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.057e+02 1.385e+02 1.542e+02 1.759e+02 7.741e+02, threshold=3.083e+02, percent-clipped=0.7 2024-08-06 04:20:31,766 INFO [trainer.py:765] (1/8) Epoch 5, batch 900, train_loss[loss=3.024, ArTop10Accuracy=0.7109, over 13120.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7012, over 11731.61 frames. ], batch size: 27, lr: 2.02e-02 2024-08-06 04:21:03,306 INFO [trainer.py:765] (1/8) Epoch 5, batch 1000, train_loss[loss=3.029, ArTop10Accuracy=0.7119, over 12783.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7009, over 11936.20 frames. ], batch size: 27, lr: 2.01e-02 2024-08-06 04:21:34,451 INFO [trainer.py:765] (1/8) Epoch 5, batch 1100, train_loss[loss=2.999, ArTop10Accuracy=0.7113, over 13838.00 frames. ], tot_loss[loss=3.062, ArTop10Accuracy=0.6995, over 11994.41 frames. ], batch size: 34, lr: 2.00e-02 2024-08-06 04:22:04,752 INFO [trainer.py:765] (1/8) Epoch 5, batch 1200, train_loss[loss=3.164, ArTop10Accuracy=0.6829, over 11790.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.7009, over 11929.62 frames. ], batch size: 98, lr: 1.99e-02 2024-08-06 04:22:30,194 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:23:46,282 INFO [trainer.py:765] (1/8) Epoch 6, batch 100, train_loss[loss=3.049, ArTop10Accuracy=0.7005, over 14328.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7089, over 4777.09 frames. 
], batch size: 61, lr: 1.85e-02 2024-08-06 04:24:35,255 INFO [trainer.py:765] (1/8) Epoch 6, batch 200, train_loss[loss=2.914, ArTop10Accuracy=0.7266, over 13723.00 frames. ], tot_loss[loss=3.015, ArTop10Accuracy=0.7104, over 7776.05 frames. ], batch size: 34, lr: 1.84e-02 2024-08-06 04:25:16,677 INFO [trainer.py:765] (1/8) Epoch 6, batch 300, train_loss[loss=3.049, ArTop10Accuracy=0.7045, over 14424.00 frames. ], tot_loss[loss=3.009, ArTop10Accuracy=0.711, over 9393.06 frames. ], batch size: 44, lr: 1.83e-02 2024-08-06 04:26:08,924 INFO [trainer.py:765] (1/8) Epoch 6, batch 400, train_loss[loss=2.863, ArTop10Accuracy=0.7268, over 10029.00 frames. ], tot_loss[loss=3.005, ArTop10Accuracy=0.711, over 10322.51 frames. ], batch size: 14, lr: 1.83e-02 2024-08-06 04:26:51,486 INFO [trainer.py:765] (1/8) Epoch 6, batch 500, train_loss[loss=2.995, ArTop10Accuracy=0.7079, over 12478.00 frames. ], tot_loss[loss=3.002, ArTop10Accuracy=0.7113, over 10891.41 frames. ], batch size: 22, lr: 1.82e-02 2024-08-06 04:27:39,298 INFO [trainer.py:765] (1/8) Epoch 6, batch 600, train_loss[loss=2.917, ArTop10Accuracy=0.7149, over 11750.00 frames. ], tot_loss[loss=3.007, ArTop10Accuracy=0.7099, over 11447.28 frames. ], batch size: 18, lr: 1.81e-02 2024-08-06 04:27:46,369 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.054e+02 1.343e+02 1.474e+02 1.660e+02 8.574e+02, threshold=2.947e+02, percent-clipped=0.6 2024-08-06 04:28:33,240 INFO [trainer.py:765] (1/8) Epoch 6, batch 700, train_loss[loss=2.854, ArTop10Accuracy=0.7402, over 10171.00 frames. ], tot_loss[loss=3.014, ArTop10Accuracy=0.7086, over 11581.22 frames. ], batch size: 12, lr: 1.80e-02 2024-08-06 04:29:11,216 INFO [trainer.py:765] (1/8) Epoch 6, batch 800, train_loss[loss=3.12, ArTop10Accuracy=0.6847, over 10061.00 frames. ], tot_loss[loss=3.018, ArTop10Accuracy=0.7082, over 11684.35 frames. ], batch size: 12, lr: 1.79e-02 2024-08-06 04:29:42,751 INFO [trainer.py:765] (1/8) Epoch 6, batch 900, train_loss[loss=3.003, ArTop10Accuracy=0.7076, over 13161.00 frames. ], tot_loss[loss=3.015, ArTop10Accuracy=0.7085, over 11729.41 frames. ], batch size: 27, lr: 1.78e-02 2024-08-06 04:30:14,306 INFO [trainer.py:765] (1/8) Epoch 6, batch 1000, train_loss[loss=3.037, ArTop10Accuracy=0.6993, over 12855.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7076, over 11934.75 frames. ], batch size: 27, lr: 1.77e-02 2024-08-06 04:30:45,383 INFO [trainer.py:765] (1/8) Epoch 6, batch 1100, train_loss[loss=2.996, ArTop10Accuracy=0.7164, over 14108.00 frames. ], tot_loss[loss=3.032, ArTop10Accuracy=0.7057, over 12000.46 frames. ], batch size: 34, lr: 1.77e-02 2024-08-06 04:31:15,673 INFO [trainer.py:765] (1/8) Epoch 6, batch 1200, train_loss[loss=3.169, ArTop10Accuracy=0.6784, over 11880.00 frames. ], tot_loss[loss=3.029, ArTop10Accuracy=0.706, over 11943.58 frames. ], batch size: 99, lr: 1.76e-02 2024-08-06 04:31:40,439 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:32:52,405 INFO [trainer.py:765] (1/8) Epoch 7, batch 100, train_loss[loss=3.047, ArTop10Accuracy=0.702, over 14422.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7148, over 4779.19 frames. ], batch size: 61, lr: 1.64e-02 2024-08-06 04:33:38,223 INFO [trainer.py:765] (1/8) Epoch 7, batch 200, train_loss[loss=2.943, ArTop10Accuracy=0.7228, over 13888.00 frames. ], tot_loss[loss=2.985, ArTop10Accuracy=0.7155, over 7795.94 frames. 
], batch size: 34, lr: 1.64e-02 2024-08-06 04:34:22,609 INFO [trainer.py:765] (1/8) Epoch 7, batch 300, train_loss[loss=2.972, ArTop10Accuracy=0.7223, over 14520.00 frames. ], tot_loss[loss=2.982, ArTop10Accuracy=0.716, over 9435.04 frames. ], batch size: 44, lr: 1.63e-02 2024-08-06 04:34:36,847 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 04:34:45,809 INFO [trainer.py:811] (1/8) Epoch 7, validation: loss=2.963, ArTop10Accuracy=0.7233, over 1829298.00 frames. 2024-08-06 04:34:45,809 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29914MB 2024-08-06 04:34:46,124 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.009e+02 1.306e+02 1.435e+02 1.599e+02 8.689e+02, threshold=2.871e+02, percent-clipped=0.9 2024-08-06 04:35:17,146 INFO [trainer.py:765] (1/8) Epoch 7, batch 400, train_loss[loss=2.893, ArTop10Accuracy=0.739, over 10205.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7148, over 10348.81 frames. ], batch size: 14, lr: 1.62e-02 2024-08-06 04:36:01,710 INFO [trainer.py:765] (1/8) Epoch 7, batch 500, train_loss[loss=2.904, ArTop10Accuracy=0.7278, over 12303.00 frames. ], tot_loss[loss=2.981, ArTop10Accuracy=0.7158, over 10908.06 frames. ], batch size: 22, lr: 1.61e-02 2024-08-06 04:36:48,811 INFO [trainer.py:765] (1/8) Epoch 7, batch 600, train_loss[loss=2.902, ArTop10Accuracy=0.7295, over 11582.00 frames. ], tot_loss[loss=2.982, ArTop10Accuracy=0.7153, over 11445.31 frames. ], batch size: 18, lr: 1.61e-02 2024-08-06 04:37:34,800 INFO [trainer.py:765] (1/8) Epoch 7, batch 700, train_loss[loss=2.89, ArTop10Accuracy=0.7312, over 9354.00 frames. ], tot_loss[loss=2.993, ArTop10Accuracy=0.7132, over 11556.36 frames. ], batch size: 11, lr: 1.60e-02 2024-08-06 04:38:13,613 INFO [trainer.py:765] (1/8) Epoch 7, batch 800, train_loss[loss=2.881, ArTop10Accuracy=0.7282, over 9319.00 frames. ], tot_loss[loss=2.996, ArTop10Accuracy=0.7125, over 11684.94 frames. ], batch size: 11, lr: 1.59e-02 2024-08-06 04:38:45,110 INFO [trainer.py:765] (1/8) Epoch 7, batch 900, train_loss[loss=2.983, ArTop10Accuracy=0.7156, over 13025.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7146, over 11730.77 frames. ], batch size: 27, lr: 1.59e-02 2024-08-06 04:39:16,574 INFO [trainer.py:765] (1/8) Epoch 7, batch 1000, train_loss[loss=3.091, ArTop10Accuracy=0.6947, over 12891.00 frames. ], tot_loss[loss=2.993, ArTop10Accuracy=0.7132, over 11931.30 frames. ], batch size: 27, lr: 1.58e-02 2024-08-06 04:39:47,571 INFO [trainer.py:765] (1/8) Epoch 7, batch 1100, train_loss[loss=3.078, ArTop10Accuracy=0.6948, over 13585.00 frames. ], tot_loss[loss=2.999, ArTop10Accuracy=0.7124, over 11996.35 frames. ], batch size: 34, lr: 1.57e-02 2024-08-06 04:40:17,989 INFO [trainer.py:765] (1/8) Epoch 7, batch 1200, train_loss[loss=3.127, ArTop10Accuracy=0.6918, over 12528.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.7121, over 11948.80 frames. ], batch size: 99, lr: 1.57e-02 2024-08-06 04:40:43,222 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:41:37,492 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 9.816e+01 1.295e+02 1.411e+02 1.574e+02 4.953e+02, threshold=2.821e+02, percent-clipped=1.1 2024-08-06 04:41:58,371 INFO [trainer.py:765] (1/8) Epoch 8, batch 100, train_loss[loss=3.097, ArTop10Accuracy=0.6954, over 14820.00 frames. ], tot_loss[loss=2.969, ArTop10Accuracy=0.719, over 4786.78 frames. 
], batch size: 61, lr: 1.47e-02 2024-08-06 04:42:44,986 INFO [trainer.py:765] (1/8) Epoch 8, batch 200, train_loss[loss=3.127, ArTop10Accuracy=0.6864, over 13833.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.719, over 7780.75 frames. ], batch size: 34, lr: 1.46e-02 2024-08-06 04:43:28,045 INFO [trainer.py:765] (1/8) Epoch 8, batch 300, train_loss[loss=2.93, ArTop10Accuracy=0.7206, over 14569.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7196, over 9393.57 frames. ], batch size: 44, lr: 1.46e-02 2024-08-06 04:44:14,462 INFO [trainer.py:765] (1/8) Epoch 8, batch 400, train_loss[loss=2.853, ArTop10Accuracy=0.7256, over 10176.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7198, over 10311.66 frames. ], batch size: 14, lr: 1.45e-02 2024-08-06 04:45:00,692 INFO [trainer.py:765] (1/8) Epoch 8, batch 500, train_loss[loss=2.91, ArTop10Accuracy=0.723, over 12279.00 frames. ], tot_loss[loss=2.957, ArTop10Accuracy=0.7206, over 10909.44 frames. ], batch size: 22, lr: 1.45e-02 2024-08-06 04:45:45,394 INFO [trainer.py:765] (1/8) Epoch 8, batch 600, train_loss[loss=2.965, ArTop10Accuracy=0.7214, over 11599.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7196, over 11431.46 frames. ], batch size: 18, lr: 1.44e-02 2024-08-06 04:46:34,038 INFO [trainer.py:765] (1/8) Epoch 8, batch 700, train_loss[loss=2.924, ArTop10Accuracy=0.7204, over 9422.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.7186, over 11568.95 frames. ], batch size: 11, lr: 1.43e-02 2024-08-06 04:47:10,208 INFO [trainer.py:765] (1/8) Epoch 8, batch 800, train_loss[loss=2.762, ArTop10Accuracy=0.7587, over 9953.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7178, over 11703.51 frames. ], batch size: 12, lr: 1.43e-02 2024-08-06 04:47:41,606 INFO [trainer.py:765] (1/8) Epoch 8, batch 900, train_loss[loss=2.975, ArTop10Accuracy=0.7147, over 12770.00 frames. ], tot_loss[loss=2.968, ArTop10Accuracy=0.7183, over 11749.70 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 04:48:13,032 INFO [trainer.py:765] (1/8) Epoch 8, batch 1000, train_loss[loss=2.895, ArTop10Accuracy=0.7247, over 13061.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.7172, over 11954.97 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 04:48:28,827 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 04:48:37,663 INFO [trainer.py:811] (1/8) Epoch 8, validation: loss=2.946, ArTop10Accuracy=0.7266, over 1829298.00 frames. 2024-08-06 04:48:37,664 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 29914MB 2024-08-06 04:48:37,951 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.289e+02 1.393e+02 1.532e+02 3.557e+02, threshold=2.786e+02, percent-clipped=0.2 2024-08-06 04:48:52,931 INFO [trainer.py:765] (1/8) Epoch 8, batch 1100, train_loss[loss=3.083, ArTop10Accuracy=0.7016, over 13744.00 frames. ], tot_loss[loss=2.977, ArTop10Accuracy=0.7163, over 11994.55 frames. ], batch size: 34, lr: 1.41e-02 2024-08-06 04:49:23,202 INFO [trainer.py:765] (1/8) Epoch 8, batch 1200, train_loss[loss=3.142, ArTop10Accuracy=0.6871, over 11796.00 frames. ], tot_loss[loss=2.98, ArTop10Accuracy=0.7156, over 11937.07 frames. ], batch size: 98, lr: 1.40e-02 2024-08-06 04:49:49,198 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:51:01,547 INFO [trainer.py:765] (1/8) Epoch 9, batch 100, train_loss[loss=3.043, ArTop10Accuracy=0.7075, over 14747.00 frames. ], tot_loss[loss=2.952, ArTop10Accuracy=0.7225, over 4779.52 frames. 
], batch size: 61, lr: 1.32e-02 2024-08-06 04:51:45,414 INFO [trainer.py:765] (1/8) Epoch 9, batch 200, train_loss[loss=2.9, ArTop10Accuracy=0.7339, over 13749.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7247, over 7784.70 frames. ], batch size: 34, lr: 1.32e-02 2024-08-06 04:52:29,082 INFO [trainer.py:765] (1/8) Epoch 9, batch 300, train_loss[loss=2.947, ArTop10Accuracy=0.7237, over 14498.00 frames. ], tot_loss[loss=2.934, ArTop10Accuracy=0.7258, over 9413.47 frames. ], batch size: 44, lr: 1.31e-02 2024-08-06 04:53:16,431 INFO [trainer.py:765] (1/8) Epoch 9, batch 400, train_loss[loss=2.898, ArTop10Accuracy=0.7289, over 10396.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.7248, over 10343.48 frames. ], batch size: 14, lr: 1.31e-02 2024-08-06 04:53:58,144 INFO [trainer.py:765] (1/8) Epoch 9, batch 500, train_loss[loss=3.009, ArTop10Accuracy=0.7046, over 12482.00 frames. ], tot_loss[loss=2.936, ArTop10Accuracy=0.7248, over 10908.31 frames. ], batch size: 22, lr: 1.30e-02 2024-08-06 04:54:51,077 INFO [trainer.py:765] (1/8) Epoch 9, batch 600, train_loss[loss=2.88, ArTop10Accuracy=0.7351, over 11555.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7239, over 11443.87 frames. ], batch size: 18, lr: 1.30e-02 2024-08-06 04:55:34,399 INFO [trainer.py:765] (1/8) Epoch 9, batch 700, train_loss[loss=2.952, ArTop10Accuracy=0.7199, over 9217.00 frames. ], tot_loss[loss=2.948, ArTop10Accuracy=0.7221, over 11577.60 frames. ], batch size: 11, lr: 1.29e-02 2024-08-06 04:56:04,575 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.257e+02 1.367e+02 1.507e+02 8.820e+02, threshold=2.735e+02, percent-clipped=0.5 2024-08-06 04:56:13,597 INFO [trainer.py:765] (1/8) Epoch 9, batch 800, train_loss[loss=2.97, ArTop10Accuracy=0.7133, over 9305.00 frames. ], tot_loss[loss=2.947, ArTop10Accuracy=0.7222, over 11673.87 frames. ], batch size: 11, lr: 1.29e-02 2024-08-06 04:56:44,975 INFO [trainer.py:765] (1/8) Epoch 9, batch 900, train_loss[loss=2.84, ArTop10Accuracy=0.7408, over 13126.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.7233, over 11726.70 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 04:57:16,491 INFO [trainer.py:765] (1/8) Epoch 9, batch 1000, train_loss[loss=2.927, ArTop10Accuracy=0.7269, over 12842.00 frames. ], tot_loss[loss=2.95, ArTop10Accuracy=0.7215, over 11924.86 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 04:57:47,656 INFO [trainer.py:765] (1/8) Epoch 9, batch 1100, train_loss[loss=2.928, ArTop10Accuracy=0.7262, over 13847.00 frames. ], tot_loss[loss=2.96, ArTop10Accuracy=0.7201, over 11978.23 frames. ], batch size: 34, lr: 1.27e-02 2024-08-06 04:58:18,093 INFO [trainer.py:765] (1/8) Epoch 9, batch 1200, train_loss[loss=3.087, ArTop10Accuracy=0.7016, over 13113.00 frames. ], tot_loss[loss=2.958, ArTop10Accuracy=0.7204, over 11958.80 frames. ], batch size: 97, lr: 1.27e-02 2024-08-06 04:58:43,245 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 04:59:52,748 INFO [trainer.py:765] (1/8) Epoch 10, batch 100, train_loss[loss=3.024, ArTop10Accuracy=0.7094, over 14484.00 frames. ], tot_loss[loss=2.922, ArTop10Accuracy=0.7288, over 4805.25 frames. ], batch size: 61, lr: 1.20e-02 2024-08-06 05:00:43,730 INFO [trainer.py:765] (1/8) Epoch 10, batch 200, train_loss[loss=2.825, ArTop10Accuracy=0.7417, over 13692.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7285, over 7805.84 frames. 
], batch size: 34, lr: 1.20e-02 2024-08-06 05:01:20,591 INFO [trainer.py:765] (1/8) Epoch 10, batch 300, train_loss[loss=2.978, ArTop10Accuracy=0.7133, over 14424.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7282, over 9437.15 frames. ], batch size: 44, lr: 1.19e-02 2024-08-06 05:02:10,047 INFO [trainer.py:765] (1/8) Epoch 10, batch 400, train_loss[loss=2.874, ArTop10Accuracy=0.7338, over 10806.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7281, over 10347.74 frames. ], batch size: 15, lr: 1.19e-02 2024-08-06 05:02:46,487 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 05:02:55,378 INFO [trainer.py:811] (1/8) Epoch 10, validation: loss=2.927, ArTop10Accuracy=0.7304, over 1829298.00 frames. 2024-08-06 05:02:55,379 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 05:02:55,728 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.023e+02 1.269e+02 1.367e+02 1.518e+02 4.405e+02, threshold=2.733e+02, percent-clipped=0.4 2024-08-06 05:02:58,361 INFO [trainer.py:765] (1/8) Epoch 10, batch 500, train_loss[loss=2.849, ArTop10Accuracy=0.7416, over 12035.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7277, over 10890.25 frames. ], batch size: 22, lr: 1.19e-02 2024-08-06 05:03:48,229 INFO [trainer.py:765] (1/8) Epoch 10, batch 600, train_loss[loss=2.751, ArTop10Accuracy=0.7489, over 11510.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7265, over 11425.54 frames. ], batch size: 18, lr: 1.18e-02 2024-08-06 05:04:36,715 INFO [trainer.py:765] (1/8) Epoch 10, batch 700, train_loss[loss=2.867, ArTop10Accuracy=0.7277, over 10102.00 frames. ], tot_loss[loss=2.932, ArTop10Accuracy=0.7254, over 11561.23 frames. ], batch size: 12, lr: 1.18e-02 2024-08-06 05:05:10,725 INFO [trainer.py:765] (1/8) Epoch 10, batch 800, train_loss[loss=2.75, ArTop10Accuracy=0.7562, over 10174.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.7242, over 11665.97 frames. ], batch size: 12, lr: 1.17e-02 2024-08-06 05:05:42,245 INFO [trainer.py:765] (1/8) Epoch 10, batch 900, train_loss[loss=2.795, ArTop10Accuracy=0.7538, over 13100.00 frames. ], tot_loss[loss=2.93, ArTop10Accuracy=0.7254, over 11717.16 frames. ], batch size: 27, lr: 1.17e-02 2024-08-06 05:06:13,844 INFO [trainer.py:765] (1/8) Epoch 10, batch 1000, train_loss[loss=2.856, ArTop10Accuracy=0.7444, over 13035.00 frames. ], tot_loss[loss=2.933, ArTop10Accuracy=0.7251, over 11937.94 frames. ], batch size: 27, lr: 1.16e-02 2024-08-06 05:06:45,056 INFO [trainer.py:765] (1/8) Epoch 10, batch 1100, train_loss[loss=3.035, ArTop10Accuracy=0.6976, over 13713.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.724, over 12000.95 frames. ], batch size: 34, lr: 1.16e-02 2024-08-06 05:07:15,484 INFO [trainer.py:765] (1/8) Epoch 10, batch 1200, train_loss[loss=3.091, ArTop10Accuracy=0.688, over 11661.00 frames. ], tot_loss[loss=2.944, ArTop10Accuracy=0.723, over 11942.60 frames. ], batch size: 98, lr: 1.16e-02 2024-08-06 05:07:40,804 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 05:08:52,966 INFO [trainer.py:765] (1/8) Epoch 11, batch 100, train_loss[loss=2.952, ArTop10Accuracy=0.7234, over 14658.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7313, over 4795.02 frames. ], batch size: 61, lr: 1.10e-02 2024-08-06 05:09:41,277 INFO [trainer.py:765] (1/8) Epoch 11, batch 200, train_loss[loss=2.837, ArTop10Accuracy=0.7443, over 13437.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7308, over 7799.60 frames. 
], batch size: 34, lr: 1.10e-02 2024-08-06 05:09:51,176 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.001e+02 1.278e+02 1.371e+02 1.502e+02 3.785e+02, threshold=2.743e+02, percent-clipped=0.3 2024-08-06 05:10:24,720 INFO [trainer.py:765] (1/8) Epoch 11, batch 300, train_loss[loss=2.882, ArTop10Accuracy=0.7337, over 14214.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7312, over 9428.44 frames. ], batch size: 44, lr: 1.09e-02 2024-08-06 05:11:11,784 INFO [trainer.py:765] (1/8) Epoch 11, batch 400, train_loss[loss=2.776, ArTop10Accuracy=0.7525, over 10812.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7311, over 10349.47 frames. ], batch size: 15, lr: 1.09e-02 2024-08-06 05:11:52,691 INFO [trainer.py:765] (1/8) Epoch 11, batch 500, train_loss[loss=2.832, ArTop10Accuracy=0.7447, over 12256.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.7312, over 10905.08 frames. ], batch size: 22, lr: 1.09e-02 2024-08-06 05:12:40,287 INFO [trainer.py:765] (1/8) Epoch 11, batch 600, train_loss[loss=2.758, ArTop10Accuracy=0.7534, over 11557.00 frames. ], tot_loss[loss=2.91, ArTop10Accuracy=0.7301, over 11431.45 frames. ], batch size: 18, lr: 1.08e-02 2024-08-06 05:13:25,708 INFO [trainer.py:765] (1/8) Epoch 11, batch 700, train_loss[loss=2.842, ArTop10Accuracy=0.7449, over 10276.00 frames. ], tot_loss[loss=2.917, ArTop10Accuracy=0.7284, over 11603.02 frames. ], batch size: 12, lr: 1.08e-02 2024-08-06 05:14:04,205 INFO [trainer.py:765] (1/8) Epoch 11, batch 800, train_loss[loss=2.886, ArTop10Accuracy=0.7415, over 10103.00 frames. ], tot_loss[loss=2.92, ArTop10Accuracy=0.7281, over 11708.31 frames. ], batch size: 12, lr: 1.07e-02 2024-08-06 05:14:35,667 INFO [trainer.py:765] (1/8) Epoch 11, batch 900, train_loss[loss=2.983, ArTop10Accuracy=0.7167, over 12986.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7294, over 11763.77 frames. ], batch size: 27, lr: 1.07e-02 2024-08-06 05:15:07,263 INFO [trainer.py:765] (1/8) Epoch 11, batch 1000, train_loss[loss=2.839, ArTop10Accuracy=0.7413, over 12999.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7289, over 11951.15 frames. ], batch size: 27, lr: 1.07e-02 2024-08-06 05:15:38,260 INFO [trainer.py:765] (1/8) Epoch 11, batch 1100, train_loss[loss=3, ArTop10Accuracy=0.7166, over 13743.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7277, over 11998.49 frames. ], batch size: 34, lr: 1.06e-02 2024-08-06 05:16:08,498 INFO [trainer.py:765] (1/8) Epoch 11, batch 1200, train_loss[loss=3.05, ArTop10Accuracy=0.7029, over 11881.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7275, over 11939.46 frames. ], batch size: 99, lr: 1.06e-02 2024-08-06 05:16:12,697 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 05:16:21,623 INFO [trainer.py:811] (1/8) Epoch 11, validation: loss=2.923, ArTop10Accuracy=0.7318, over 1829298.00 frames. 2024-08-06 05:16:21,623 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 05:16:21,949 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.076e+02 1.268e+02 1.368e+02 1.481e+02 4.790e+02, threshold=2.736e+02, percent-clipped=0.6 2024-08-06 05:16:42,524 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 05:18:03,004 INFO [trainer.py:765] (1/8) Epoch 12, batch 100, train_loss[loss=2.948, ArTop10Accuracy=0.7227, over 14766.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7349, over 4803.10 frames. 
], batch size: 61, lr: 1.01e-02 2024-08-06 05:18:46,003 INFO [trainer.py:765] (1/8) Epoch 12, batch 200, train_loss[loss=2.996, ArTop10Accuracy=0.7162, over 13847.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.7347, over 7795.71 frames. ], batch size: 34, lr: 1.01e-02 2024-08-06 05:19:31,946 INFO [trainer.py:765] (1/8) Epoch 12, batch 300, train_loss[loss=2.957, ArTop10Accuracy=0.7282, over 14435.00 frames. ], tot_loss[loss=2.889, ArTop10Accuracy=0.7351, over 9411.90 frames. ], batch size: 44, lr: 1.01e-02 2024-08-06 05:20:12,430 INFO [trainer.py:765] (1/8) Epoch 12, batch 400, train_loss[loss=2.886, ArTop10Accuracy=0.7416, over 10391.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7348, over 10326.32 frames. ], batch size: 14, lr: 1.00e-02 2024-08-06 05:21:00,639 INFO [trainer.py:765] (1/8) Epoch 12, batch 500, train_loss[loss=2.952, ArTop10Accuracy=0.7296, over 12311.00 frames. ], tot_loss[loss=2.886, ArTop10Accuracy=0.7353, over 10893.08 frames. ], batch size: 22, lr: 9.99e-03 2024-08-06 05:21:43,915 INFO [trainer.py:765] (1/8) Epoch 12, batch 600, train_loss[loss=2.877, ArTop10Accuracy=0.7314, over 11609.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.734, over 11424.95 frames. ], batch size: 18, lr: 9.96e-03 2024-08-06 05:22:32,205 INFO [trainer.py:765] (1/8) Epoch 12, batch 700, train_loss[loss=2.803, ArTop10Accuracy=0.7459, over 9975.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7333, over 11573.77 frames. ], batch size: 12, lr: 9.93e-03 2024-08-06 05:23:08,911 INFO [trainer.py:765] (1/8) Epoch 12, batch 800, train_loss[loss=2.761, ArTop10Accuracy=0.7474, over 10005.00 frames. ], tot_loss[loss=2.9, ArTop10Accuracy=0.732, over 11684.31 frames. ], batch size: 12, lr: 9.90e-03 2024-08-06 05:23:40,459 INFO [trainer.py:765] (1/8) Epoch 12, batch 900, train_loss[loss=2.892, ArTop10Accuracy=0.7327, over 13289.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7331, over 11737.04 frames. ], batch size: 27, lr: 9.87e-03 2024-08-06 05:23:54,575 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.067e+02 1.273e+02 1.376e+02 1.503e+02 4.050e+02, threshold=2.752e+02, percent-clipped=0.4 2024-08-06 05:24:14,344 INFO [trainer.py:765] (1/8) Epoch 12, batch 1000, train_loss[loss=2.82, ArTop10Accuracy=0.7526, over 12919.00 frames. ], tot_loss[loss=2.904, ArTop10Accuracy=0.7313, over 11937.90 frames. ], batch size: 27, lr: 9.84e-03 2024-08-06 05:24:45,501 INFO [trainer.py:765] (1/8) Epoch 12, batch 1100, train_loss[loss=2.954, ArTop10Accuracy=0.7211, over 13661.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7302, over 11999.56 frames. ], batch size: 34, lr: 9.81e-03 2024-08-06 05:25:15,881 INFO [trainer.py:765] (1/8) Epoch 12, batch 1200, train_loss[loss=3.106, ArTop10Accuracy=0.6906, over 11901.00 frames. ], tot_loss[loss=2.913, ArTop10Accuracy=0.7293, over 11945.42 frames. ], batch size: 97, lr: 9.78e-03 2024-08-06 05:25:41,043 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 05:26:46,787 INFO [trainer.py:765] (1/8) Epoch 13, batch 100, train_loss[loss=2.906, ArTop10Accuracy=0.731, over 14602.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7385, over 4787.21 frames. ], batch size: 61, lr: 9.36e-03 2024-08-06 05:27:32,553 INFO [trainer.py:765] (1/8) Epoch 13, batch 200, train_loss[loss=2.885, ArTop10Accuracy=0.7358, over 13786.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7384, over 7785.28 frames. 
], batch size: 34, lr: 9.34e-03 2024-08-06 05:28:16,036 INFO [trainer.py:765] (1/8) Epoch 13, batch 300, train_loss[loss=2.867, ArTop10Accuracy=0.7418, over 14393.00 frames. ], tot_loss[loss=2.869, ArTop10Accuracy=0.7391, over 9414.48 frames. ], batch size: 44, lr: 9.31e-03 2024-08-06 05:29:00,149 INFO [trainer.py:765] (1/8) Epoch 13, batch 400, train_loss[loss=2.824, ArTop10Accuracy=0.7474, over 10475.00 frames. ], tot_loss[loss=2.868, ArTop10Accuracy=0.7393, over 10313.15 frames. ], batch size: 14, lr: 9.28e-03 2024-08-06 05:29:43,967 INFO [trainer.py:765] (1/8) Epoch 13, batch 500, train_loss[loss=2.729, ArTop10Accuracy=0.7527, over 12343.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7396, over 10873.10 frames. ], batch size: 22, lr: 9.26e-03 2024-08-06 05:30:24,248 INFO [trainer.py:765] (1/8) Epoch 13, batch 600, train_loss[loss=2.75, ArTop10Accuracy=0.7631, over 11456.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7379, over 11414.87 frames. ], batch size: 18, lr: 9.23e-03 2024-08-06 05:30:58,110 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 05:31:07,054 INFO [trainer.py:811] (1/8) Epoch 13, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 2024-08-06 05:31:07,054 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 05:31:07,351 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.049e+02 1.283e+02 1.389e+02 1.496e+02 2.729e+02, threshold=2.779e+02, percent-clipped=0.0 2024-08-06 05:31:24,042 INFO [trainer.py:765] (1/8) Epoch 13, batch 700, train_loss[loss=2.841, ArTop10Accuracy=0.7457, over 10367.00 frames. ], tot_loss[loss=2.882, ArTop10Accuracy=0.7353, over 11552.22 frames. ], batch size: 12, lr: 9.20e-03 2024-08-06 05:32:00,146 INFO [trainer.py:765] (1/8) Epoch 13, batch 800, train_loss[loss=2.727, ArTop10Accuracy=0.7702, over 10091.00 frames. ], tot_loss[loss=2.885, ArTop10Accuracy=0.7351, over 11685.22 frames. ], batch size: 12, lr: 9.18e-03 2024-08-06 05:32:31,520 INFO [trainer.py:765] (1/8) Epoch 13, batch 900, train_loss[loss=2.896, ArTop10Accuracy=0.7348, over 12939.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7363, over 11743.46 frames. ], batch size: 27, lr: 9.15e-03 2024-08-06 05:33:03,042 INFO [trainer.py:765] (1/8) Epoch 13, batch 1000, train_loss[loss=2.882, ArTop10Accuracy=0.7395, over 13052.00 frames. ], tot_loss[loss=2.888, ArTop10Accuracy=0.7343, over 11933.66 frames. ], batch size: 27, lr: 9.13e-03 2024-08-06 05:33:34,231 INFO [trainer.py:765] (1/8) Epoch 13, batch 1100, train_loss[loss=2.868, ArTop10Accuracy=0.7404, over 13751.00 frames. ], tot_loss[loss=2.895, ArTop10Accuracy=0.7331, over 12013.64 frames. ], batch size: 34, lr: 9.10e-03 2024-08-06 05:34:04,518 INFO [trainer.py:765] (1/8) Epoch 13, batch 1200, train_loss[loss=3.042, ArTop10Accuracy=0.7104, over 12299.00 frames. ], tot_loss[loss=2.897, ArTop10Accuracy=0.7327, over 11945.18 frames. ], batch size: 98, lr: 9.07e-03 2024-08-06 05:34:30,235 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 05:35:39,198 INFO [trainer.py:765] (1/8) Epoch 14, batch 100, train_loss[loss=2.937, ArTop10Accuracy=0.7268, over 14650.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7388, over 4773.88 frames. ], batch size: 61, lr: 8.71e-03 2024-08-06 05:36:23,063 INFO [trainer.py:765] (1/8) Epoch 14, batch 200, train_loss[loss=2.889, ArTop10Accuracy=0.7419, over 13718.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7399, over 7789.92 frames. 
], batch size: 34, lr: 8.68e-03 2024-08-06 05:37:09,309 INFO [trainer.py:765] (1/8) Epoch 14, batch 300, train_loss[loss=2.92, ArTop10Accuracy=0.7316, over 14309.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.741, over 9421.38 frames. ], batch size: 44, lr: 8.66e-03 2024-08-06 05:37:46,029 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.097e+02 1.304e+02 1.410e+02 1.531e+02 2.912e+02, threshold=2.820e+02, percent-clipped=0.2 2024-08-06 05:37:55,139 INFO [trainer.py:765] (1/8) Epoch 14, batch 400, train_loss[loss=2.894, ArTop10Accuracy=0.7352, over 10361.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7409, over 10335.77 frames. ], batch size: 14, lr: 8.64e-03 2024-08-06 05:38:42,025 INFO [trainer.py:765] (1/8) Epoch 14, batch 500, train_loss[loss=2.838, ArTop10Accuracy=0.7473, over 12248.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7416, over 10916.19 frames. ], batch size: 22, lr: 8.61e-03 2024-08-06 05:39:22,375 INFO [trainer.py:765] (1/8) Epoch 14, batch 600, train_loss[loss=2.725, ArTop10Accuracy=0.7621, over 11564.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7405, over 11425.60 frames. ], batch size: 18, lr: 8.59e-03 2024-08-06 05:40:15,143 INFO [trainer.py:765] (1/8) Epoch 14, batch 700, train_loss[loss=2.855, ArTop10Accuracy=0.7373, over 10051.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7376, over 11567.50 frames. ], batch size: 12, lr: 8.57e-03 2024-08-06 05:40:49,136 INFO [trainer.py:765] (1/8) Epoch 14, batch 800, train_loss[loss=2.86, ArTop10Accuracy=0.7426, over 10108.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7369, over 11685.36 frames. ], batch size: 12, lr: 8.55e-03 2024-08-06 05:41:20,467 INFO [trainer.py:765] (1/8) Epoch 14, batch 900, train_loss[loss=2.889, ArTop10Accuracy=0.7261, over 13003.00 frames. ], tot_loss[loss=2.87, ArTop10Accuracy=0.7382, over 11732.11 frames. ], batch size: 27, lr: 8.52e-03 2024-08-06 05:41:51,996 INFO [trainer.py:765] (1/8) Epoch 14, batch 1000, train_loss[loss=2.853, ArTop10Accuracy=0.739, over 12932.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7365, over 11926.03 frames. ], batch size: 27, lr: 8.50e-03 2024-08-06 05:42:23,220 INFO [trainer.py:765] (1/8) Epoch 14, batch 1100, train_loss[loss=2.838, ArTop10Accuracy=0.7475, over 13897.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7366, over 11991.45 frames. ], batch size: 34, lr: 8.48e-03 2024-08-06 05:42:53,549 INFO [trainer.py:765] (1/8) Epoch 14, batch 1200, train_loss[loss=3.052, ArTop10Accuracy=0.7097, over 12008.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7363, over 11941.69 frames. ], batch size: 98, lr: 8.46e-03 2024-08-06 05:43:18,869 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 05:44:28,571 INFO [trainer.py:765] (1/8) Epoch 15, batch 100, train_loss[loss=2.988, ArTop10Accuracy=0.7149, over 14408.00 frames. ], tot_loss[loss=2.851, ArTop10Accuracy=0.7418, over 4787.97 frames. ], batch size: 61, lr: 8.14e-03 2024-08-06 05:44:29,213 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 05:44:38,024 INFO [trainer.py:811] (1/8) Epoch 15, validation: loss=2.913, ArTop10Accuracy=0.7339, over 1829298.00 frames. 
2024-08-06 05:44:38,024 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 05:44:38,413 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.307e+02 1.417e+02 1.528e+02 2.981e+02, threshold=2.833e+02, percent-clipped=0.1 2024-08-06 05:45:20,185 INFO [trainer.py:765] (1/8) Epoch 15, batch 200, train_loss[loss=2.832, ArTop10Accuracy=0.7372, over 13619.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7438, over 7792.95 frames. ], batch size: 34, lr: 8.11e-03 2024-08-06 05:46:04,647 INFO [trainer.py:765] (1/8) Epoch 15, batch 300, train_loss[loss=2.883, ArTop10Accuracy=0.7322, over 14337.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7434, over 9432.06 frames. ], batch size: 44, lr: 8.09e-03 2024-08-06 05:46:51,902 INFO [trainer.py:765] (1/8) Epoch 15, batch 400, train_loss[loss=2.75, ArTop10Accuracy=0.7746, over 10929.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.743, over 10339.39 frames. ], batch size: 15, lr: 8.07e-03 2024-08-06 05:47:36,911 INFO [trainer.py:765] (1/8) Epoch 15, batch 500, train_loss[loss=2.897, ArTop10Accuracy=0.7295, over 12244.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7439, over 10898.15 frames. ], batch size: 22, lr: 8.05e-03 2024-08-06 05:48:24,723 INFO [trainer.py:765] (1/8) Epoch 15, batch 600, train_loss[loss=2.708, ArTop10Accuracy=0.7705, over 11604.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7425, over 11441.84 frames. ], batch size: 18, lr: 8.03e-03 2024-08-06 05:49:11,855 INFO [trainer.py:765] (1/8) Epoch 15, batch 700, train_loss[loss=2.931, ArTop10Accuracy=0.7207, over 9923.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7409, over 11584.21 frames. ], batch size: 12, lr: 8.01e-03 2024-08-06 05:49:45,778 INFO [trainer.py:765] (1/8) Epoch 15, batch 800, train_loss[loss=2.799, ArTop10Accuracy=0.751, over 9435.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7392, over 11680.50 frames. ], batch size: 11, lr: 7.99e-03 2024-08-06 05:50:17,210 INFO [trainer.py:765] (1/8) Epoch 15, batch 900, train_loss[loss=2.885, ArTop10Accuracy=0.7425, over 13101.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.7409, over 11741.27 frames. ], batch size: 27, lr: 7.97e-03 2024-08-06 05:50:48,829 INFO [trainer.py:765] (1/8) Epoch 15, batch 1000, train_loss[loss=2.879, ArTop10Accuracy=0.7285, over 12940.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7401, over 11939.29 frames. ], batch size: 27, lr: 7.95e-03 2024-08-06 05:51:20,069 INFO [trainer.py:765] (1/8) Epoch 15, batch 1100, train_loss[loss=2.92, ArTop10Accuracy=0.7312, over 13428.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7378, over 12005.19 frames. ], batch size: 34, lr: 7.93e-03 2024-08-06 05:51:23,515 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.123e+02 1.337e+02 1.431e+02 1.541e+02 2.784e+02, threshold=2.862e+02, percent-clipped=0.0 2024-08-06 05:51:53,082 INFO [trainer.py:765] (1/8) Epoch 15, batch 1200, train_loss[loss=2.979, ArTop10Accuracy=0.7201, over 12276.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7373, over 11940.75 frames. ], batch size: 97, lr: 7.91e-03 2024-08-06 05:52:18,078 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 05:53:29,262 INFO [trainer.py:765] (1/8) Epoch 16, batch 100, train_loss[loss=2.95, ArTop10Accuracy=0.7261, over 14794.00 frames. ], tot_loss[loss=2.839, ArTop10Accuracy=0.7458, over 4788.63 frames. 
], batch size: 61, lr: 7.63e-03 2024-08-06 05:54:12,876 INFO [trainer.py:765] (1/8) Epoch 16, batch 200, train_loss[loss=2.867, ArTop10Accuracy=0.7372, over 13845.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7468, over 7795.41 frames. ], batch size: 34, lr: 7.61e-03 2024-08-06 05:54:59,736 INFO [trainer.py:765] (1/8) Epoch 16, batch 300, train_loss[loss=2.835, ArTop10Accuracy=0.7487, over 14196.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7466, over 9437.07 frames. ], batch size: 44, lr: 7.59e-03 2024-08-06 05:55:41,930 INFO [trainer.py:765] (1/8) Epoch 16, batch 400, train_loss[loss=2.825, ArTop10Accuracy=0.7417, over 10353.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7463, over 10343.85 frames. ], batch size: 14, lr: 7.58e-03 2024-08-06 05:56:27,679 INFO [trainer.py:765] (1/8) Epoch 16, batch 500, train_loss[loss=2.891, ArTop10Accuracy=0.7377, over 12075.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.7454, over 10902.38 frames. ], batch size: 22, lr: 7.56e-03 2024-08-06 05:57:12,439 INFO [trainer.py:765] (1/8) Epoch 16, batch 600, train_loss[loss=2.745, ArTop10Accuracy=0.7592, over 11620.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7439, over 11438.59 frames. ], batch size: 18, lr: 7.54e-03 2024-08-06 05:58:00,039 INFO [trainer.py:765] (1/8) Epoch 16, batch 700, train_loss[loss=2.872, ArTop10Accuracy=0.7347, over 10090.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7427, over 11567.21 frames. ], batch size: 12, lr: 7.52e-03 2024-08-06 05:58:34,023 INFO [trainer.py:765] (1/8) Epoch 16, batch 800, train_loss[loss=2.702, ArTop10Accuracy=0.7621, over 9941.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7416, over 11677.75 frames. ], batch size: 12, lr: 7.50e-03 2024-08-06 05:58:41,568 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 05:58:50,426 INFO [trainer.py:811] (1/8) Epoch 16, validation: loss=2.915, ArTop10Accuracy=0.7338, over 1829298.00 frames. 2024-08-06 05:58:50,427 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 05:58:50,730 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.121e+02 1.335e+02 1.445e+02 1.570e+02 3.252e+02, threshold=2.890e+02, percent-clipped=0.1 2024-08-06 05:59:14,320 INFO [trainer.py:765] (1/8) Epoch 16, batch 900, train_loss[loss=2.841, ArTop10Accuracy=0.7391, over 12885.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7426, over 11730.07 frames. ], batch size: 27, lr: 7.49e-03 2024-08-06 05:59:45,915 INFO [trainer.py:765] (1/8) Epoch 16, batch 1000, train_loss[loss=2.772, ArTop10Accuracy=0.7538, over 12877.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7417, over 11925.18 frames. ], batch size: 27, lr: 7.47e-03 2024-08-06 06:00:17,091 INFO [trainer.py:765] (1/8) Epoch 16, batch 1100, train_loss[loss=2.963, ArTop10Accuracy=0.7273, over 13734.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7394, over 11985.18 frames. ], batch size: 34, lr: 7.45e-03 2024-08-06 06:00:47,464 INFO [trainer.py:765] (1/8) Epoch 16, batch 1200, train_loss[loss=2.979, ArTop10Accuracy=0.7205, over 11777.00 frames. ], tot_loss[loss=2.861, ArTop10Accuracy=0.7397, over 11930.28 frames. ], batch size: 97, lr: 7.43e-03 2024-08-06 06:01:12,268 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 06:02:27,261 INFO [trainer.py:765] (1/8) Epoch 17, batch 100, train_loss[loss=2.891, ArTop10Accuracy=0.7388, over 14342.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7468, over 4773.07 frames. 
], batch size: 61, lr: 7.18e-03 2024-08-06 06:03:11,850 INFO [trainer.py:765] (1/8) Epoch 17, batch 200, train_loss[loss=2.898, ArTop10Accuracy=0.7334, over 13627.00 frames. ], tot_loss[loss=2.823, ArTop10Accuracy=0.7479, over 7781.00 frames. ], batch size: 34, lr: 7.17e-03 2024-08-06 06:03:57,502 INFO [trainer.py:765] (1/8) Epoch 17, batch 300, train_loss[loss=2.856, ArTop10Accuracy=0.7417, over 14336.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7477, over 9420.34 frames. ], batch size: 44, lr: 7.15e-03 2024-08-06 06:04:42,838 INFO [trainer.py:765] (1/8) Epoch 17, batch 400, train_loss[loss=2.741, ArTop10Accuracy=0.7622, over 10507.00 frames. ], tot_loss[loss=2.823, ArTop10Accuracy=0.7477, over 10322.09 frames. ], batch size: 14, lr: 7.13e-03 2024-08-06 06:05:29,004 INFO [trainer.py:765] (1/8) Epoch 17, batch 500, train_loss[loss=2.815, ArTop10Accuracy=0.7536, over 12257.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7488, over 10888.13 frames. ], batch size: 22, lr: 7.12e-03 2024-08-06 06:05:49,551 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.142e+02 1.359e+02 1.445e+02 1.551e+02 2.741e+02, threshold=2.891e+02, percent-clipped=0.0 2024-08-06 06:06:20,723 INFO [trainer.py:765] (1/8) Epoch 17, batch 600, train_loss[loss=2.835, ArTop10Accuracy=0.7372, over 11792.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.747, over 11409.04 frames. ], batch size: 18, lr: 7.10e-03 2024-08-06 06:07:04,695 INFO [trainer.py:765] (1/8) Epoch 17, batch 700, train_loss[loss=2.82, ArTop10Accuracy=0.7605, over 10070.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7457, over 11578.57 frames. ], batch size: 12, lr: 7.09e-03 2024-08-06 06:07:44,896 INFO [trainer.py:765] (1/8) Epoch 17, batch 800, train_loss[loss=2.747, ArTop10Accuracy=0.761, over 9348.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7441, over 11680.49 frames. ], batch size: 11, lr: 7.07e-03 2024-08-06 06:08:16,384 INFO [trainer.py:765] (1/8) Epoch 17, batch 900, train_loss[loss=2.798, ArTop10Accuracy=0.7556, over 12914.00 frames. ], tot_loss[loss=2.831, ArTop10Accuracy=0.7459, over 11743.50 frames. ], batch size: 27, lr: 7.05e-03 2024-08-06 06:08:47,995 INFO [trainer.py:765] (1/8) Epoch 17, batch 1000, train_loss[loss=2.76, ArTop10Accuracy=0.76, over 13013.00 frames. ], tot_loss[loss=2.836, ArTop10Accuracy=0.7452, over 11941.09 frames. ], batch size: 27, lr: 7.04e-03 2024-08-06 06:09:19,134 INFO [trainer.py:765] (1/8) Epoch 17, batch 1100, train_loss[loss=2.882, ArTop10Accuracy=0.7352, over 13801.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7428, over 12001.59 frames. ], batch size: 34, lr: 7.02e-03 2024-08-06 06:09:49,445 INFO [trainer.py:765] (1/8) Epoch 17, batch 1200, train_loss[loss=2.963, ArTop10Accuracy=0.7218, over 12260.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7427, over 11943.38 frames. ], batch size: 97, lr: 7.01e-03 2024-08-06 06:10:15,027 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 06:11:23,102 INFO [trainer.py:765] (1/8) Epoch 18, batch 100, train_loss[loss=2.965, ArTop10Accuracy=0.7235, over 14660.00 frames. ], tot_loss[loss=2.812, ArTop10Accuracy=0.7508, over 4790.75 frames. ], batch size: 61, lr: 6.78e-03 2024-08-06 06:12:16,260 INFO [trainer.py:765] (1/8) Epoch 18, batch 200, train_loss[loss=2.711, ArTop10Accuracy=0.7669, over 13790.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7517, over 7796.93 frames. 
], batch size: 34, lr: 6.77e-03 2024-08-06 06:12:40,317 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 06:12:48,991 INFO [trainer.py:811] (1/8) Epoch 18, validation: loss=2.916, ArTop10Accuracy=0.7343, over 1829298.00 frames. 2024-08-06 06:12:48,992 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 06:12:49,335 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.163e+02 1.377e+02 1.476e+02 1.588e+02 2.450e+02, threshold=2.952e+02, percent-clipped=0.0 2024-08-06 06:13:07,115 INFO [trainer.py:765] (1/8) Epoch 18, batch 300, train_loss[loss=2.858, ArTop10Accuracy=0.7411, over 14602.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7514, over 9444.02 frames. ], batch size: 44, lr: 6.75e-03 2024-08-06 06:13:54,097 INFO [trainer.py:765] (1/8) Epoch 18, batch 400, train_loss[loss=2.723, ArTop10Accuracy=0.7649, over 10493.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7511, over 10353.84 frames. ], batch size: 14, lr: 6.74e-03 2024-08-06 06:14:38,487 INFO [trainer.py:765] (1/8) Epoch 18, batch 500, train_loss[loss=2.816, ArTop10Accuracy=0.7537, over 12205.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7506, over 10904.51 frames. ], batch size: 22, lr: 6.73e-03 2024-08-06 06:15:23,627 INFO [trainer.py:765] (1/8) Epoch 18, batch 600, train_loss[loss=2.738, ArTop10Accuracy=0.7657, over 11528.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.7495, over 11416.40 frames. ], batch size: 18, lr: 6.71e-03 2024-08-06 06:16:17,342 INFO [trainer.py:765] (1/8) Epoch 18, batch 700, train_loss[loss=2.747, ArTop10Accuracy=0.7664, over 10033.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.747, over 11580.43 frames. ], batch size: 12, lr: 6.70e-03 2024-08-06 06:16:51,427 INFO [trainer.py:765] (1/8) Epoch 18, batch 800, train_loss[loss=2.85, ArTop10Accuracy=0.7552, over 9297.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7458, over 11673.28 frames. ], batch size: 11, lr: 6.68e-03 2024-08-06 06:17:22,912 INFO [trainer.py:765] (1/8) Epoch 18, batch 900, train_loss[loss=2.729, ArTop10Accuracy=0.7666, over 12886.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.7482, over 11720.71 frames. ], batch size: 27, lr: 6.67e-03 2024-08-06 06:17:54,528 INFO [trainer.py:765] (1/8) Epoch 18, batch 1000, train_loss[loss=2.824, ArTop10Accuracy=0.7476, over 12904.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7477, over 11932.81 frames. ], batch size: 27, lr: 6.65e-03 2024-08-06 06:18:25,662 INFO [trainer.py:765] (1/8) Epoch 18, batch 1100, train_loss[loss=2.767, ArTop10Accuracy=0.7569, over 13569.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.746, over 11989.03 frames. ], batch size: 34, lr: 6.64e-03 2024-08-06 06:18:55,971 INFO [trainer.py:765] (1/8) Epoch 18, batch 1200, train_loss[loss=2.946, ArTop10Accuracy=0.7211, over 11982.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7452, over 11972.69 frames. ], batch size: 97, lr: 6.63e-03 2024-08-06 06:19:19,163 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.178e+02 1.387e+02 1.492e+02 1.607e+02 2.982e+02, threshold=2.983e+02, percent-clipped=0.1 2024-08-06 06:19:23,696 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 06:20:29,728 INFO [trainer.py:765] (1/8) Epoch 19, batch 100, train_loss[loss=2.858, ArTop10Accuracy=0.7397, over 14846.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7512, over 4786.28 frames. 
], batch size: 61, lr: 6.43e-03 2024-08-06 06:21:11,274 INFO [trainer.py:765] (1/8) Epoch 19, batch 200, train_loss[loss=2.735, ArTop10Accuracy=0.7636, over 13973.00 frames. ], tot_loss[loss=2.796, ArTop10Accuracy=0.7534, over 7791.32 frames. ], batch size: 35, lr: 6.41e-03 2024-08-06 06:21:56,078 INFO [trainer.py:765] (1/8) Epoch 19, batch 300, train_loss[loss=2.783, ArTop10Accuracy=0.7603, over 14131.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.752, over 9428.20 frames. ], batch size: 44, lr: 6.40e-03 2024-08-06 06:22:36,013 INFO [trainer.py:765] (1/8) Epoch 19, batch 400, train_loss[loss=2.814, ArTop10Accuracy=0.7476, over 10290.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7524, over 10336.07 frames. ], batch size: 14, lr: 6.39e-03 2024-08-06 06:23:18,997 INFO [trainer.py:765] (1/8) Epoch 19, batch 500, train_loss[loss=2.787, ArTop10Accuracy=0.7602, over 12204.00 frames. ], tot_loss[loss=2.796, ArTop10Accuracy=0.7527, over 10904.92 frames. ], batch size: 22, lr: 6.37e-03 2024-08-06 06:24:03,685 INFO [trainer.py:765] (1/8) Epoch 19, batch 600, train_loss[loss=2.633, ArTop10Accuracy=0.7812, over 11618.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7514, over 11425.50 frames. ], batch size: 18, lr: 6.36e-03 2024-08-06 06:24:46,185 INFO [trainer.py:765] (1/8) Epoch 19, batch 700, train_loss[loss=2.762, ArTop10Accuracy=0.756, over 9400.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7503, over 11575.14 frames. ], batch size: 11, lr: 6.35e-03 2024-08-06 06:25:22,355 INFO [trainer.py:765] (1/8) Epoch 19, batch 800, train_loss[loss=2.787, ArTop10Accuracy=0.7534, over 10182.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7486, over 11704.71 frames. ], batch size: 12, lr: 6.33e-03 2024-08-06 06:25:53,624 INFO [trainer.py:765] (1/8) Epoch 19, batch 900, train_loss[loss=2.798, ArTop10Accuracy=0.7465, over 12857.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.749, over 11749.94 frames. ], batch size: 27, lr: 6.32e-03 2024-08-06 06:26:21,772 INFO [trainer.py:803] (1/8) Computing validation loss 2024-08-06 06:26:30,765 INFO [trainer.py:811] (1/8) Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 2024-08-06 06:26:30,766 INFO [trainer.py:814] (1/8) Maximum memory allocated so far is 30932MB 2024-08-06 06:26:31,053 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.198e+02 1.416e+02 1.525e+02 1.662e+02 2.849e+02, threshold=3.050e+02, percent-clipped=0.0 2024-08-06 06:26:34,030 INFO [trainer.py:765] (1/8) Epoch 19, batch 1000, train_loss[loss=2.831, ArTop10Accuracy=0.7502, over 12917.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7485, over 11946.58 frames. ], batch size: 27, lr: 6.31e-03 2024-08-06 06:27:05,190 INFO [trainer.py:765] (1/8) Epoch 19, batch 1100, train_loss[loss=2.865, ArTop10Accuracy=0.7391, over 13739.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7469, over 12006.14 frames. ], batch size: 34, lr: 6.30e-03 2024-08-06 06:27:35,454 INFO [trainer.py:765] (1/8) Epoch 19, batch 1200, train_loss[loss=2.875, ArTop10Accuracy=0.7384, over 12216.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7467, over 11964.15 frames. ], batch size: 98, lr: 6.28e-03 2024-08-06 06:28:00,649 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 06:29:08,985 INFO [trainer.py:765] (1/8) Epoch 20, batch 100, train_loss[loss=2.812, ArTop10Accuracy=0.7492, over 14594.00 frames. ], tot_loss[loss=2.794, ArTop10Accuracy=0.7537, over 4778.56 frames. 
], batch size: 61, lr: 6.10e-03 2024-08-06 06:29:50,318 INFO [trainer.py:765] (1/8) Epoch 20, batch 200, train_loss[loss=2.711, ArTop10Accuracy=0.761, over 13932.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.7537, over 7787.63 frames. ], batch size: 34, lr: 6.09e-03 2024-08-06 06:30:37,106 INFO [trainer.py:765] (1/8) Epoch 20, batch 300, train_loss[loss=2.765, ArTop10Accuracy=0.7561, over 14305.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7542, over 9419.91 frames. ], batch size: 44, lr: 6.08e-03 2024-08-06 06:31:16,354 INFO [trainer.py:765] (1/8) Epoch 20, batch 400, train_loss[loss=2.805, ArTop10Accuracy=0.7542, over 10359.00 frames. ], tot_loss[loss=2.789, ArTop10Accuracy=0.7546, over 10335.06 frames. ], batch size: 14, lr: 6.07e-03 2024-08-06 06:32:03,759 INFO [trainer.py:765] (1/8) Epoch 20, batch 500, train_loss[loss=2.784, ArTop10Accuracy=0.7469, over 12345.00 frames. ], tot_loss[loss=2.786, ArTop10Accuracy=0.7548, over 10906.17 frames. ], batch size: 22, lr: 6.05e-03 2024-08-06 06:32:43,357 INFO [trainer.py:765] (1/8) Epoch 20, batch 600, train_loss[loss=2.663, ArTop10Accuracy=0.7778, over 11678.00 frames. ], tot_loss[loss=2.791, ArTop10Accuracy=0.7537, over 11425.12 frames. ], batch size: 18, lr: 6.04e-03 2024-08-06 06:33:36,752 INFO [trainer.py:765] (1/8) Epoch 20, batch 700, train_loss[loss=2.749, ArTop10Accuracy=0.7658, over 10069.00 frames. ], tot_loss[loss=2.8, ArTop10Accuracy=0.752, over 11566.64 frames. ], batch size: 12, lr: 6.03e-03 2024-08-06 06:33:43,829 INFO [optim.py:386] (1/8) Clipping_scale=2.0, grad-norm quartiles 1.196e+02 1.417e+02 1.526e+02 1.639e+02 3.791e+02, threshold=3.052e+02, percent-clipped=0.1 2024-08-06 06:34:13,304 INFO [trainer.py:765] (1/8) Epoch 20, batch 800, train_loss[loss=2.735, ArTop10Accuracy=0.765, over 9984.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7511, over 11687.19 frames. ], batch size: 12, lr: 6.02e-03 2024-08-06 06:34:44,580 INFO [trainer.py:765] (1/8) Epoch 20, batch 900, train_loss[loss=2.91, ArTop10Accuracy=0.7331, over 12961.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7511, over 11731.76 frames. ], batch size: 27, lr: 6.01e-03 2024-08-06 06:35:16,139 INFO [trainer.py:765] (1/8) Epoch 20, batch 1000, train_loss[loss=2.738, ArTop10Accuracy=0.7649, over 13143.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7502, over 11926.52 frames. ], batch size: 27, lr: 6.00e-03 2024-08-06 06:35:47,214 INFO [trainer.py:765] (1/8) Epoch 20, batch 1100, train_loss[loss=2.702, ArTop10Accuracy=0.7708, over 13670.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7488, over 11972.22 frames. ], batch size: 34, lr: 5.99e-03 2024-08-06 06:36:17,439 INFO [trainer.py:765] (1/8) Epoch 20, batch 1200, train_loss[loss=2.978, ArTop10Accuracy=0.7192, over 12645.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7482, over 11919.35 frames. ], batch size: 98, lr: 5.97e-03 2024-08-06 06:36:42,651 INFO [trainer.py:650] (1/8) Reaches end of dataloader. 2024-08-06 06:36:42,654 INFO [trainer.py:1069] (1/8) Done!
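
Editor's note: the run above ends after epoch 20 ("Done!"), with tot_loss around 2.82 and ArTop10Accuracy around 0.748 at the final batches. The short stand-alone sketch below is not part of the icefall recipe; it only shows one way to parse the per-batch records back out of a saved copy of this log (the file name "train-log.txt" is a placeholder) so the loss, accuracy, and learning-rate columns can be plotted. It also includes a rough, hedged reimplementation of an Eden-style learning-rate decay for comparison against the logged lr values; base_lr=0.03, lr_batches=5000, and lr_epochs=4 are assumptions chosen to sit near the logged numbers, not values read from this section.

import re

# Per-batch records logged by trainer.py:765 look like:
#   Epoch 20, batch 1200, train_loss[loss=2.978, ArTop10Accuracy=0.7192, over 12645.00 frames. ],
#   tot_loss[loss=2.82, ArTop10Accuracy=0.7482, over 11919.35 frames. ], batch size: 98, lr: 5.97e-03
# re.DOTALL lets a single record span a wrapped line boundary in the saved log.
RECORD = re.compile(
    r"Epoch (\d+), batch (\d+), "
    r"train_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+).*?\], "
    r"tot_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+).*?\].*?"
    r"lr: ([\d.eE+-]+)",
    re.DOTALL,
)

def parse_log(path="train-log.txt"):
    """Return one dict per 'Epoch N, batch M' record found in the log file."""
    with open(path) as f:
        text = f.read()
    records = []
    for m in RECORD.finditer(text):
        records.append({
            "epoch": int(m.group(1)),
            "batch": int(m.group(2)),
            "train_loss": float(m.group(3)),
            "train_top10": float(m.group(4)),
            "tot_loss": float(m.group(5)),
            "tot_top10": float(m.group(6)),
            "lr": float(m.group(7)),
        })
    return records

def eden_lr(base_lr, step, epoch, lr_batches=5000.0, lr_epochs=4.0):
    """Rough sketch of an Eden-style decay (warmup omitted); all constants here are assumptions."""
    return (
        base_lr
        * ((step**2 + lr_batches**2) / lr_batches**2) ** -0.25
        * ((epoch**2 + lr_epochs**2) / lr_epochs**2) ** -0.25
    )

if __name__ == "__main__":
    # Print the last few parsed records, then compare the assumed schedule
    # against the lr column late in training (the log shows ~6e-03 in epoch 20).
    for r in parse_log()[-3:]:
        print(r)
    print(eden_lr(base_lr=0.03, step=24000, epoch=20))

The parsing half relies only on the record format visible in this log; the eden_lr helper is a hypothetical approximation of the scheduler's decay shape and should be checked against the recipe's optim.py before being used for anything beyond a sanity plot.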