2024-08-06 03:39:40,356 INFO [trainer.py:870] (0/8) Training started
2024-08-06 03:39:40,361 INFO [trainer.py:889] (0/8) Device: cuda:0
2024-08-06 03:39:40,361 INFO [trainer.py:890] (0/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': 'main', 'icefall-git-sha1': '7d2e5f4-dirty', 'icefall-git-date': 'Tue Aug 6 02:59:12 2024', 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6865771', 'IP address': '0.104.195.107'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 1000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 03:39:40,361 INFO [trainer.py:892] (0/8) About to create model
2024-08-06 03:39:41,123 INFO [trainer.py:899] (0/8) Number of model parameters: 367386628
2024-08-06 03:39:41,894 INFO [trainer.py:914] (0/8) Using DDP
2024-08-06 03:39:44,001 INFO [datamodule.py:427] (0/8) About to get train cuts
2024-08-06 03:39:44,003 INFO [datamodule.py:434] (0/8) About to get dev cuts
2024-08-06 03:39:44,004 INFO [datamodule.py:292] (0/8) Disable SpecAugment
2024-08-06 03:39:44,004 INFO [datamodule.py:294] (0/8) About to create train dataset
2024-08-06 03:39:44,004 INFO [datamodule.py:323] (0/8) Using DynamicBucketingSampler
2024-08-06 03:39:44,636 INFO [datamodule.py:344] (0/8) About to create train dataloader
2024-08-06 03:39:44,637 INFO [datamodule.py:367] (0/8) About to create dev dataset
2024-08-06 03:39:44,977 INFO [datamodule.py:388] (0/8) About to create dev dataloader
2024-08-06 03:40:39,570 INFO [trainer.py:765] (0/8) Epoch 1, batch 100, train_loss[loss=4.115, ArTop10Accuracy=0.5062, over 14477.00 frames. ], tot_loss[loss=4.774, ArTop10Accuracy=0.3971, over 4773.61 frames. 
], batch size: 61, lr: 2.25e-02 2024-08-06 03:41:16,921 INFO [trainer.py:765] (0/8) Epoch 1, batch 200, train_loss[loss=3.944, ArTop10Accuracy=0.5332, over 13991.00 frames. ], tot_loss[loss=4.286, ArTop10Accuracy=0.4784, over 7781.74 frames. ], batch size: 34, lr: 3.00e-02 2024-08-06 03:41:57,950 INFO [trainer.py:765] (0/8) Epoch 1, batch 300, train_loss[loss=3.794, ArTop10Accuracy=0.5671, over 14114.00 frames. ], tot_loss[loss=4.077, ArTop10Accuracy=0.5126, over 9422.32 frames. ], batch size: 44, lr: 3.00e-02 2024-08-06 03:42:33,079 INFO [trainer.py:765] (0/8) Epoch 1, batch 400, train_loss[loss=3.666, ArTop10Accuracy=0.5856, over 10228.00 frames. ], tot_loss[loss=3.938, ArTop10Accuracy=0.5357, over 10323.30 frames. ], batch size: 14, lr: 3.00e-02 2024-08-06 03:43:11,270 INFO [trainer.py:765] (0/8) Epoch 1, batch 500, train_loss[loss=3.655, ArTop10Accuracy=0.5704, over 12481.00 frames. ], tot_loss[loss=3.83, ArTop10Accuracy=0.5534, over 10894.29 frames. ], batch size: 22, lr: 2.99e-02 2024-08-06 03:43:46,592 INFO [trainer.py:765] (0/8) Epoch 1, batch 600, train_loss[loss=3.507, ArTop10Accuracy=0.6035, over 11709.00 frames. ], tot_loss[loss=3.741, ArTop10Accuracy=0.5685, over 11416.88 frames. ], batch size: 18, lr: 2.99e-02 2024-08-06 03:44:27,900 INFO [trainer.py:765] (0/8) Epoch 1, batch 700, train_loss[loss=3.642, ArTop10Accuracy=0.5851, over 10365.00 frames. ], tot_loss[loss=3.687, ArTop10Accuracy=0.5778, over 11552.26 frames. ], batch size: 12, lr: 2.99e-02 2024-08-06 03:45:01,514 INFO [trainer.py:765] (0/8) Epoch 1, batch 800, train_loss[loss=3.56, ArTop10Accuracy=0.5988, over 10184.00 frames. ], tot_loss[loss=3.636, ArTop10Accuracy=0.5871, over 11671.99 frames. ], batch size: 12, lr: 2.98e-02 2024-08-06 03:45:32,557 INFO [trainer.py:765] (0/8) Epoch 1, batch 900, train_loss[loss=3.628, ArTop10Accuracy=0.5887, over 13127.00 frames. ], tot_loss[loss=3.588, ArTop10Accuracy=0.5963, over 11724.75 frames. ], batch size: 27, lr: 2.98e-02 2024-08-06 03:46:03,649 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-1000.pt 2024-08-06 03:46:07,682 INFO [trainer.py:765] (0/8) Epoch 1, batch 1000, train_loss[loss=3.561, ArTop10Accuracy=0.6028, over 13088.00 frames. ], tot_loss[loss=3.555, ArTop10Accuracy=0.6025, over 11917.89 frames. ], batch size: 27, lr: 2.97e-02 2024-08-06 03:46:07,988 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 8.169e+01 1.565e+02 2.239e+02 3.485e+02 9.105e+03, threshold=4.478e+02, percent-clipped=0.0 2024-08-06 03:46:38,612 INFO [trainer.py:765] (0/8) Epoch 1, batch 1100, train_loss[loss=3.407, ArTop10Accuracy=0.628, over 13767.00 frames. ], tot_loss[loss=3.528, ArTop10Accuracy=0.6074, over 11986.86 frames. ], batch size: 34, lr: 2.96e-02 2024-08-06 03:47:08,745 INFO [trainer.py:765] (0/8) Epoch 1, batch 1200, train_loss[loss=3.476, ArTop10Accuracy=0.6176, over 12273.00 frames. ], tot_loss[loss=3.508, ArTop10Accuracy=0.6112, over 11918.03 frames. ], batch size: 98, lr: 2.96e-02 2024-08-06 03:47:33,694 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 03:47:33,697 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-1.pt 2024-08-06 03:48:38,675 INFO [trainer.py:765] (0/8) Epoch 2, batch 100, train_loss[loss=3.512, ArTop10Accuracy=0.6097, over 14549.00 frames. ], tot_loss[loss=3.449, ArTop10Accuracy=0.6227, over 4791.40 frames. ], batch size: 61, lr: 2.90e-02 2024-08-06 03:49:14,596 INFO [trainer.py:765] (0/8) Epoch 2, batch 200, train_loss[loss=3.348, ArTop10Accuracy=0.6383, over 13985.00 frames. 
], tot_loss[loss=3.428, ArTop10Accuracy=0.6269, over 7797.72 frames. ], batch size: 34, lr: 2.89e-02 2024-08-06 03:49:56,519 INFO [trainer.py:765] (0/8) Epoch 2, batch 300, train_loss[loss=3.485, ArTop10Accuracy=0.6182, over 14325.00 frames. ], tot_loss[loss=3.418, ArTop10Accuracy=0.6287, over 9426.43 frames. ], batch size: 44, lr: 2.89e-02 2024-08-06 03:50:32,000 INFO [trainer.py:765] (0/8) Epoch 2, batch 400, train_loss[loss=3.38, ArTop10Accuracy=0.6332, over 10121.00 frames. ], tot_loss[loss=3.408, ArTop10Accuracy=0.6308, over 10355.26 frames. ], batch size: 14, lr: 2.88e-02 2024-08-06 03:51:17,109 INFO [trainer.py:765] (0/8) Epoch 2, batch 500, train_loss[loss=3.381, ArTop10Accuracy=0.6257, over 12257.00 frames. ], tot_loss[loss=3.401, ArTop10Accuracy=0.632, over 10944.40 frames. ], batch size: 22, lr: 2.87e-02 2024-08-06 03:51:53,202 INFO [trainer.py:765] (0/8) Epoch 2, batch 600, train_loss[loss=3.191, ArTop10Accuracy=0.6737, over 11680.00 frames. ], tot_loss[loss=3.397, ArTop10Accuracy=0.6329, over 11458.95 frames. ], batch size: 18, lr: 2.86e-02 2024-08-06 03:52:38,993 INFO [trainer.py:765] (0/8) Epoch 2, batch 700, train_loss[loss=3.445, ArTop10Accuracy=0.616, over 10029.00 frames. ], tot_loss[loss=3.389, ArTop10Accuracy=0.6345, over 11598.56 frames. ], batch size: 12, lr: 2.85e-02 2024-08-06 03:52:47,089 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-2000.pt 2024-08-06 03:52:50,254 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 03:52:56,023 INFO [trainer.py:811] (0/8) Epoch 2, validation: loss=3.327, ArTop10Accuracy=0.6492, over 1829298.00 frames. 2024-08-06 03:52:56,024 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29280MB 2024-08-06 03:52:56,542 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 8.181e+01 1.431e+02 1.849e+02 2.730e+02 2.344e+03, threshold=3.697e+02, percent-clipped=7.2 2024-08-06 03:53:21,882 INFO [trainer.py:765] (0/8) Epoch 2, batch 800, train_loss[loss=3.419, ArTop10Accuracy=0.6281, over 10166.00 frames. ], tot_loss[loss=3.379, ArTop10Accuracy=0.6362, over 11712.71 frames. ], batch size: 12, lr: 2.84e-02 2024-08-06 03:53:53,299 INFO [trainer.py:765] (0/8) Epoch 2, batch 900, train_loss[loss=3.226, ArTop10Accuracy=0.6545, over 12935.00 frames. ], tot_loss[loss=3.365, ArTop10Accuracy=0.6387, over 11751.69 frames. ], batch size: 27, lr: 2.83e-02 2024-08-06 03:54:24,809 INFO [trainer.py:765] (0/8) Epoch 2, batch 1000, train_loss[loss=3.293, ArTop10Accuracy=0.6561, over 13065.00 frames. ], tot_loss[loss=3.361, ArTop10Accuracy=0.6399, over 11949.49 frames. ], batch size: 27, lr: 2.82e-02 2024-08-06 03:54:56,006 INFO [trainer.py:765] (0/8) Epoch 2, batch 1100, train_loss[loss=3.227, ArTop10Accuracy=0.6616, over 13740.00 frames. ], tot_loss[loss=3.359, ArTop10Accuracy=0.6402, over 11992.45 frames. ], batch size: 34, lr: 2.81e-02 2024-08-06 03:55:26,229 INFO [trainer.py:765] (0/8) Epoch 2, batch 1200, train_loss[loss=3.337, ArTop10Accuracy=0.6505, over 12307.00 frames. ], tot_loss[loss=3.354, ArTop10Accuracy=0.6413, over 11931.70 frames. ], batch size: 99, lr: 2.80e-02 2024-08-06 03:55:51,293 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 03:55:51,296 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-2.pt 2024-08-06 03:57:04,102 INFO [trainer.py:765] (0/8) Epoch 3, batch 100, train_loss[loss=3.263, ArTop10Accuracy=0.6638, over 14527.00 frames. ], tot_loss[loss=3.307, ArTop10Accuracy=0.6505, over 4779.02 frames. 
], batch size: 61, lr: 2.67e-02 2024-08-06 03:57:50,980 INFO [trainer.py:765] (0/8) Epoch 3, batch 200, train_loss[loss=3.322, ArTop10Accuracy=0.6496, over 13716.00 frames. ], tot_loss[loss=3.295, ArTop10Accuracy=0.6534, over 7792.17 frames. ], batch size: 34, lr: 2.66e-02 2024-08-06 03:58:26,075 INFO [trainer.py:765] (0/8) Epoch 3, batch 300, train_loss[loss=3.167, ArTop10Accuracy=0.6692, over 14278.00 frames. ], tot_loss[loss=3.276, ArTop10Accuracy=0.6566, over 9431.63 frames. ], batch size: 44, lr: 2.64e-02 2024-08-06 03:59:11,253 INFO [trainer.py:765] (0/8) Epoch 3, batch 400, train_loss[loss=3.069, ArTop10Accuracy=0.6916, over 10973.00 frames. ], tot_loss[loss=3.269, ArTop10Accuracy=0.6584, over 10353.81 frames. ], batch size: 15, lr: 2.63e-02 2024-08-06 03:59:26,365 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-3000.pt 2024-08-06 03:59:29,674 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 8.720e+01 1.461e+02 1.775e+02 2.344e+02 9.150e+02, threshold=3.550e+02, percent-clipped=5.2 2024-08-06 03:59:49,303 INFO [trainer.py:765] (0/8) Epoch 3, batch 500, train_loss[loss=3.139, ArTop10Accuracy=0.6852, over 12539.00 frames. ], tot_loss[loss=3.258, ArTop10Accuracy=0.6609, over 10927.22 frames. ], batch size: 22, lr: 2.62e-02 2024-08-06 04:00:35,096 INFO [trainer.py:765] (0/8) Epoch 3, batch 600, train_loss[loss=3.245, ArTop10Accuracy=0.6688, over 11633.00 frames. ], tot_loss[loss=3.248, ArTop10Accuracy=0.6623, over 11458.63 frames. ], batch size: 18, lr: 2.61e-02 2024-08-06 04:01:22,058 INFO [trainer.py:765] (0/8) Epoch 3, batch 700, train_loss[loss=3.032, ArTop10Accuracy=0.7001, over 10188.00 frames. ], tot_loss[loss=3.241, ArTop10Accuracy=0.6634, over 11600.56 frames. ], batch size: 12, lr: 2.60e-02 2024-08-06 04:01:56,269 INFO [trainer.py:765] (0/8) Epoch 3, batch 800, train_loss[loss=3.122, ArTop10Accuracy=0.6856, over 10334.00 frames. ], tot_loss[loss=3.225, ArTop10Accuracy=0.6665, over 11685.71 frames. ], batch size: 12, lr: 2.59e-02 2024-08-06 04:02:27,741 INFO [trainer.py:765] (0/8) Epoch 3, batch 900, train_loss[loss=3.193, ArTop10Accuracy=0.6785, over 12852.00 frames. ], tot_loss[loss=3.204, ArTop10Accuracy=0.6708, over 11736.38 frames. ], batch size: 27, lr: 2.57e-02 2024-08-06 04:02:59,283 INFO [trainer.py:765] (0/8) Epoch 3, batch 1000, train_loss[loss=3.144, ArTop10Accuracy=0.6877, over 13191.00 frames. ], tot_loss[loss=3.2, ArTop10Accuracy=0.6719, over 11929.19 frames. ], batch size: 27, lr: 2.56e-02 2024-08-06 04:03:30,942 INFO [trainer.py:765] (0/8) Epoch 3, batch 1100, train_loss[loss=3.19, ArTop10Accuracy=0.6685, over 13637.00 frames. ], tot_loss[loss=3.192, ArTop10Accuracy=0.6736, over 11995.14 frames. ], batch size: 34, lr: 2.55e-02 2024-08-06 04:04:01,312 INFO [trainer.py:765] (0/8) Epoch 3, batch 1200, train_loss[loss=3.308, ArTop10Accuracy=0.6544, over 12230.00 frames. ], tot_loss[loss=3.183, ArTop10Accuracy=0.6752, over 11923.50 frames. ], batch size: 97, lr: 2.54e-02 2024-08-06 04:04:26,675 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:04:26,677 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-3.pt 2024-08-06 04:05:43,369 INFO [trainer.py:765] (0/8) Epoch 4, batch 100, train_loss[loss=3.056, ArTop10Accuracy=0.6973, over 14180.00 frames. ], tot_loss[loss=3.14, ArTop10Accuracy=0.6851, over 4786.19 frames. 
], batch size: 61, lr: 2.38e-02 2024-08-06 04:06:07,078 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-4000.pt 2024-08-06 04:06:10,595 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 04:06:16,404 INFO [trainer.py:811] (0/8) Epoch 4, validation: loss=3.063, ArTop10Accuracy=0.7031, over 1829298.00 frames. 2024-08-06 04:06:16,404 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29681MB 2024-08-06 04:06:16,746 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.091e+02 1.493e+02 1.709e+02 2.068e+02 7.969e+02, threshold=3.418e+02, percent-clipped=2.9 2024-08-06 04:06:31,826 INFO [trainer.py:765] (0/8) Epoch 4, batch 200, train_loss[loss=3.064, ArTop10Accuracy=0.7017, over 13834.00 frames. ], tot_loss[loss=3.13, ArTop10Accuracy=0.6866, over 7801.49 frames. ], batch size: 34, lr: 2.37e-02 2024-08-06 04:07:18,545 INFO [trainer.py:765] (0/8) Epoch 4, batch 300, train_loss[loss=3.176, ArTop10Accuracy=0.6745, over 14344.00 frames. ], tot_loss[loss=3.121, ArTop10Accuracy=0.6879, over 9440.54 frames. ], batch size: 44, lr: 2.36e-02 2024-08-06 04:08:01,910 INFO [trainer.py:765] (0/8) Epoch 4, batch 400, train_loss[loss=3.104, ArTop10Accuracy=0.6971, over 11013.00 frames. ], tot_loss[loss=3.115, ArTop10Accuracy=0.6889, over 10362.70 frames. ], batch size: 15, lr: 2.34e-02 2024-08-06 04:08:45,344 INFO [trainer.py:765] (0/8) Epoch 4, batch 500, train_loss[loss=2.908, ArTop10Accuracy=0.7181, over 12354.00 frames. ], tot_loss[loss=3.112, ArTop10Accuracy=0.6894, over 10930.04 frames. ], batch size: 22, lr: 2.33e-02 2024-08-06 04:09:37,072 INFO [trainer.py:765] (0/8) Epoch 4, batch 600, train_loss[loss=3.091, ArTop10Accuracy=0.697, over 11648.00 frames. ], tot_loss[loss=3.111, ArTop10Accuracy=0.6895, over 11445.30 frames. ], batch size: 18, lr: 2.32e-02 2024-08-06 04:10:13,501 INFO [trainer.py:765] (0/8) Epoch 4, batch 700, train_loss[loss=2.829, ArTop10Accuracy=0.7409, over 10140.00 frames. ], tot_loss[loss=3.115, ArTop10Accuracy=0.6885, over 11581.61 frames. ], batch size: 12, lr: 2.31e-02 2024-08-06 04:10:51,959 INFO [trainer.py:765] (0/8) Epoch 4, batch 800, train_loss[loss=3.144, ArTop10Accuracy=0.6847, over 10271.00 frames. ], tot_loss[loss=3.121, ArTop10Accuracy=0.6873, over 11687.48 frames. ], batch size: 12, lr: 2.30e-02 2024-08-06 04:11:23,330 INFO [trainer.py:765] (0/8) Epoch 4, batch 900, train_loss[loss=2.983, ArTop10Accuracy=0.712, over 12966.00 frames. ], tot_loss[loss=3.104, ArTop10Accuracy=0.6906, over 11732.98 frames. ], batch size: 27, lr: 2.29e-02 2024-08-06 04:11:54,826 INFO [trainer.py:765] (0/8) Epoch 4, batch 1000, train_loss[loss=3.082, ArTop10Accuracy=0.6988, over 12857.00 frames. ], tot_loss[loss=3.104, ArTop10Accuracy=0.691, over 11938.75 frames. ], batch size: 27, lr: 2.28e-02 2024-08-06 04:12:25,960 INFO [trainer.py:765] (0/8) Epoch 4, batch 1100, train_loss[loss=3.137, ArTop10Accuracy=0.683, over 13604.00 frames. ], tot_loss[loss=3.105, ArTop10Accuracy=0.6909, over 11980.29 frames. ], batch size: 34, lr: 2.26e-02 2024-08-06 04:12:45,699 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-5000.pt 2024-08-06 04:12:48,544 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.106e+02 1.440e+02 1.608e+02 1.893e+02 7.925e+02, threshold=3.216e+02, percent-clipped=2.0 2024-08-06 04:12:58,828 INFO [trainer.py:765] (0/8) Epoch 4, batch 1200, train_loss[loss=3.254, ArTop10Accuracy=0.6703, over 12869.00 frames. ], tot_loss[loss=3.102, ArTop10Accuracy=0.6914, over 11925.65 frames. 
], batch size: 97, lr: 2.25e-02 2024-08-06 04:13:23,893 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:13:23,895 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-4.pt 2024-08-06 04:14:38,685 INFO [trainer.py:765] (0/8) Epoch 5, batch 100, train_loss[loss=3.128, ArTop10Accuracy=0.6899, over 14565.00 frames. ], tot_loss[loss=3.067, ArTop10Accuracy=0.699, over 4772.29 frames. ], batch size: 61, lr: 2.10e-02 2024-08-06 04:15:26,826 INFO [trainer.py:765] (0/8) Epoch 5, batch 200, train_loss[loss=2.968, ArTop10Accuracy=0.719, over 13647.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7015, over 7783.15 frames. ], batch size: 34, lr: 2.09e-02 2024-08-06 04:16:08,011 INFO [trainer.py:765] (0/8) Epoch 5, batch 300, train_loss[loss=3.154, ArTop10Accuracy=0.6844, over 14238.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7015, over 9403.87 frames. ], batch size: 44, lr: 2.08e-02 2024-08-06 04:16:53,133 INFO [trainer.py:765] (0/8) Epoch 5, batch 400, train_loss[loss=3.091, ArTop10Accuracy=0.6908, over 10385.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7008, over 10321.36 frames. ], batch size: 14, lr: 2.07e-02 2024-08-06 04:17:36,638 INFO [trainer.py:765] (0/8) Epoch 5, batch 500, train_loss[loss=3.144, ArTop10Accuracy=0.6915, over 12162.00 frames. ], tot_loss[loss=3.052, ArTop10Accuracy=0.7013, over 10879.21 frames. ], batch size: 22, lr: 2.06e-02 2024-08-06 04:18:22,113 INFO [trainer.py:765] (0/8) Epoch 5, batch 600, train_loss[loss=3.041, ArTop10Accuracy=0.7091, over 11478.00 frames. ], tot_loss[loss=3.058, ArTop10Accuracy=0.6998, over 11418.18 frames. ], batch size: 18, lr: 2.05e-02 2024-08-06 04:19:17,033 INFO [trainer.py:765] (0/8) Epoch 5, batch 700, train_loss[loss=3.009, ArTop10Accuracy=0.7107, over 10038.00 frames. ], tot_loss[loss=3.061, ArTop10Accuracy=0.6994, over 11585.15 frames. ], batch size: 12, lr: 2.04e-02 2024-08-06 04:19:51,066 INFO [trainer.py:765] (0/8) Epoch 5, batch 800, train_loss[loss=3.071, ArTop10Accuracy=0.6936, over 10125.00 frames. ], tot_loss[loss=3.06, ArTop10Accuracy=0.6995, over 11685.86 frames. ], batch size: 12, lr: 2.03e-02 2024-08-06 04:20:18,214 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-6000.pt 2024-08-06 04:20:21,954 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 04:20:27,475 INFO [trainer.py:811] (0/8) Epoch 5, validation: loss=2.998, ArTop10Accuracy=0.7157, over 1829298.00 frames. 2024-08-06 04:20:27,476 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB 2024-08-06 04:20:27,781 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.057e+02 1.385e+02 1.542e+02 1.759e+02 7.741e+02, threshold=3.083e+02, percent-clipped=0.7 2024-08-06 04:20:31,767 INFO [trainer.py:765] (0/8) Epoch 5, batch 900, train_loss[loss=3.019, ArTop10Accuracy=0.709, over 13026.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.7007, over 11740.81 frames. ], batch size: 27, lr: 2.02e-02 2024-08-06 04:21:03,306 INFO [trainer.py:765] (0/8) Epoch 5, batch 1000, train_loss[loss=3.032, ArTop10Accuracy=0.7024, over 12918.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.701, over 11956.52 frames. ], batch size: 27, lr: 2.01e-02 2024-08-06 04:21:34,451 INFO [trainer.py:765] (0/8) Epoch 5, batch 1100, train_loss[loss=2.969, ArTop10Accuracy=0.7173, over 13590.00 frames. ], tot_loss[loss=3.055, ArTop10Accuracy=0.7007, over 12007.67 frames. 
], batch size: 34, lr: 2.00e-02 2024-08-06 04:22:04,752 INFO [trainer.py:765] (0/8) Epoch 5, batch 1200, train_loss[loss=3.204, ArTop10Accuracy=0.6734, over 12969.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7002, over 11947.57 frames. ], batch size: 99, lr: 1.99e-02 2024-08-06 04:22:30,431 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:22:30,435 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-5.pt 2024-08-06 04:23:46,282 INFO [trainer.py:765] (0/8) Epoch 6, batch 100, train_loss[loss=3.133, ArTop10Accuracy=0.6927, over 14358.00 frames. ], tot_loss[loss=3.013, ArTop10Accuracy=0.7093, over 4797.85 frames. ], batch size: 61, lr: 1.85e-02 2024-08-06 04:24:35,256 INFO [trainer.py:765] (0/8) Epoch 6, batch 200, train_loss[loss=3.1, ArTop10Accuracy=0.6943, over 13607.00 frames. ], tot_loss[loss=3.012, ArTop10Accuracy=0.7099, over 7796.71 frames. ], batch size: 34, lr: 1.84e-02 2024-08-06 04:25:16,676 INFO [trainer.py:765] (0/8) Epoch 6, batch 300, train_loss[loss=3.032, ArTop10Accuracy=0.706, over 14227.00 frames. ], tot_loss[loss=3.015, ArTop10Accuracy=0.709, over 9423.91 frames. ], batch size: 44, lr: 1.83e-02 2024-08-06 04:26:08,924 INFO [trainer.py:765] (0/8) Epoch 6, batch 400, train_loss[loss=2.856, ArTop10Accuracy=0.7436, over 10157.00 frames. ], tot_loss[loss=3.016, ArTop10Accuracy=0.7092, over 10328.64 frames. ], batch size: 14, lr: 1.83e-02 2024-08-06 04:26:51,485 INFO [trainer.py:765] (0/8) Epoch 6, batch 500, train_loss[loss=2.92, ArTop10Accuracy=0.73, over 12394.00 frames. ], tot_loss[loss=3.014, ArTop10Accuracy=0.7096, over 10896.45 frames. ], batch size: 22, lr: 1.82e-02 2024-08-06 04:27:39,298 INFO [trainer.py:765] (0/8) Epoch 6, batch 600, train_loss[loss=3.119, ArTop10Accuracy=0.6874, over 11516.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7085, over 11416.22 frames. ], batch size: 18, lr: 1.81e-02 2024-08-06 04:27:41,654 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-7000.pt 2024-08-06 04:27:46,369 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.054e+02 1.343e+02 1.474e+02 1.660e+02 8.574e+02, threshold=2.947e+02, percent-clipped=0.6 2024-08-06 04:28:33,240 INFO [trainer.py:765] (0/8) Epoch 6, batch 700, train_loss[loss=3.091, ArTop10Accuracy=0.6931, over 10181.00 frames. ], tot_loss[loss=3.022, ArTop10Accuracy=0.7079, over 11583.44 frames. ], batch size: 12, lr: 1.80e-02 2024-08-06 04:29:11,216 INFO [trainer.py:765] (0/8) Epoch 6, batch 800, train_loss[loss=2.962, ArTop10Accuracy=0.7174, over 10098.00 frames. ], tot_loss[loss=3.025, ArTop10Accuracy=0.7068, over 11703.27 frames. ], batch size: 12, lr: 1.79e-02 2024-08-06 04:29:42,751 INFO [trainer.py:765] (0/8) Epoch 6, batch 900, train_loss[loss=2.938, ArTop10Accuracy=0.7197, over 13251.00 frames. ], tot_loss[loss=3.024, ArTop10Accuracy=0.7069, over 11757.31 frames. ], batch size: 28, lr: 1.78e-02 2024-08-06 04:30:14,306 INFO [trainer.py:765] (0/8) Epoch 6, batch 1000, train_loss[loss=3.008, ArTop10Accuracy=0.7192, over 12801.00 frames. ], tot_loss[loss=3.026, ArTop10Accuracy=0.7069, over 11964.72 frames. ], batch size: 27, lr: 1.77e-02 2024-08-06 04:30:45,383 INFO [trainer.py:765] (0/8) Epoch 6, batch 1100, train_loss[loss=3.029, ArTop10Accuracy=0.703, over 13928.00 frames. ], tot_loss[loss=3.03, ArTop10Accuracy=0.7059, over 11998.77 frames. ], batch size: 34, lr: 1.77e-02 2024-08-06 04:31:15,674 INFO [trainer.py:765] (0/8) Epoch 6, batch 1200, train_loss[loss=3.148, ArTop10Accuracy=0.6807, over 13222.00 frames. 
], tot_loss[loss=3.03, ArTop10Accuracy=0.7057, over 11968.48 frames. ], batch size: 98, lr: 1.76e-02 2024-08-06 04:31:40,595 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:31:40,598 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-6.pt 2024-08-06 04:32:52,405 INFO [trainer.py:765] (0/8) Epoch 7, batch 100, train_loss[loss=3.019, ArTop10Accuracy=0.7107, over 14649.00 frames. ], tot_loss[loss=2.98, ArTop10Accuracy=0.7165, over 4784.46 frames. ], batch size: 62, lr: 1.64e-02 2024-08-06 04:33:38,223 INFO [trainer.py:765] (0/8) Epoch 7, batch 200, train_loss[loss=2.923, ArTop10Accuracy=0.7315, over 13779.00 frames. ], tot_loss[loss=2.983, ArTop10Accuracy=0.7158, over 7796.21 frames. ], batch size: 34, lr: 1.64e-02 2024-08-06 04:34:22,609 INFO [trainer.py:765] (0/8) Epoch 7, batch 300, train_loss[loss=3.033, ArTop10Accuracy=0.7063, over 14454.00 frames. ], tot_loss[loss=2.982, ArTop10Accuracy=0.7162, over 9419.13 frames. ], batch size: 44, lr: 1.63e-02 2024-08-06 04:34:36,847 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-8000.pt 2024-08-06 04:34:39,927 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 04:34:45,809 INFO [trainer.py:811] (0/8) Epoch 7, validation: loss=2.963, ArTop10Accuracy=0.7233, over 1829298.00 frames. 2024-08-06 04:34:45,809 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB 2024-08-06 04:34:46,124 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.009e+02 1.306e+02 1.435e+02 1.599e+02 8.689e+02, threshold=2.871e+02, percent-clipped=0.9 2024-08-06 04:35:17,147 INFO [trainer.py:765] (0/8) Epoch 7, batch 400, train_loss[loss=3.007, ArTop10Accuracy=0.7141, over 10939.00 frames. ], tot_loss[loss=2.984, ArTop10Accuracy=0.7154, over 10333.36 frames. ], batch size: 15, lr: 1.62e-02 2024-08-06 04:36:01,711 INFO [trainer.py:765] (0/8) Epoch 7, batch 500, train_loss[loss=3.046, ArTop10Accuracy=0.7014, over 12358.00 frames. ], tot_loss[loss=2.983, ArTop10Accuracy=0.7154, over 10900.99 frames. ], batch size: 22, lr: 1.61e-02 2024-08-06 04:36:48,812 INFO [trainer.py:765] (0/8) Epoch 7, batch 600, train_loss[loss=2.922, ArTop10Accuracy=0.7226, over 11642.00 frames. ], tot_loss[loss=2.991, ArTop10Accuracy=0.7138, over 11416.97 frames. ], batch size: 18, lr: 1.61e-02 2024-08-06 04:37:34,800 INFO [trainer.py:765] (0/8) Epoch 7, batch 700, train_loss[loss=2.882, ArTop10Accuracy=0.7363, over 10099.00 frames. ], tot_loss[loss=2.993, ArTop10Accuracy=0.7129, over 11576.53 frames. ], batch size: 12, lr: 1.60e-02 2024-08-06 04:38:13,614 INFO [trainer.py:765] (0/8) Epoch 7, batch 800, train_loss[loss=3.023, ArTop10Accuracy=0.7031, over 10117.00 frames. ], tot_loss[loss=2.998, ArTop10Accuracy=0.7118, over 11686.69 frames. ], batch size: 12, lr: 1.59e-02 2024-08-06 04:38:45,110 INFO [trainer.py:765] (0/8) Epoch 7, batch 900, train_loss[loss=2.892, ArTop10Accuracy=0.7333, over 12890.00 frames. ], tot_loss[loss=2.985, ArTop10Accuracy=0.714, over 11734.77 frames. ], batch size: 27, lr: 1.59e-02 2024-08-06 04:39:16,575 INFO [trainer.py:765] (0/8) Epoch 7, batch 1000, train_loss[loss=2.88, ArTop10Accuracy=0.7392, over 12922.00 frames. ], tot_loss[loss=2.992, ArTop10Accuracy=0.7132, over 11960.08 frames. ], batch size: 27, lr: 1.58e-02 2024-08-06 04:39:47,571 INFO [trainer.py:765] (0/8) Epoch 7, batch 1100, train_loss[loss=3.04, ArTop10Accuracy=0.7042, over 13794.00 frames. ], tot_loss[loss=2.997, ArTop10Accuracy=0.7125, over 11992.82 frames. 
], batch size: 34, lr: 1.57e-02 2024-08-06 04:40:17,990 INFO [trainer.py:765] (0/8) Epoch 7, batch 1200, train_loss[loss=3.169, ArTop10Accuracy=0.6834, over 11938.00 frames. ], tot_loss[loss=3, ArTop10Accuracy=0.7122, over 11941.88 frames. ], batch size: 99, lr: 1.57e-02 2024-08-06 04:40:43,386 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:40:43,389 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-7.pt 2024-08-06 04:41:34,479 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-9000.pt 2024-08-06 04:41:37,491 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 9.816e+01 1.295e+02 1.411e+02 1.574e+02 4.953e+02, threshold=2.821e+02, percent-clipped=1.1 2024-08-06 04:41:58,371 INFO [trainer.py:765] (0/8) Epoch 8, batch 100, train_loss[loss=2.969, ArTop10Accuracy=0.7203, over 14773.00 frames. ], tot_loss[loss=2.966, ArTop10Accuracy=0.7197, over 4791.07 frames. ], batch size: 61, lr: 1.47e-02 2024-08-06 04:42:44,986 INFO [trainer.py:765] (0/8) Epoch 8, batch 200, train_loss[loss=2.851, ArTop10Accuracy=0.7433, over 13565.00 frames. ], tot_loss[loss=2.968, ArTop10Accuracy=0.7191, over 7794.06 frames. ], batch size: 34, lr: 1.46e-02 2024-08-06 04:43:28,045 INFO [trainer.py:765] (0/8) Epoch 8, batch 300, train_loss[loss=3.05, ArTop10Accuracy=0.7073, over 14269.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7203, over 9405.56 frames. ], batch size: 44, lr: 1.46e-02 2024-08-06 04:44:14,461 INFO [trainer.py:765] (0/8) Epoch 8, batch 400, train_loss[loss=3.076, ArTop10Accuracy=0.6892, over 10975.00 frames. ], tot_loss[loss=2.966, ArTop10Accuracy=0.7194, over 10330.46 frames. ], batch size: 15, lr: 1.45e-02 2024-08-06 04:45:00,692 INFO [trainer.py:765] (0/8) Epoch 8, batch 500, train_loss[loss=2.887, ArTop10Accuracy=0.7299, over 12258.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7199, over 10893.93 frames. ], batch size: 22, lr: 1.45e-02 2024-08-06 04:45:45,393 INFO [trainer.py:765] (0/8) Epoch 8, batch 600, train_loss[loss=2.882, ArTop10Accuracy=0.7336, over 11537.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7193, over 11429.99 frames. ], batch size: 18, lr: 1.44e-02 2024-08-06 04:46:34,038 INFO [trainer.py:765] (0/8) Epoch 8, batch 700, train_loss[loss=2.913, ArTop10Accuracy=0.7296, over 9320.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7189, over 11568.50 frames. ], batch size: 11, lr: 1.43e-02 2024-08-06 04:47:10,207 INFO [trainer.py:765] (0/8) Epoch 8, batch 800, train_loss[loss=2.918, ArTop10Accuracy=0.7321, over 10097.00 frames. ], tot_loss[loss=2.969, ArTop10Accuracy=0.7178, over 11680.89 frames. ], batch size: 12, lr: 1.43e-02 2024-08-06 04:47:41,605 INFO [trainer.py:765] (0/8) Epoch 8, batch 900, train_loss[loss=2.832, ArTop10Accuracy=0.7399, over 12865.00 frames. ], tot_loss[loss=2.96, ArTop10Accuracy=0.7193, over 11730.79 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 04:48:13,033 INFO [trainer.py:765] (0/8) Epoch 8, batch 1000, train_loss[loss=2.932, ArTop10Accuracy=0.7198, over 12777.00 frames. ], tot_loss[loss=2.968, ArTop10Accuracy=0.7179, over 11920.09 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 04:48:28,826 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-10000.pt 2024-08-06 04:48:31,904 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 04:48:37,663 INFO [trainer.py:811] (0/8) Epoch 8, validation: loss=2.946, ArTop10Accuracy=0.7266, over 1829298.00 frames. 
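Every per-batch entry above follows the same pattern: epoch, batch index, the current batch's train_loss and ArTop10Accuracy, the running tot_loss, the batch size, and the learning rate. Loss and accuracy curves can therefore be pulled straight out of this file. Below is a minimal parsing sketch assuming exactly the entry format shown in this log; the file name train.log and the returned field names are illustrative, not part of the training code.

import re

# Matches entries of the form:
#   Epoch 8, batch 900, train_loss[loss=2.832, ArTop10Accuracy=0.7399, over 12865.00 frames. ],
#   tot_loss[loss=2.96, ArTop10Accuracy=0.7193, over 11730.79 frames. ], batch size: 27, lr: 1.42e-02
# re.DOTALL lets a single entry span a wrapped line.
PATTERN = re.compile(
    r"Epoch (\d+), batch (\d+), "
    r"train_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+).*?\], "
    r"tot_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+).*?\], "
    r"batch size: (\d+), lr: ([\d.e+-]+)",
    re.DOTALL,
)

def parse_log(path="train.log"):  # path is illustrative
    """Yield one dict per 'Epoch N, batch M, ...' record found in the log."""
    with open(path) as f:
        text = f.read()
    for m in PATTERN.finditer(text):
        yield {
            "epoch": int(m.group(1)),
            "batch": int(m.group(2)),
            "train_loss": float(m.group(3)),
            "train_top10_acc": float(m.group(4)),
            "tot_loss": float(m.group(5)),
            "tot_top10_acc": float(m.group(6)),
            "batch_size": int(m.group(7)),
            "lr": float(m.group(8)),
        }

if __name__ == "__main__":
    for record in parse_log():
        print(record["epoch"], record["batch"], record["tot_loss"], record["lr"])
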
2024-08-06 04:48:37,664 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB 2024-08-06 04:48:37,951 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.289e+02 1.393e+02 1.532e+02 3.557e+02, threshold=2.786e+02, percent-clipped=0.2 2024-08-06 04:48:52,932 INFO [trainer.py:765] (0/8) Epoch 8, batch 1100, train_loss[loss=2.994, ArTop10Accuracy=0.7165, over 13708.00 frames. ], tot_loss[loss=2.977, ArTop10Accuracy=0.7162, over 11999.59 frames. ], batch size: 34, lr: 1.41e-02 2024-08-06 04:49:23,202 INFO [trainer.py:765] (0/8) Epoch 8, batch 1200, train_loss[loss=3.071, ArTop10Accuracy=0.6969, over 12970.00 frames. ], tot_loss[loss=2.982, ArTop10Accuracy=0.7159, over 11928.39 frames. ], batch size: 97, lr: 1.40e-02 2024-08-06 04:49:49,140 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:49:49,143 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-8.pt 2024-08-06 04:51:01,547 INFO [trainer.py:765] (0/8) Epoch 9, batch 100, train_loss[loss=2.966, ArTop10Accuracy=0.7242, over 14839.00 frames. ], tot_loss[loss=2.942, ArTop10Accuracy=0.7244, over 4775.07 frames. ], batch size: 62, lr: 1.32e-02 2024-08-06 04:51:45,414 INFO [trainer.py:765] (0/8) Epoch 9, batch 200, train_loss[loss=3.012, ArTop10Accuracy=0.7159, over 13736.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.7251, over 7778.10 frames. ], batch size: 34, lr: 1.32e-02 2024-08-06 04:52:29,082 INFO [trainer.py:765] (0/8) Epoch 9, batch 300, train_loss[loss=2.872, ArTop10Accuracy=0.7323, over 14581.00 frames. ], tot_loss[loss=2.936, ArTop10Accuracy=0.7253, over 9412.20 frames. ], batch size: 44, lr: 1.31e-02 2024-08-06 04:53:16,431 INFO [trainer.py:765] (0/8) Epoch 9, batch 400, train_loss[loss=2.944, ArTop10Accuracy=0.721, over 11103.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.725, over 10325.61 frames. ], batch size: 15, lr: 1.31e-02 2024-08-06 04:53:58,143 INFO [trainer.py:765] (0/8) Epoch 9, batch 500, train_loss[loss=3.011, ArTop10Accuracy=0.7143, over 12100.00 frames. ], tot_loss[loss=2.943, ArTop10Accuracy=0.7242, over 10915.61 frames. ], batch size: 22, lr: 1.30e-02 2024-08-06 04:54:51,077 INFO [trainer.py:765] (0/8) Epoch 9, batch 600, train_loss[loss=2.959, ArTop10Accuracy=0.716, over 11554.00 frames. ], tot_loss[loss=2.941, ArTop10Accuracy=0.7236, over 11424.40 frames. ], batch size: 18, lr: 1.30e-02 2024-08-06 04:55:34,400 INFO [trainer.py:765] (0/8) Epoch 9, batch 700, train_loss[loss=2.819, ArTop10Accuracy=0.7445, over 9911.00 frames. ], tot_loss[loss=2.949, ArTop10Accuracy=0.722, over 11549.12 frames. ], batch size: 12, lr: 1.29e-02 2024-08-06 04:56:01,421 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-11000.pt 2024-08-06 04:56:04,573 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.257e+02 1.367e+02 1.507e+02 8.820e+02, threshold=2.735e+02, percent-clipped=0.5 2024-08-06 04:56:13,597 INFO [trainer.py:765] (0/8) Epoch 9, batch 800, train_loss[loss=2.794, ArTop10Accuracy=0.756, over 10111.00 frames. ], tot_loss[loss=2.954, ArTop10Accuracy=0.7212, over 11662.03 frames. ], batch size: 12, lr: 1.29e-02 2024-08-06 04:56:44,975 INFO [trainer.py:765] (0/8) Epoch 9, batch 900, train_loss[loss=2.956, ArTop10Accuracy=0.7281, over 12998.00 frames. ], tot_loss[loss=2.947, ArTop10Accuracy=0.7224, over 11722.20 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 04:57:16,491 INFO [trainer.py:765] (0/8) Epoch 9, batch 1000, train_loss[loss=2.988, ArTop10Accuracy=0.711, over 13043.00 frames. 
], tot_loss[loss=2.952, ArTop10Accuracy=0.7215, over 11938.11 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 04:57:47,657 INFO [trainer.py:765] (0/8) Epoch 9, batch 1100, train_loss[loss=3.061, ArTop10Accuracy=0.6999, over 13675.00 frames. ], tot_loss[loss=2.958, ArTop10Accuracy=0.7204, over 11999.12 frames. ], batch size: 34, lr: 1.27e-02 2024-08-06 04:58:18,094 INFO [trainer.py:765] (0/8) Epoch 9, batch 1200, train_loss[loss=3.116, ArTop10Accuracy=0.6897, over 12997.00 frames. ], tot_loss[loss=2.955, ArTop10Accuracy=0.7208, over 11961.99 frames. ], batch size: 99, lr: 1.27e-02 2024-08-06 04:58:43,366 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 04:58:43,369 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-9.pt 2024-08-06 04:59:52,749 INFO [trainer.py:765] (0/8) Epoch 10, batch 100, train_loss[loss=2.958, ArTop10Accuracy=0.7209, over 14615.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7278, over 4781.40 frames. ], batch size: 61, lr: 1.20e-02 2024-08-06 05:00:43,729 INFO [trainer.py:765] (0/8) Epoch 10, batch 200, train_loss[loss=2.939, ArTop10Accuracy=0.7309, over 13773.00 frames. ], tot_loss[loss=2.922, ArTop10Accuracy=0.7284, over 7792.17 frames. ], batch size: 34, lr: 1.20e-02 2024-08-06 05:01:20,592 INFO [trainer.py:765] (0/8) Epoch 10, batch 300, train_loss[loss=2.999, ArTop10Accuracy=0.712, over 14139.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7294, over 9407.34 frames. ], batch size: 44, lr: 1.19e-02 2024-08-06 05:02:10,048 INFO [trainer.py:765] (0/8) Epoch 10, batch 400, train_loss[loss=2.946, ArTop10Accuracy=0.7272, over 10816.00 frames. ], tot_loss[loss=2.918, ArTop10Accuracy=0.7286, over 10322.17 frames. ], batch size: 15, lr: 1.19e-02 2024-08-06 05:02:46,487 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-12000.pt 2024-08-06 05:02:49,613 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 05:02:55,377 INFO [trainer.py:811] (0/8) Epoch 10, validation: loss=2.927, ArTop10Accuracy=0.7304, over 1829298.00 frames. 2024-08-06 05:02:55,378 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB 2024-08-06 05:02:55,728 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.023e+02 1.269e+02 1.367e+02 1.518e+02 4.405e+02, threshold=2.733e+02, percent-clipped=0.4 2024-08-06 05:02:58,361 INFO [trainer.py:765] (0/8) Epoch 10, batch 500, train_loss[loss=2.952, ArTop10Accuracy=0.7206, over 12376.00 frames. ], tot_loss[loss=2.917, ArTop10Accuracy=0.7287, over 10889.85 frames. ], batch size: 22, lr: 1.19e-02 2024-08-06 05:03:48,229 INFO [trainer.py:765] (0/8) Epoch 10, batch 600, train_loss[loss=2.982, ArTop10Accuracy=0.7126, over 11661.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7269, over 11429.77 frames. ], batch size: 18, lr: 1.18e-02 2024-08-06 05:04:36,715 INFO [trainer.py:765] (0/8) Epoch 10, batch 700, train_loss[loss=2.898, ArTop10Accuracy=0.7335, over 9294.00 frames. ], tot_loss[loss=2.933, ArTop10Accuracy=0.7257, over 11585.14 frames. ], batch size: 11, lr: 1.18e-02 2024-08-06 05:05:10,725 INFO [trainer.py:765] (0/8) Epoch 10, batch 800, train_loss[loss=2.655, ArTop10Accuracy=0.7708, over 9951.00 frames. ], tot_loss[loss=2.934, ArTop10Accuracy=0.7254, over 11695.16 frames. ], batch size: 12, lr: 1.17e-02 2024-08-06 05:05:42,245 INFO [trainer.py:765] (0/8) Epoch 10, batch 900, train_loss[loss=2.921, ArTop10Accuracy=0.7253, over 13015.00 frames. ], tot_loss[loss=2.926, ArTop10Accuracy=0.7267, over 11744.64 frames. 
], batch size: 27, lr: 1.17e-02 2024-08-06 05:06:13,843 INFO [trainer.py:765] (0/8) Epoch 10, batch 1000, train_loss[loss=2.892, ArTop10Accuracy=0.7333, over 13293.00 frames. ], tot_loss[loss=2.933, ArTop10Accuracy=0.7253, over 11946.61 frames. ], batch size: 28, lr: 1.16e-02 2024-08-06 05:06:45,055 INFO [trainer.py:765] (0/8) Epoch 10, batch 1100, train_loss[loss=2.847, ArTop10Accuracy=0.7367, over 13715.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.7245, over 12011.58 frames. ], batch size: 34, lr: 1.16e-02 2024-08-06 05:07:15,484 INFO [trainer.py:765] (0/8) Epoch 10, batch 1200, train_loss[loss=3.095, ArTop10Accuracy=0.6982, over 12601.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7243, over 11964.58 frames. ], batch size: 99, lr: 1.16e-02 2024-08-06 05:07:40,412 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 05:07:40,415 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-10.pt 2024-08-06 05:08:52,966 INFO [trainer.py:765] (0/8) Epoch 11, batch 100, train_loss[loss=2.961, ArTop10Accuracy=0.7253, over 14781.00 frames. ], tot_loss[loss=2.907, ArTop10Accuracy=0.7314, over 4781.48 frames. ], batch size: 61, lr: 1.10e-02 2024-08-06 05:09:41,278 INFO [trainer.py:765] (0/8) Epoch 11, batch 200, train_loss[loss=2.871, ArTop10Accuracy=0.7337, over 13697.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7323, over 7786.93 frames. ], batch size: 34, lr: 1.10e-02 2024-08-06 05:09:48,259 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-13000.pt 2024-08-06 05:09:51,175 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.001e+02 1.278e+02 1.371e+02 1.502e+02 3.785e+02, threshold=2.743e+02, percent-clipped=0.3 2024-08-06 05:10:24,721 INFO [trainer.py:765] (0/8) Epoch 11, batch 300, train_loss[loss=3.031, ArTop10Accuracy=0.7094, over 14213.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7329, over 9403.86 frames. ], batch size: 44, lr: 1.09e-02 2024-08-06 05:11:11,784 INFO [trainer.py:765] (0/8) Epoch 11, batch 400, train_loss[loss=2.805, ArTop10Accuracy=0.7457, over 11071.00 frames. ], tot_loss[loss=2.903, ArTop10Accuracy=0.7318, over 10327.46 frames. ], batch size: 15, lr: 1.09e-02 2024-08-06 05:11:52,692 INFO [trainer.py:765] (0/8) Epoch 11, batch 500, train_loss[loss=2.807, ArTop10Accuracy=0.7449, over 12207.00 frames. ], tot_loss[loss=2.899, ArTop10Accuracy=0.7324, over 10884.53 frames. ], batch size: 22, lr: 1.09e-02 2024-08-06 05:12:40,288 INFO [trainer.py:765] (0/8) Epoch 11, batch 600, train_loss[loss=2.794, ArTop10Accuracy=0.7465, over 11657.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7317, over 11423.62 frames. ], batch size: 18, lr: 1.08e-02 2024-08-06 05:13:25,709 INFO [trainer.py:765] (0/8) Epoch 11, batch 700, train_loss[loss=2.926, ArTop10Accuracy=0.7354, over 9995.00 frames. ], tot_loss[loss=2.914, ArTop10Accuracy=0.7292, over 11567.13 frames. ], batch size: 12, lr: 1.08e-02 2024-08-06 05:14:04,207 INFO [trainer.py:765] (0/8) Epoch 11, batch 800, train_loss[loss=2.831, ArTop10Accuracy=0.7404, over 10248.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7279, over 11678.48 frames. ], batch size: 12, lr: 1.07e-02 2024-08-06 05:14:35,668 INFO [trainer.py:765] (0/8) Epoch 11, batch 900, train_loss[loss=2.947, ArTop10Accuracy=0.7274, over 13060.00 frames. ], tot_loss[loss=2.913, ArTop10Accuracy=0.7292, over 11736.16 frames. 
], batch size: 27, lr: 1.07e-02 2024-08-06 05:15:07,264 INFO [trainer.py:765] (0/8) Epoch 11, batch 1000, train_loss[loss=2.848, ArTop10Accuracy=0.7434, over 12927.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7282, over 11933.03 frames. ], batch size: 27, lr: 1.07e-02 2024-08-06 05:15:38,260 INFO [trainer.py:765] (0/8) Epoch 11, batch 1100, train_loss[loss=2.958, ArTop10Accuracy=0.7163, over 13775.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7271, over 11979.36 frames. ], batch size: 34, lr: 1.06e-02 2024-08-06 05:16:08,499 INFO [trainer.py:765] (0/8) Epoch 11, batch 1200, train_loss[loss=3.093, ArTop10Accuracy=0.6956, over 11746.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7269, over 11943.88 frames. ], batch size: 98, lr: 1.06e-02 2024-08-06 05:16:12,697 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-14000.pt 2024-08-06 05:16:15,740 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 05:16:21,622 INFO [trainer.py:811] (0/8) Epoch 11, validation: loss=2.923, ArTop10Accuracy=0.7318, over 1829298.00 frames. 2024-08-06 05:16:21,623 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB 2024-08-06 05:16:21,949 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.076e+02 1.268e+02 1.368e+02 1.481e+02 4.790e+02, threshold=2.736e+02, percent-clipped=0.6 2024-08-06 05:16:42,750 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 05:16:42,754 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-11.pt 2024-08-06 05:18:03,005 INFO [trainer.py:765] (0/8) Epoch 12, batch 100, train_loss[loss=2.927, ArTop10Accuracy=0.7272, over 14269.00 frames. ], tot_loss[loss=2.889, ArTop10Accuracy=0.7349, over 4792.73 frames. ], batch size: 61, lr: 1.01e-02 2024-08-06 05:18:46,004 INFO [trainer.py:765] (0/8) Epoch 12, batch 200, train_loss[loss=2.944, ArTop10Accuracy=0.7207, over 13567.00 frames. ], tot_loss[loss=2.889, ArTop10Accuracy=0.7351, over 7793.59 frames. ], batch size: 34, lr: 1.01e-02 2024-08-06 05:19:31,946 INFO [trainer.py:765] (0/8) Epoch 12, batch 300, train_loss[loss=2.915, ArTop10Accuracy=0.7271, over 14300.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7358, over 9416.87 frames. ], batch size: 44, lr: 1.01e-02 2024-08-06 05:20:12,431 INFO [trainer.py:765] (0/8) Epoch 12, batch 400, train_loss[loss=2.867, ArTop10Accuracy=0.7323, over 10417.00 frames. ], tot_loss[loss=2.888, ArTop10Accuracy=0.7349, over 10344.80 frames. ], batch size: 14, lr: 1.00e-02 2024-08-06 05:21:00,640 INFO [trainer.py:765] (0/8) Epoch 12, batch 500, train_loss[loss=2.862, ArTop10Accuracy=0.7405, over 12390.00 frames. ], tot_loss[loss=2.889, ArTop10Accuracy=0.7344, over 10925.04 frames. ], batch size: 22, lr: 9.99e-03 2024-08-06 05:21:43,915 INFO [trainer.py:765] (0/8) Epoch 12, batch 600, train_loss[loss=2.917, ArTop10Accuracy=0.7277, over 12081.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.7339, over 11450.54 frames. ], batch size: 19, lr: 9.96e-03 2024-08-06 05:22:32,206 INFO [trainer.py:765] (0/8) Epoch 12, batch 700, train_loss[loss=2.869, ArTop10Accuracy=0.7396, over 9986.00 frames. ], tot_loss[loss=2.901, ArTop10Accuracy=0.7318, over 11571.92 frames. ], batch size: 12, lr: 9.93e-03 2024-08-06 05:23:08,911 INFO [trainer.py:765] (0/8) Epoch 12, batch 800, train_loss[loss=2.894, ArTop10Accuracy=0.7277, over 10152.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7313, over 11684.18 frames. 
], batch size: 12, lr: 9.90e-03 2024-08-06 05:23:40,460 INFO [trainer.py:765] (0/8) Epoch 12, batch 900, train_loss[loss=2.897, ArTop10Accuracy=0.731, over 13401.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.7337, over 11754.03 frames. ], batch size: 28, lr: 9.87e-03 2024-08-06 05:23:51,883 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-15000.pt 2024-08-06 05:23:54,576 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.067e+02 1.273e+02 1.376e+02 1.503e+02 4.050e+02, threshold=2.752e+02, percent-clipped=0.4 2024-08-06 05:24:14,345 INFO [trainer.py:765] (0/8) Epoch 12, batch 1000, train_loss[loss=2.754, ArTop10Accuracy=0.7565, over 12959.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7331, over 11947.69 frames. ], batch size: 27, lr: 9.84e-03 2024-08-06 05:24:45,501 INFO [trainer.py:765] (0/8) Epoch 12, batch 1100, train_loss[loss=2.945, ArTop10Accuracy=0.7268, over 13596.00 frames. ], tot_loss[loss=2.905, ArTop10Accuracy=0.731, over 11995.15 frames. ], batch size: 34, lr: 9.81e-03 2024-08-06 05:25:15,881 INFO [trainer.py:765] (0/8) Epoch 12, batch 1200, train_loss[loss=3.04, ArTop10Accuracy=0.7013, over 12046.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7301, over 11944.18 frames. ], batch size: 98, lr: 9.78e-03 2024-08-06 05:25:40,807 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 05:25:40,811 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-12.pt 2024-08-06 05:26:46,787 INFO [trainer.py:765] (0/8) Epoch 13, batch 100, train_loss[loss=2.926, ArTop10Accuracy=0.7308, over 14510.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7383, over 4779.02 frames. ], batch size: 61, lr: 9.36e-03 2024-08-06 05:27:32,553 INFO [trainer.py:765] (0/8) Epoch 13, batch 200, train_loss[loss=2.903, ArTop10Accuracy=0.7277, over 13742.00 frames. ], tot_loss[loss=2.869, ArTop10Accuracy=0.7386, over 7793.82 frames. ], batch size: 34, lr: 9.34e-03 2024-08-06 05:28:16,036 INFO [trainer.py:765] (0/8) Epoch 13, batch 300, train_loss[loss=2.966, ArTop10Accuracy=0.714, over 14193.00 frames. ], tot_loss[loss=2.87, ArTop10Accuracy=0.7383, over 9442.19 frames. ], batch size: 44, lr: 9.31e-03 2024-08-06 05:29:00,149 INFO [trainer.py:765] (0/8) Epoch 13, batch 400, train_loss[loss=2.841, ArTop10Accuracy=0.7461, over 10234.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7381, over 10371.04 frames. ], batch size: 14, lr: 9.28e-03 2024-08-06 05:29:43,967 INFO [trainer.py:765] (0/8) Epoch 13, batch 500, train_loss[loss=2.821, ArTop10Accuracy=0.7487, over 12354.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7386, over 10918.12 frames. ], batch size: 22, lr: 9.26e-03 2024-08-06 05:30:24,247 INFO [trainer.py:765] (0/8) Epoch 13, batch 600, train_loss[loss=2.837, ArTop10Accuracy=0.746, over 11500.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7371, over 11428.61 frames. ], batch size: 18, lr: 9.23e-03 2024-08-06 05:30:58,110 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-16000.pt 2024-08-06 05:31:01,160 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 05:31:07,054 INFO [trainer.py:811] (0/8) Epoch 13, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames. 
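Checkpoints are written every 1000 batches (exp/valle/checkpoint-*.pt, matching save_every_n=1000 in the config) and at each epoch boundary (exp/valle/epoch-*.pt). A quick way to sanity-check one of them is sketched below; the paths come from the log, but the key names are assumptions about the icefall checkpoint layout, so consult icefall's checkpoint.py for the authoritative structure.

import torch

# Illustrative inspection of a checkpoint saved above. Key names such as
# "model" are assumptions, not verified against icefall's checkpoint.py.
ckpt = torch.load("exp/valle/epoch-13.pt", map_location="cpu")
print(sorted(ckpt.keys()))

state_dict = ckpt.get("model", ckpt)  # fall back to the raw dict if there is no "model" key
n_params = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print(f"tensor parameters in checkpoint: {n_params}")  # expect roughly 367386628, as logged at startup
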
2024-08-06 05:31:07,054 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB 2024-08-06 05:31:07,351 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.049e+02 1.283e+02 1.389e+02 1.496e+02 2.729e+02, threshold=2.779e+02, percent-clipped=0.0 2024-08-06 05:31:24,043 INFO [trainer.py:765] (0/8) Epoch 13, batch 700, train_loss[loss=2.64, ArTop10Accuracy=0.7729, over 10296.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7363, over 11548.15 frames. ], batch size: 12, lr: 9.20e-03 2024-08-06 05:32:00,147 INFO [trainer.py:765] (0/8) Epoch 13, batch 800, train_loss[loss=2.782, ArTop10Accuracy=0.753, over 10040.00 frames. ], tot_loss[loss=2.885, ArTop10Accuracy=0.735, over 11660.34 frames. ], batch size: 12, lr: 9.18e-03 2024-08-06 05:32:31,521 INFO [trainer.py:765] (0/8) Epoch 13, batch 900, train_loss[loss=2.874, ArTop10Accuracy=0.736, over 12905.00 frames. ], tot_loss[loss=2.879, ArTop10Accuracy=0.7363, over 11710.03 frames. ], batch size: 27, lr: 9.15e-03 2024-08-06 05:33:03,043 INFO [trainer.py:765] (0/8) Epoch 13, batch 1000, train_loss[loss=2.811, ArTop10Accuracy=0.7524, over 13056.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7346, over 11935.03 frames. ], batch size: 27, lr: 9.13e-03 2024-08-06 05:33:34,232 INFO [trainer.py:765] (0/8) Epoch 13, batch 1100, train_loss[loss=2.976, ArTop10Accuracy=0.7186, over 13991.00 frames. ], tot_loss[loss=2.894, ArTop10Accuracy=0.7331, over 11998.86 frames. ], batch size: 34, lr: 9.10e-03 2024-08-06 05:34:04,519 INFO [trainer.py:765] (0/8) Epoch 13, batch 1200, train_loss[loss=3.129, ArTop10Accuracy=0.6866, over 12041.00 frames. ], tot_loss[loss=2.894, ArTop10Accuracy=0.7331, over 11937.14 frames. ], batch size: 97, lr: 9.07e-03 2024-08-06 05:34:29,796 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 05:34:29,802 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-13.pt 2024-08-06 05:35:39,198 INFO [trainer.py:765] (0/8) Epoch 14, batch 100, train_loss[loss=2.972, ArTop10Accuracy=0.7162, over 14273.00 frames. ], tot_loss[loss=2.868, ArTop10Accuracy=0.7393, over 4770.11 frames. ], batch size: 61, lr: 8.71e-03 2024-08-06 05:36:23,063 INFO [trainer.py:765] (0/8) Epoch 14, batch 200, train_loss[loss=2.915, ArTop10Accuracy=0.728, over 13756.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7411, over 7792.21 frames. ], batch size: 34, lr: 8.68e-03 2024-08-06 05:37:09,309 INFO [trainer.py:765] (0/8) Epoch 14, batch 300, train_loss[loss=2.876, ArTop10Accuracy=0.7367, over 14623.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.742, over 9429.87 frames. ], batch size: 44, lr: 8.66e-03 2024-08-06 05:37:43,059 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-17000.pt 2024-08-06 05:37:46,029 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.097e+02 1.304e+02 1.410e+02 1.531e+02 2.912e+02, threshold=2.820e+02, percent-clipped=0.2 2024-08-06 05:37:55,138 INFO [trainer.py:765] (0/8) Epoch 14, batch 400, train_loss[loss=2.872, ArTop10Accuracy=0.7444, over 10346.00 frames. ], tot_loss[loss=2.86, ArTop10Accuracy=0.7407, over 10345.81 frames. ], batch size: 14, lr: 8.64e-03 2024-08-06 05:38:42,025 INFO [trainer.py:765] (0/8) Epoch 14, batch 500, train_loss[loss=2.799, ArTop10Accuracy=0.7546, over 12241.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7405, over 10894.02 frames. 
], batch size: 22, lr: 8.61e-03
2024-08-06 05:39:22,374 INFO [trainer.py:765] (0/8) Epoch 14, batch 600, train_loss[loss=2.745, ArTop10Accuracy=0.7663, over 11480.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7394, over 11401.19 frames. ], batch size: 18, lr: 8.59e-03
2024-08-06 05:40:15,143 INFO [trainer.py:765] (0/8) Epoch 14, batch 700, train_loss[loss=2.941, ArTop10Accuracy=0.7243, over 10235.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7379, over 11560.07 frames. ], batch size: 12, lr: 8.57e-03
2024-08-06 05:40:49,135 INFO [trainer.py:765] (0/8) Epoch 14, batch 800, train_loss[loss=2.794, ArTop10Accuracy=0.7562, over 10051.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7371, over 11692.18 frames. ], batch size: 12, lr: 8.55e-03
2024-08-06 05:41:20,466 INFO [trainer.py:765] (0/8) Epoch 14, batch 900, train_loss[loss=3.02, ArTop10Accuracy=0.7158, over 12967.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7373, over 11731.37 frames. ], batch size: 27, lr: 8.52e-03
2024-08-06 05:41:51,995 INFO [trainer.py:765] (0/8) Epoch 14, batch 1000, train_loss[loss=2.853, ArTop10Accuracy=0.7427, over 12793.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7366, over 11927.26 frames. ], batch size: 27, lr: 8.50e-03
2024-08-06 05:42:23,216 INFO [trainer.py:765] (0/8) Epoch 14, batch 1100, train_loss[loss=2.877, ArTop10Accuracy=0.7408, over 13594.00 frames. ], tot_loss[loss=2.883, ArTop10Accuracy=0.7357, over 11998.57 frames. ], batch size: 34, lr: 8.48e-03
2024-08-06 05:42:53,548 INFO [trainer.py:765] (0/8) Epoch 14, batch 1200, train_loss[loss=3.008, ArTop10Accuracy=0.7143, over 12112.00 frames. ], tot_loss[loss=2.881, ArTop10Accuracy=0.736, over 11938.39 frames. ], batch size: 98, lr: 8.46e-03
2024-08-06 05:43:18,778 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 05:43:18,781 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-14.pt
2024-08-06 05:44:28,572 INFO [trainer.py:765] (0/8) Epoch 15, batch 100, train_loss[loss=2.976, ArTop10Accuracy=0.7179, over 14284.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7428, over 4799.60 frames. ], batch size: 61, lr: 8.14e-03
2024-08-06 05:44:29,213 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-18000.pt
2024-08-06 05:44:32,247 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 05:44:38,023 INFO [trainer.py:811] (0/8) Epoch 15, validation: loss=2.913, ArTop10Accuracy=0.7339, over 1829298.00 frames.
2024-08-06 05:44:38,024 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB
2024-08-06 05:44:38,413 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.307e+02 1.417e+02 1.528e+02 2.981e+02, threshold=2.833e+02, percent-clipped=0.1
2024-08-06 05:45:20,184 INFO [trainer.py:765] (0/8) Epoch 15, batch 200, train_loss[loss=2.777, ArTop10Accuracy=0.7586, over 13661.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7436, over 7799.16 frames. ], batch size: 34, lr: 8.11e-03
2024-08-06 05:46:04,647 INFO [trainer.py:765] (0/8) Epoch 15, batch 300, train_loss[loss=2.952, ArTop10Accuracy=0.7216, over 14181.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7438, over 9411.69 frames. ], batch size: 44, lr: 8.09e-03
2024-08-06 05:46:51,902 INFO [trainer.py:765] (0/8) Epoch 15, batch 400, train_loss[loss=2.744, ArTop10Accuracy=0.7581, over 11069.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7435, over 10330.54 frames. ], batch size: 15, lr: 8.07e-03
2024-08-06 05:47:36,911 INFO [trainer.py:765] (0/8) Epoch 15, batch 500, train_loss[loss=2.844, ArTop10Accuracy=0.7437, over 12380.00 frames. ], tot_loss[loss=2.839, ArTop10Accuracy=0.7444, over 10912.75 frames. ], batch size: 22, lr: 8.05e-03
2024-08-06 05:48:24,723 INFO [trainer.py:765] (0/8) Epoch 15, batch 600, train_loss[loss=2.821, ArTop10Accuracy=0.7484, over 11770.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7425, over 11446.31 frames. ], batch size: 18, lr: 8.03e-03
2024-08-06 05:49:11,855 INFO [trainer.py:765] (0/8) Epoch 15, batch 700, train_loss[loss=2.719, ArTop10Accuracy=0.7657, over 9994.00 frames. ], tot_loss[loss=2.851, ArTop10Accuracy=0.7416, over 11574.34 frames. ], batch size: 12, lr: 8.01e-03
2024-08-06 05:49:45,778 INFO [trainer.py:765] (0/8) Epoch 15, batch 800, train_loss[loss=2.705, ArTop10Accuracy=0.7624, over 10154.00 frames. ], tot_loss[loss=2.861, ArTop10Accuracy=0.7397, over 11692.18 frames. ], batch size: 12, lr: 7.99e-03
2024-08-06 05:50:17,210 INFO [trainer.py:765] (0/8) Epoch 15, batch 900, train_loss[loss=2.784, ArTop10Accuracy=0.7587, over 12947.00 frames. ], tot_loss[loss=2.85, ArTop10Accuracy=0.7417, over 11739.90 frames. ], batch size: 27, lr: 7.97e-03
2024-08-06 05:50:48,829 INFO [trainer.py:765] (0/8) Epoch 15, batch 1000, train_loss[loss=2.884, ArTop10Accuracy=0.7331, over 13346.00 frames. ], tot_loss[loss=2.858, ArTop10Accuracy=0.7404, over 11938.83 frames. ], batch size: 28, lr: 7.95e-03
2024-08-06 05:51:20,069 INFO [trainer.py:765] (0/8) Epoch 15, batch 1100, train_loss[loss=2.742, ArTop10Accuracy=0.7603, over 13562.00 frames. ], tot_loss[loss=2.867, ArTop10Accuracy=0.7386, over 11992.08 frames. ], batch size: 34, lr: 7.93e-03
2024-08-06 05:51:20,660 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-19000.pt
2024-08-06 05:51:23,514 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.123e+02 1.337e+02 1.431e+02 1.541e+02 2.784e+02, threshold=2.862e+02, percent-clipped=0.0
2024-08-06 05:51:53,082 INFO [trainer.py:765] (0/8) Epoch 15, batch 1200, train_loss[loss=2.987, ArTop10Accuracy=0.7103, over 12660.00 frames. ], tot_loss[loss=2.871, ArTop10Accuracy=0.7375, over 11952.32 frames. ], batch size: 97, lr: 7.91e-03
2024-08-06 05:52:18,231 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 05:52:18,235 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-15.pt
2024-08-06 05:53:29,263 INFO [trainer.py:765] (0/8) Epoch 16, batch 100, train_loss[loss=2.786, ArTop10Accuracy=0.7581, over 14670.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7457, over 4787.17 frames. ], batch size: 62, lr: 7.63e-03
2024-08-06 05:54:12,878 INFO [trainer.py:765] (0/8) Epoch 16, batch 200, train_loss[loss=2.788, ArTop10Accuracy=0.7446, over 13685.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.746, over 7782.81 frames. ], batch size: 34, lr: 7.61e-03
2024-08-06 05:54:59,737 INFO [trainer.py:765] (0/8) Epoch 16, batch 300, train_loss[loss=2.858, ArTop10Accuracy=0.7479, over 14147.00 frames. ], tot_loss[loss=2.836, ArTop10Accuracy=0.7455, over 9396.12 frames. ], batch size: 44, lr: 7.59e-03
2024-08-06 05:55:41,931 INFO [trainer.py:765] (0/8) Epoch 16, batch 400, train_loss[loss=2.62, ArTop10Accuracy=0.7767, over 11460.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7468, over 10327.04 frames. ], batch size: 16, lr: 7.58e-03
2024-08-06 05:56:27,680 INFO [trainer.py:765] (0/8) Epoch 16, batch 500, train_loss[loss=2.753, ArTop10Accuracy=0.7555, over 12118.00 frames. ], tot_loss[loss=2.828, ArTop10Accuracy=0.7463, over 10902.66 frames. ], batch size: 22, lr: 7.56e-03
2024-08-06 05:57:12,440 INFO [trainer.py:765] (0/8) Epoch 16, batch 600, train_loss[loss=2.775, ArTop10Accuracy=0.7635, over 11534.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7452, over 11429.74 frames. ], batch size: 18, lr: 7.54e-03
2024-08-06 05:58:00,040 INFO [trainer.py:765] (0/8) Epoch 16, batch 700, train_loss[loss=2.576, ArTop10Accuracy=0.784, over 9930.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7444, over 11584.59 frames. ], batch size: 12, lr: 7.52e-03
2024-08-06 05:58:34,024 INFO [trainer.py:765] (0/8) Epoch 16, batch 800, train_loss[loss=2.774, ArTop10Accuracy=0.7459, over 10149.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7424, over 11688.61 frames. ], batch size: 12, lr: 7.50e-03
2024-08-06 05:58:41,568 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-20000.pt
2024-08-06 05:58:44,584 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 05:58:50,426 INFO [trainer.py:811] (0/8) Epoch 16, validation: loss=2.915, ArTop10Accuracy=0.7338, over 1829298.00 frames.
2024-08-06 05:58:50,426 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB
2024-08-06 05:58:50,730 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.121e+02 1.335e+02 1.445e+02 1.570e+02 3.252e+02, threshold=2.890e+02, percent-clipped=0.1
2024-08-06 05:59:14,321 INFO [trainer.py:765] (0/8) Epoch 16, batch 900, train_loss[loss=2.881, ArTop10Accuracy=0.7322, over 12900.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7434, over 11731.10 frames. ], batch size: 27, lr: 7.49e-03
2024-08-06 05:59:45,915 INFO [trainer.py:765] (0/8) Epoch 16, batch 1000, train_loss[loss=2.944, ArTop10Accuracy=0.7268, over 12922.00 frames. ], tot_loss[loss=2.852, ArTop10Accuracy=0.7421, over 11938.72 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 06:00:17,091 INFO [trainer.py:765] (0/8) Epoch 16, batch 1100, train_loss[loss=2.844, ArTop10Accuracy=0.7466, over 13655.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7411, over 11990.87 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 06:00:47,464 INFO [trainer.py:765] (0/8) Epoch 16, batch 1200, train_loss[loss=3.016, ArTop10Accuracy=0.7036, over 12054.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7408, over 11926.29 frames. ], batch size: 98, lr: 7.43e-03
2024-08-06 06:01:13,238 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 06:01:13,241 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-16.pt
2024-08-06 06:02:27,260 INFO [trainer.py:765] (0/8) Epoch 17, batch 100, train_loss[loss=2.87, ArTop10Accuracy=0.7406, over 14582.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7477, over 4775.99 frames. ], batch size: 61, lr: 7.18e-03
2024-08-06 06:03:11,850 INFO [trainer.py:765] (0/8) Epoch 17, batch 200, train_loss[loss=2.891, ArTop10Accuracy=0.7364, over 13562.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7477, over 7777.46 frames. ], batch size: 34, lr: 7.17e-03
2024-08-06 06:03:57,502 INFO [trainer.py:765] (0/8) Epoch 17, batch 300, train_loss[loss=2.914, ArTop10Accuracy=0.7359, over 14401.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7486, over 9401.38 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 06:04:42,838 INFO [trainer.py:765] (0/8) Epoch 17, batch 400, train_loss[loss=2.705, ArTop10Accuracy=0.7711, over 10744.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7486, over 10325.29 frames. ], batch size: 15, lr: 7.13e-03
2024-08-06 06:05:29,004 INFO [trainer.py:765] (0/8) Epoch 17, batch 500, train_loss[loss=2.898, ArTop10Accuracy=0.7378, over 12485.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7484, over 10884.29 frames. ], batch size: 22, lr: 7.12e-03
2024-08-06 06:05:45,317 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-21000.pt
2024-08-06 06:05:49,550 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.142e+02 1.359e+02 1.445e+02 1.551e+02 2.741e+02, threshold=2.891e+02, percent-clipped=0.0
2024-08-06 06:06:20,723 INFO [trainer.py:765] (0/8) Epoch 17, batch 600, train_loss[loss=2.821, ArTop10Accuracy=0.7406, over 11623.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7471, over 11411.79 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 06:07:04,694 INFO [trainer.py:765] (0/8) Epoch 17, batch 700, train_loss[loss=2.79, ArTop10Accuracy=0.7534, over 10259.00 frames. ], tot_loss[loss=2.832, ArTop10Accuracy=0.7458, over 11559.16 frames. ], batch size: 12, lr: 7.09e-03
2024-08-06 06:07:44,896 INFO [trainer.py:765] (0/8) Epoch 17, batch 800, train_loss[loss=2.866, ArTop10Accuracy=0.7368, over 9507.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7445, over 11664.29 frames. ], batch size: 11, lr: 7.07e-03
2024-08-06 06:08:16,384 INFO [trainer.py:765] (0/8) Epoch 17, batch 900, train_loss[loss=2.853, ArTop10Accuracy=0.7414, over 13042.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7463, over 11738.31 frames. ], batch size: 27, lr: 7.05e-03
2024-08-06 06:08:47,994 INFO [trainer.py:765] (0/8) Epoch 17, batch 1000, train_loss[loss=2.792, ArTop10Accuracy=0.7535, over 12989.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.7452, over 11937.82 frames. ], batch size: 27, lr: 7.04e-03
2024-08-06 06:09:19,134 INFO [trainer.py:765] (0/8) Epoch 17, batch 1100, train_loss[loss=2.841, ArTop10Accuracy=0.7416, over 13771.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7425, over 11988.67 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 06:09:49,444 INFO [trainer.py:765] (0/8) Epoch 17, batch 1200, train_loss[loss=2.999, ArTop10Accuracy=0.7195, over 12179.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7425, over 11931.79 frames. ], batch size: 99, lr: 7.01e-03
2024-08-06 06:10:14,221 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 06:10:14,224 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-17.pt
2024-08-06 06:11:23,102 INFO [trainer.py:765] (0/8) Epoch 18, batch 100, train_loss[loss=2.853, ArTop10Accuracy=0.7415, over 14643.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.7502, over 4796.36 frames. ], batch size: 61, lr: 6.78e-03
2024-08-06 06:12:16,260 INFO [trainer.py:765] (0/8) Epoch 18, batch 200, train_loss[loss=2.882, ArTop10Accuracy=0.7403, over 13651.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.7505, over 7788.10 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 06:12:40,318 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-22000.pt
2024-08-06 06:12:43,327 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 06:12:48,991 INFO [trainer.py:811] (0/8) Epoch 18, validation: loss=2.916, ArTop10Accuracy=0.7343, over 1829298.00 frames.
2024-08-06 06:12:48,992 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30166MB
2024-08-06 06:12:49,335 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.163e+02 1.377e+02 1.476e+02 1.588e+02 2.450e+02, threshold=2.952e+02, percent-clipped=0.0
2024-08-06 06:13:07,116 INFO [trainer.py:765] (0/8) Epoch 18, batch 300, train_loss[loss=2.888, ArTop10Accuracy=0.7327, over 14308.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.7513, over 9413.99 frames. ], batch size: 44, lr: 6.75e-03
2024-08-06 06:13:54,098 INFO [trainer.py:765] (0/8) Epoch 18, batch 400, train_loss[loss=2.751, ArTop10Accuracy=0.7596, over 10282.00 frames. ], tot_loss[loss=2.812, ArTop10Accuracy=0.75, over 10327.09 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 06:14:38,488 INFO [trainer.py:765] (0/8) Epoch 18, batch 500, train_loss[loss=2.848, ArTop10Accuracy=0.744, over 12247.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7498, over 10876.52 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 06:15:23,628 INFO [trainer.py:765] (0/8) Epoch 18, batch 600, train_loss[loss=2.561, ArTop10Accuracy=0.7949, over 11489.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7481, over 11411.90 frames. ], batch size: 18, lr: 6.71e-03
2024-08-06 06:16:17,342 INFO [trainer.py:765] (0/8) Epoch 18, batch 700, train_loss[loss=2.7, ArTop10Accuracy=0.7757, over 10169.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7477, over 11561.13 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 06:16:51,428 INFO [trainer.py:765] (0/8) Epoch 18, batch 800, train_loss[loss=2.759, ArTop10Accuracy=0.7525, over 10039.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7461, over 11679.50 frames. ], batch size: 12, lr: 6.68e-03
2024-08-06 06:17:22,912 INFO [trainer.py:765] (0/8) Epoch 18, batch 900, train_loss[loss=2.808, ArTop10Accuracy=0.7512, over 12826.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7476, over 11727.26 frames. ], batch size: 27, lr: 6.67e-03
2024-08-06 06:17:54,529 INFO [trainer.py:765] (0/8) Epoch 18, batch 1000, train_loss[loss=2.921, ArTop10Accuracy=0.7333, over 12980.00 frames. ], tot_loss[loss=2.83, ArTop10Accuracy=0.7462, over 11940.48 frames. ], batch size: 27, lr: 6.65e-03
2024-08-06 06:18:25,663 INFO [trainer.py:765] (0/8) Epoch 18, batch 1100, train_loss[loss=2.845, ArTop10Accuracy=0.7458, over 13691.00 frames. ], tot_loss[loss=2.839, ArTop10Accuracy=0.7444, over 11993.81 frames. ], batch size: 34, lr: 6.64e-03
2024-08-06 06:18:55,971 INFO [trainer.py:765] (0/8) Epoch 18, batch 1200, train_loss[loss=2.972, ArTop10Accuracy=0.7211, over 12298.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.7449, over 11951.69 frames. ], batch size: 97, lr: 6.63e-03
2024-08-06 06:19:16,340 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-23000.pt
2024-08-06 06:19:19,163 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.178e+02 1.387e+02 1.492e+02 1.607e+02 2.982e+02, threshold=2.983e+02, percent-clipped=0.1
2024-08-06 06:19:23,631 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 06:19:23,635 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-18.pt
2024-08-06 06:20:29,728 INFO [trainer.py:765] (0/8) Epoch 19, batch 100, train_loss[loss=2.852, ArTop10Accuracy=0.7432, over 14291.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7494, over 4776.18 frames. ], batch size: 61, lr: 6.43e-03
2024-08-06 06:21:11,275 INFO [trainer.py:765] (0/8) Epoch 19, batch 200, train_loss[loss=2.814, ArTop10Accuracy=0.7541, over 13811.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7521, over 7782.53 frames. ], batch size: 34, lr: 6.41e-03
2024-08-06 06:21:56,079 INFO [trainer.py:765] (0/8) Epoch 19, batch 300, train_loss[loss=2.842, ArTop10Accuracy=0.7431, over 14254.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7518, over 9431.80 frames. ], batch size: 44, lr: 6.40e-03
2024-08-06 06:22:36,013 INFO [trainer.py:765] (0/8) Epoch 19, batch 400, train_loss[loss=2.606, ArTop10Accuracy=0.7849, over 11002.00 frames. ], tot_loss[loss=2.801, ArTop10Accuracy=0.7524, over 10338.69 frames. ], batch size: 15, lr: 6.39e-03
2024-08-06 06:23:18,998 INFO [trainer.py:765] (0/8) Epoch 19, batch 500, train_loss[loss=2.852, ArTop10Accuracy=0.7431, over 12222.00 frames. ], tot_loss[loss=2.798, ArTop10Accuracy=0.7529, over 10889.96 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 06:24:03,685 INFO [trainer.py:765] (0/8) Epoch 19, batch 600, train_loss[loss=2.907, ArTop10Accuracy=0.73, over 11542.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7514, over 11420.59 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 06:24:46,186 INFO [trainer.py:765] (0/8) Epoch 19, batch 700, train_loss[loss=2.787, ArTop10Accuracy=0.758, over 10189.00 frames. ], tot_loss[loss=2.808, ArTop10Accuracy=0.7504, over 11562.39 frames. ], batch size: 12, lr: 6.35e-03
2024-08-06 06:25:22,355 INFO [trainer.py:765] (0/8) Epoch 19, batch 800, train_loss[loss=2.902, ArTop10Accuracy=0.739, over 9174.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7488, over 11675.84 frames. ], batch size: 11, lr: 6.33e-03
2024-08-06 06:25:53,625 INFO [trainer.py:765] (0/8) Epoch 19, batch 900, train_loss[loss=2.827, ArTop10Accuracy=0.7457, over 12958.00 frames. ], tot_loss[loss=2.816, ArTop10Accuracy=0.7489, over 11720.74 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 06:26:21,773 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-24000.pt
2024-08-06 06:26:24,787 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 06:26:30,765 INFO [trainer.py:811] (0/8) Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames.
2024-08-06 06:26:30,765 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 33183MB
2024-08-06 06:26:31,053 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.198e+02 1.416e+02 1.525e+02 1.662e+02 2.849e+02, threshold=3.050e+02, percent-clipped=0.0
2024-08-06 06:26:34,030 INFO [trainer.py:765] (0/8) Epoch 19, batch 1000, train_loss[loss=2.748, ArTop10Accuracy=0.7586, over 13149.00 frames. ], tot_loss[loss=2.82, ArTop10Accuracy=0.7478, over 11942.62 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 06:27:05,189 INFO [trainer.py:765] (0/8) Epoch 19, batch 1100, train_loss[loss=2.835, ArTop10Accuracy=0.7438, over 13647.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7467, over 12013.00 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 06:27:35,453 INFO [trainer.py:765] (0/8) Epoch 19, batch 1200, train_loss[loss=3.08, ArTop10Accuracy=0.7029, over 11853.00 frames. ], tot_loss[loss=2.826, ArTop10Accuracy=0.7469, over 11931.73 frames. ], batch size: 99, lr: 6.28e-03
2024-08-06 06:28:00,683 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 06:28:00,687 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-19.pt
2024-08-06 06:29:08,985 INFO [trainer.py:765] (0/8) Epoch 20, batch 100, train_loss[loss=2.85, ArTop10Accuracy=0.7455, over 14495.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7549, over 4779.37 frames. ], batch size: 61, lr: 6.10e-03
2024-08-06 06:29:50,318 INFO [trainer.py:765] (0/8) Epoch 20, batch 200, train_loss[loss=2.777, ArTop10Accuracy=0.762, over 13551.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.7547, over 7790.64 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 06:30:37,106 INFO [trainer.py:765] (0/8) Epoch 20, batch 300, train_loss[loss=2.88, ArTop10Accuracy=0.7345, over 14548.00 frames. ], tot_loss[loss=2.786, ArTop10Accuracy=0.755, over 9412.14 frames. ], batch size: 45, lr: 6.08e-03
2024-08-06 06:31:16,353 INFO [trainer.py:765] (0/8) Epoch 20, batch 400, train_loss[loss=2.728, ArTop10Accuracy=0.7665, over 10879.00 frames. ], tot_loss[loss=2.785, ArTop10Accuracy=0.7552, over 10334.05 frames. ], batch size: 15, lr: 6.07e-03
2024-08-06 06:32:03,759 INFO [trainer.py:765] (0/8) Epoch 20, batch 500, train_loss[loss=2.734, ArTop10Accuracy=0.7637, over 12180.00 frames. ], tot_loss[loss=2.784, ArTop10Accuracy=0.755, over 10892.48 frames. ], batch size: 22, lr: 6.05e-03
2024-08-06 06:32:43,357 INFO [trainer.py:765] (0/8) Epoch 20, batch 600, train_loss[loss=2.778, ArTop10Accuracy=0.7586, over 11337.00 frames. ], tot_loss[loss=2.787, ArTop10Accuracy=0.754, over 11419.54 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 06:33:36,751 INFO [trainer.py:765] (0/8) Epoch 20, batch 700, train_loss[loss=2.755, ArTop10Accuracy=0.7634, over 10040.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7522, over 11566.56 frames. ], batch size: 12, lr: 6.03e-03
2024-08-06 06:33:40,691 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-25000.pt
2024-08-06 06:33:43,829 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.196e+02 1.417e+02 1.526e+02 1.639e+02 3.791e+02, threshold=3.052e+02, percent-clipped=0.1
2024-08-06 06:34:13,304 INFO [trainer.py:765] (0/8) Epoch 20, batch 800, train_loss[loss=2.725, ArTop10Accuracy=0.7659, over 10180.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7513, over 11700.04 frames. ], batch size: 12, lr: 6.02e-03
2024-08-06 06:34:44,580 INFO [trainer.py:765] (0/8) Epoch 20, batch 900, train_loss[loss=2.898, ArTop10Accuracy=0.739, over 13040.00 frames. ], tot_loss[loss=2.798, ArTop10Accuracy=0.7525, over 11747.01 frames. ], batch size: 27, lr: 6.01e-03
2024-08-06 06:35:16,138 INFO [trainer.py:765] (0/8) Epoch 20, batch 1000, train_loss[loss=2.837, ArTop10Accuracy=0.742, over 13147.00 frames. ], tot_loss[loss=2.801, ArTop10Accuracy=0.7517, over 11942.69 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 06:35:47,214 INFO [trainer.py:765] (0/8) Epoch 20, batch 1100, train_loss[loss=2.8, ArTop10Accuracy=0.7521, over 13935.00 frames. ], tot_loss[loss=2.811, ArTop10Accuracy=0.7498, over 12009.72 frames. ], batch size: 34, lr: 5.99e-03
2024-08-06 06:36:17,438 INFO [trainer.py:765] (0/8) Epoch 20, batch 1200, train_loss[loss=2.939, ArTop10Accuracy=0.7217, over 12322.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.749, over 11955.80 frames. ], batch size: 97, lr: 5.97e-03
2024-08-06 06:36:42,678 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 06:36:42,681 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-20.pt
2024-08-06 06:36:48,334 INFO [trainer.py:1069] (0/8) Done!