2024-08-06 08:06:14,316 INFO [trainer.py:870] (0/8) Training started
2024-08-06 08:06:14,320 INFO [trainer.py:889] (0/8) Device: cuda:0
2024-08-06 08:06:14,320 INFO [trainer.py:890] (0/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': None, 'icefall-git-sha1': None, 'icefall-git-date': None, 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6867463', 'IP address': '0.104.202.7'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 20000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 08:06:14,320 INFO [trainer.py:892] (0/8) About to create model
2024-08-06 08:06:15,058 INFO [trainer.py:899] (0/8) Number of model parameters: 367386628
2024-08-06 08:06:16,197 INFO [trainer.py:914] (0/8) Using DDP
2024-08-06 08:06:19,148 INFO [datamodule.py:427] (0/8) About to get train cuts
2024-08-06 08:06:19,149 INFO [datamodule.py:434] (0/8) About to get dev cuts
2024-08-06 08:06:19,151 INFO [datamodule.py:292] (0/8) Disable SpecAugment
2024-08-06 08:06:19,151 INFO [datamodule.py:294] (0/8) About to create train dataset
2024-08-06 08:06:19,152 INFO [datamodule.py:323] (0/8) Using DynamicBucketingSampler
2024-08-06 08:06:19,772 INFO [datamodule.py:344] (0/8) About to create train dataloader
2024-08-06 08:06:19,772 INFO [datamodule.py:367] (0/8) About to create dev dataset
2024-08-06 08:06:20,101 INFO [datamodule.py:388] (0/8) About to create dev dataloader
2024-08-06 08:08:02,122 INFO [trainer.py:765] (0/8) Epoch 1, batch 100, train_loss[loss=4.278, ArTop10Accuracy=0.5092, over 14148.00 frames. ], tot_loss[loss=5.044, ArTop10Accuracy=0.3756, over 4763.57 frames. ], batch size: 62, lr: 2.25e-02
2024-08-06 08:09:28,829 INFO [trainer.py:765] (0/8) Epoch 1, batch 200, train_loss[loss=4.012, ArTop10Accuracy=0.5509, over 13728.00 frames. ], tot_loss[loss=4.485, ArTop10Accuracy=0.4688, over 7742.94 frames. ], batch size: 34, lr: 3.00e-02
2024-08-06 08:10:52,430 INFO [trainer.py:765] (0/8) Epoch 1, batch 300, train_loss[loss=3.906, ArTop10Accuracy=0.5625, over 14160.00 frames. ], tot_loss[loss=4.212, ArTop10Accuracy=0.5139, over 9378.07 frames. ], batch size: 44, lr: 3.00e-02
2024-08-06 08:12:12,698 INFO [trainer.py:765] (0/8) Epoch 1, batch 400, train_loss[loss=3.687, ArTop10Accuracy=0.6066, over 10701.00 frames. ], tot_loss[loss=4.03, ArTop10Accuracy=0.5447, over 10279.72 frames. ], batch size: 15, lr: 3.00e-02
2024-08-06 08:13:40,049 INFO [trainer.py:765] (0/8) Epoch 1, batch 500, train_loss[loss=3.62, ArTop10Accuracy=0.6184, over 12078.00 frames. ], tot_loss[loss=3.882, ArTop10Accuracy=0.5706, over 10862.69 frames. ], batch size: 22, lr: 2.99e-02
2024-08-06 08:15:00,242 INFO [trainer.py:765] (0/8) Epoch 1, batch 600, train_loss[loss=3.472, ArTop10Accuracy=0.6423, over 11520.00 frames. ], tot_loss[loss=3.767, ArTop10Accuracy=0.5908, over 11393.62 frames. ], batch size: 18, lr: 2.99e-02
2024-08-06 08:16:26,423 INFO [trainer.py:765] (0/8) Epoch 1, batch 700, train_loss[loss=3.358, ArTop10Accuracy=0.672, over 10137.00 frames. ], tot_loss[loss=3.689, ArTop10Accuracy=0.6048, over 11532.89 frames. ], batch size: 12, lr: 2.99e-02
2024-08-06 08:17:43,016 INFO [trainer.py:765] (0/8) Epoch 1, batch 800, train_loss[loss=3.327, ArTop10Accuracy=0.6665, over 10185.00 frames. ], tot_loss[loss=3.625, ArTop10Accuracy=0.6163, over 11642.41 frames. ], batch size: 12, lr: 2.98e-02
2024-08-06 08:18:56,150 INFO [trainer.py:765] (0/8) Epoch 1, batch 900, train_loss[loss=3.441, ArTop10Accuracy=0.6465, over 12888.00 frames. ], tot_loss[loss=3.567, ArTop10Accuracy=0.6273, over 11696.36 frames. ], batch size: 27, lr: 2.98e-02
2024-08-06 08:20:12,861 INFO [trainer.py:765] (0/8) Epoch 1, batch 1000, train_loss[loss=3.339, ArTop10Accuracy=0.6726, over 13041.00 frames. ], tot_loss[loss=3.523, ArTop10Accuracy=0.6352, over 11871.89 frames. ], batch size: 27, lr: 2.97e-02
2024-08-06 08:20:13,538 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 9.300e+01 1.871e+02 2.675e+02 4.030e+02 9.119e+03, threshold=5.351e+02, percent-clipped=0.0
2024-08-06 08:21:29,154 INFO [trainer.py:765] (0/8) Epoch 1, batch 1100, train_loss[loss=3.447, ArTop10Accuracy=0.6526, over 13650.00 frames. ], tot_loss[loss=3.488, ArTop10Accuracy=0.6416, over 11935.79 frames. ], batch size: 34, lr: 2.96e-02
2024-08-06 08:22:45,413 INFO [trainer.py:765] (0/8) Epoch 1, batch 1200, train_loss[loss=3.399, ArTop10Accuracy=0.6582, over 12078.00 frames. ], tot_loss[loss=3.456, ArTop10Accuracy=0.6476, over 11841.59 frames. ], batch size: 101, lr: 2.96e-02
2024-08-06 08:23:45,268 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 08:23:45,272 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-1.pt
2024-08-06 08:25:36,238 INFO [trainer.py:765] (0/8) Epoch 2, batch 100, train_loss[loss=3.464, ArTop10Accuracy=0.6425, over 14586.00 frames. ], tot_loss[loss=3.422, ArTop10Accuracy=0.6529, over 4763.48 frames. ], batch size: 62, lr: 2.90e-02
2024-08-06 08:26:58,957 INFO [trainer.py:765] (0/8) Epoch 2, batch 200, train_loss[loss=3.368, ArTop10Accuracy=0.6675, over 13398.00 frames. ], tot_loss[loss=3.39, ArTop10Accuracy=0.6587, over 7742.95 frames. ], batch size: 34, lr: 2.89e-02
2024-08-06 08:28:25,534 INFO [trainer.py:765] (0/8) Epoch 2, batch 300, train_loss[loss=3.316, ArTop10Accuracy=0.6735, over 14268.00 frames. ], tot_loss[loss=3.369, ArTop10Accuracy=0.6629, over 9373.85 frames. ], batch size: 44, lr: 2.89e-02
2024-08-06 08:29:48,638 INFO [trainer.py:765] (0/8) Epoch 2, batch 400, train_loss[loss=3.324, ArTop10Accuracy=0.6711, over 10158.00 frames. ], tot_loss[loss=3.357, ArTop10Accuracy=0.6656, over 10297.01 frames. ], batch size: 14, lr: 2.88e-02
2024-08-06 08:31:22,899 INFO [trainer.py:765] (0/8) Epoch 2, batch 500, train_loss[loss=3.281, ArTop10Accuracy=0.6779, over 12660.00 frames. ], tot_loss[loss=3.346, ArTop10Accuracy=0.6677, over 10864.71 frames. ], batch size: 23, lr: 2.87e-02
2024-08-06 08:32:45,689 INFO [trainer.py:765] (0/8) Epoch 2, batch 600, train_loss[loss=3.325, ArTop10Accuracy=0.6746, over 11358.00 frames. ], tot_loss[loss=3.334, ArTop10Accuracy=0.6699, over 11387.56 frames. ], batch size: 18, lr: 2.86e-02
2024-08-06 08:34:13,583 INFO [trainer.py:765] (0/8) Epoch 2, batch 700, train_loss[loss=3.228, ArTop10Accuracy=0.6954, over 10290.00 frames. ], tot_loss[loss=3.325, ArTop10Accuracy=0.6716, over 11532.16 frames. ], batch size: 12, lr: 2.85e-02
2024-08-06 08:34:31,174 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 08:34:40,887 INFO [trainer.py:811] (0/8) Epoch 2, validation: loss=3.277, ArTop10Accuracy=0.6803, over 1827537.00 frames.
2024-08-06 08:34:40,888 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28892MB
2024-08-06 08:34:41,700 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 7.953e+01 1.592e+02 2.200e+02 3.344e+02 2.949e+03, threshold=4.400e+02, percent-clipped=8.6
2024-08-06 08:35:39,879 INFO [trainer.py:765] (0/8) Epoch 2, batch 800, train_loss[loss=3.349, ArTop10Accuracy=0.6662, over 10188.00 frames. ], tot_loss[loss=3.321, ArTop10Accuracy=0.6725, over 11635.67 frames. ], batch size: 12, lr: 2.84e-02
2024-08-06 08:36:56,371 INFO [trainer.py:765] (0/8) Epoch 2, batch 900, train_loss[loss=3.325, ArTop10Accuracy=0.6663, over 12789.00 frames. ], tot_loss[loss=3.308, ArTop10Accuracy=0.6752, over 11673.27 frames. ], batch size: 27, lr: 2.83e-02
2024-08-06 08:38:10,512 INFO [trainer.py:765] (0/8) Epoch 2, batch 1000, train_loss[loss=3.315, ArTop10Accuracy=0.6742, over 12846.00 frames. ], tot_loss[loss=3.298, ArTop10Accuracy=0.677, over 11874.71 frames. ], batch size: 27, lr: 2.82e-02
2024-08-06 08:39:25,060 INFO [trainer.py:765] (0/8) Epoch 2, batch 1100, train_loss[loss=3.25, ArTop10Accuracy=0.6829, over 13677.00 frames. ], tot_loss[loss=3.29, ArTop10Accuracy=0.6784, over 11958.14 frames. ], batch size: 34, lr: 2.81e-02
2024-08-06 08:40:38,220 INFO [trainer.py:765] (0/8) Epoch 2, batch 1200, train_loss[loss=3.32, ArTop10Accuracy=0.6759, over 12444.00 frames. ], tot_loss[loss=3.281, ArTop10Accuracy=0.6801, over 11872.51 frames. ], batch size: 103, lr: 2.80e-02
2024-08-06 08:41:38,460 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 08:41:38,463 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-2.pt
2024-08-06 08:43:36,651 INFO [trainer.py:765] (0/8) Epoch 3, batch 100, train_loss[loss=3.221, ArTop10Accuracy=0.6956, over 14391.00 frames. ], tot_loss[loss=3.251, ArTop10Accuracy=0.6852, over 4763.04 frames. ], batch size: 63, lr: 2.67e-02
2024-08-06 08:45:10,502 INFO [trainer.py:765] (0/8) Epoch 3, batch 200, train_loss[loss=3.149, ArTop10Accuracy=0.7102, over 13716.00 frames.
], tot_loss[loss=3.222, ArTop10Accuracy=0.6905, over 7746.96 frames. ], batch size: 34, lr: 2.66e-02 2024-08-06 08:46:29,258 INFO [trainer.py:765] (0/8) Epoch 3, batch 300, train_loss[loss=3.237, ArTop10Accuracy=0.6852, over 14136.00 frames. ], tot_loss[loss=3.205, ArTop10Accuracy=0.6942, over 9365.48 frames. ], batch size: 44, lr: 2.64e-02 2024-08-06 08:48:04,219 INFO [trainer.py:765] (0/8) Epoch 3, batch 400, train_loss[loss=3.123, ArTop10Accuracy=0.7122, over 10929.00 frames. ], tot_loss[loss=3.19, ArTop10Accuracy=0.6973, over 10272.52 frames. ], batch size: 15, lr: 2.63e-02 2024-08-06 08:48:40,881 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 9.282e+01 1.561e+02 1.981e+02 2.686e+02 1.768e+03, threshold=3.962e+02, percent-clipped=7.6 2024-08-06 08:49:25,542 INFO [trainer.py:765] (0/8) Epoch 3, batch 500, train_loss[loss=3.102, ArTop10Accuracy=0.715, over 12600.00 frames. ], tot_loss[loss=3.169, ArTop10Accuracy=0.7016, over 10831.61 frames. ], batch size: 23, lr: 2.62e-02 2024-08-06 08:51:00,477 INFO [trainer.py:765] (0/8) Epoch 3, batch 600, train_loss[loss=3.068, ArTop10Accuracy=0.7223, over 11331.00 frames. ], tot_loss[loss=3.154, ArTop10Accuracy=0.7042, over 11373.76 frames. ], batch size: 18, lr: 2.61e-02 2024-08-06 08:52:31,618 INFO [trainer.py:765] (0/8) Epoch 3, batch 700, train_loss[loss=3.132, ArTop10Accuracy=0.7058, over 9480.00 frames. ], tot_loss[loss=3.145, ArTop10Accuracy=0.7061, over 11509.58 frames. ], batch size: 11, lr: 2.60e-02 2024-08-06 08:53:57,388 INFO [trainer.py:765] (0/8) Epoch 3, batch 800, train_loss[loss=3.117, ArTop10Accuracy=0.7134, over 9261.00 frames. ], tot_loss[loss=3.138, ArTop10Accuracy=0.7073, over 11647.27 frames. ], batch size: 11, lr: 2.59e-02 2024-08-06 08:55:15,118 INFO [trainer.py:765] (0/8) Epoch 3, batch 900, train_loss[loss=3.036, ArTop10Accuracy=0.7285, over 12813.00 frames. ], tot_loss[loss=3.12, ArTop10Accuracy=0.7107, over 11687.90 frames. ], batch size: 27, lr: 2.57e-02 2024-08-06 08:56:31,558 INFO [trainer.py:765] (0/8) Epoch 3, batch 1000, train_loss[loss=3.046, ArTop10Accuracy=0.725, over 13272.00 frames. ], tot_loss[loss=3.111, ArTop10Accuracy=0.7124, over 11871.56 frames. ], batch size: 28, lr: 2.56e-02 2024-08-06 08:57:46,506 INFO [trainer.py:765] (0/8) Epoch 3, batch 1100, train_loss[loss=2.998, ArTop10Accuracy=0.7314, over 13731.00 frames. ], tot_loss[loss=3.104, ArTop10Accuracy=0.7135, over 11943.01 frames. ], batch size: 34, lr: 2.55e-02 2024-08-06 08:59:01,400 INFO [trainer.py:765] (0/8) Epoch 3, batch 1200, train_loss[loss=3.119, ArTop10Accuracy=0.7097, over 12366.00 frames. ], tot_loss[loss=3.097, ArTop10Accuracy=0.7148, over 11874.89 frames. ], batch size: 101, lr: 2.54e-02 2024-08-06 09:00:02,053 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 09:00:02,056 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-3.pt 2024-08-06 09:01:50,742 INFO [trainer.py:765] (0/8) Epoch 4, batch 100, train_loss[loss=3.096, ArTop10Accuracy=0.7157, over 14289.00 frames. ], tot_loss[loss=3.065, ArTop10Accuracy=0.72, over 4767.49 frames. ], batch size: 63, lr: 2.38e-02 2024-08-06 09:02:52,858 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 09:03:02,384 INFO [trainer.py:811] (0/8) Epoch 4, validation: loss=2.997, ArTop10Accuracy=0.7338, over 1827537.00 frames. 
2024-08-06 09:03:02,384 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29208MB 2024-08-06 09:03:03,364 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.072e+02 1.499e+02 1.782e+02 2.273e+02 1.100e+03, threshold=3.565e+02, percent-clipped=4.7 2024-08-06 09:03:29,274 INFO [trainer.py:765] (0/8) Epoch 4, batch 200, train_loss[loss=3.021, ArTop10Accuracy=0.7286, over 13689.00 frames. ], tot_loss[loss=3.042, ArTop10Accuracy=0.7243, over 7737.91 frames. ], batch size: 34, lr: 2.37e-02 2024-08-06 09:05:01,733 INFO [trainer.py:765] (0/8) Epoch 4, batch 300, train_loss[loss=3.066, ArTop10Accuracy=0.7201, over 14166.00 frames. ], tot_loss[loss=3.037, ArTop10Accuracy=0.7256, over 9341.46 frames. ], batch size: 45, lr: 2.36e-02 2024-08-06 09:06:28,148 INFO [trainer.py:765] (0/8) Epoch 4, batch 400, train_loss[loss=2.954, ArTop10Accuracy=0.7482, over 11040.00 frames. ], tot_loss[loss=3.034, ArTop10Accuracy=0.7265, over 10259.58 frames. ], batch size: 15, lr: 2.34e-02 2024-08-06 09:08:01,927 INFO [trainer.py:765] (0/8) Epoch 4, batch 500, train_loss[loss=2.944, ArTop10Accuracy=0.742, over 12204.00 frames. ], tot_loss[loss=3.023, ArTop10Accuracy=0.7286, over 10800.93 frames. ], batch size: 22, lr: 2.33e-02 2024-08-06 09:09:28,540 INFO [trainer.py:765] (0/8) Epoch 4, batch 600, train_loss[loss=2.976, ArTop10Accuracy=0.7362, over 12120.00 frames. ], tot_loss[loss=3.02, ArTop10Accuracy=0.7291, over 11356.71 frames. ], batch size: 19, lr: 2.32e-02 2024-08-06 09:10:59,865 INFO [trainer.py:765] (0/8) Epoch 4, batch 700, train_loss[loss=2.918, ArTop10Accuracy=0.7454, over 10062.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7289, over 11509.51 frames. ], batch size: 12, lr: 2.31e-02 2024-08-06 09:12:17,513 INFO [trainer.py:765] (0/8) Epoch 4, batch 800, train_loss[loss=2.966, ArTop10Accuracy=0.7337, over 9609.00 frames. ], tot_loss[loss=3.023, ArTop10Accuracy=0.7287, over 11636.06 frames. ], batch size: 11, lr: 2.30e-02 2024-08-06 09:13:33,212 INFO [trainer.py:765] (0/8) Epoch 4, batch 900, train_loss[loss=3.013, ArTop10Accuracy=0.7299, over 12831.00 frames. ], tot_loss[loss=3.013, ArTop10Accuracy=0.7305, over 11687.30 frames. ], batch size: 27, lr: 2.29e-02 2024-08-06 09:14:47,520 INFO [trainer.py:765] (0/8) Epoch 4, batch 1000, train_loss[loss=3.014, ArTop10Accuracy=0.7309, over 13050.00 frames. ], tot_loss[loss=3.013, ArTop10Accuracy=0.7305, over 11894.91 frames. ], batch size: 27, lr: 2.28e-02 2024-08-06 09:16:02,982 INFO [trainer.py:765] (0/8) Epoch 4, batch 1100, train_loss[loss=3.08, ArTop10Accuracy=0.7138, over 13473.00 frames. ], tot_loss[loss=3.013, ArTop10Accuracy=0.7305, over 11959.77 frames. ], batch size: 34, lr: 2.26e-02 2024-08-06 09:16:53,291 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.440e+02 1.636e+02 1.968e+02 7.702e+02, threshold=3.273e+02, percent-clipped=1.3 2024-08-06 09:17:18,344 INFO [trainer.py:765] (0/8) Epoch 4, batch 1200, train_loss[loss=3.102, ArTop10Accuracy=0.7123, over 12105.00 frames. ], tot_loss[loss=3.012, ArTop10Accuracy=0.7306, over 11874.25 frames. ], batch size: 101, lr: 2.25e-02 2024-08-06 09:18:17,203 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 09:18:17,206 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-4.pt 2024-08-06 09:20:17,173 INFO [trainer.py:765] (0/8) Epoch 5, batch 100, train_loss[loss=2.964, ArTop10Accuracy=0.7386, over 14166.00 frames. ], tot_loss[loss=2.993, ArTop10Accuracy=0.7338, over 4763.83 frames. 
], batch size: 62, lr: 2.10e-02 2024-08-06 09:21:52,291 INFO [trainer.py:765] (0/8) Epoch 5, batch 200, train_loss[loss=2.948, ArTop10Accuracy=0.742, over 13764.00 frames. ], tot_loss[loss=2.985, ArTop10Accuracy=0.7353, over 7747.26 frames. ], batch size: 34, lr: 2.09e-02 2024-08-06 09:23:19,241 INFO [trainer.py:765] (0/8) Epoch 5, batch 300, train_loss[loss=2.968, ArTop10Accuracy=0.7409, over 14202.00 frames. ], tot_loss[loss=2.975, ArTop10Accuracy=0.7374, over 9374.58 frames. ], batch size: 44, lr: 2.08e-02 2024-08-06 09:24:53,537 INFO [trainer.py:765] (0/8) Epoch 5, batch 400, train_loss[loss=2.865, ArTop10Accuracy=0.759, over 10353.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7383, over 10278.00 frames. ], batch size: 14, lr: 2.07e-02 2024-08-06 09:26:19,418 INFO [trainer.py:765] (0/8) Epoch 5, batch 500, train_loss[loss=2.904, ArTop10Accuracy=0.7532, over 12828.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7399, over 10863.61 frames. ], batch size: 23, lr: 2.06e-02 2024-08-06 09:27:49,537 INFO [trainer.py:765] (0/8) Epoch 5, batch 600, train_loss[loss=2.948, ArTop10Accuracy=0.7446, over 11325.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.7389, over 11367.62 frames. ], batch size: 18, lr: 2.05e-02 2024-08-06 09:29:21,670 INFO [trainer.py:765] (0/8) Epoch 5, batch 700, train_loss[loss=2.924, ArTop10Accuracy=0.7441, over 10278.00 frames. ], tot_loss[loss=2.968, ArTop10Accuracy=0.7386, over 11525.90 frames. ], batch size: 12, lr: 2.04e-02 2024-08-06 09:30:44,693 INFO [trainer.py:765] (0/8) Epoch 5, batch 800, train_loss[loss=2.783, ArTop10Accuracy=0.78, over 10116.00 frames. ], tot_loss[loss=2.969, ArTop10Accuracy=0.7385, over 11608.55 frames. ], batch size: 12, lr: 2.03e-02 2024-08-06 09:31:51,239 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 09:32:00,760 INFO [trainer.py:811] (0/8) Epoch 5, validation: loss=2.926, ArTop10Accuracy=0.7466, over 1827537.00 frames. 2024-08-06 09:32:00,761 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29301MB 2024-08-06 09:32:01,710 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.060e+02 1.349e+02 1.525e+02 1.806e+02 1.007e+03, threshold=3.049e+02, percent-clipped=2.3 2024-08-06 09:32:10,554 INFO [trainer.py:765] (0/8) Epoch 5, batch 900, train_loss[loss=2.973, ArTop10Accuracy=0.7398, over 12834.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.74, over 11677.29 frames. ], batch size: 27, lr: 2.02e-02 2024-08-06 09:33:27,322 INFO [trainer.py:765] (0/8) Epoch 5, batch 1000, train_loss[loss=2.965, ArTop10Accuracy=0.7397, over 12891.00 frames. ], tot_loss[loss=2.961, ArTop10Accuracy=0.7404, over 11884.96 frames. ], batch size: 27, lr: 2.01e-02 2024-08-06 09:34:42,300 INFO [trainer.py:765] (0/8) Epoch 5, batch 1100, train_loss[loss=2.896, ArTop10Accuracy=0.7578, over 13686.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7399, over 11934.83 frames. ], batch size: 34, lr: 2.00e-02 2024-08-06 09:35:56,332 INFO [trainer.py:765] (0/8) Epoch 5, batch 1200, train_loss[loss=3.075, ArTop10Accuracy=0.7164, over 12558.00 frames. ], tot_loss[loss=2.96, ArTop10Accuracy=0.7403, over 11853.83 frames. ], batch size: 101, lr: 1.99e-02 2024-08-06 09:36:54,969 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 09:36:54,973 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-5.pt 2024-08-06 09:38:52,662 INFO [trainer.py:765] (0/8) Epoch 6, batch 100, train_loss[loss=2.91, ArTop10Accuracy=0.7488, over 14733.00 frames. 
], tot_loss[loss=2.956, ArTop10Accuracy=0.7406, over 4774.76 frames. ], batch size: 62, lr: 1.85e-02 2024-08-06 09:40:19,834 INFO [trainer.py:765] (0/8) Epoch 6, batch 200, train_loss[loss=2.939, ArTop10Accuracy=0.7433, over 13842.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.7444, over 7770.99 frames. ], batch size: 34, lr: 1.84e-02 2024-08-06 09:41:52,967 INFO [trainer.py:765] (0/8) Epoch 6, batch 300, train_loss[loss=2.942, ArTop10Accuracy=0.7465, over 14082.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7464, over 9388.00 frames. ], batch size: 44, lr: 1.83e-02 2024-08-06 09:43:17,829 INFO [trainer.py:765] (0/8) Epoch 6, batch 400, train_loss[loss=2.955, ArTop10Accuracy=0.7367, over 10458.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7473, over 10284.90 frames. ], batch size: 14, lr: 1.83e-02 2024-08-06 09:44:54,130 INFO [trainer.py:765] (0/8) Epoch 6, batch 500, train_loss[loss=2.873, ArTop10Accuracy=0.7609, over 12111.00 frames. ], tot_loss[loss=2.92, ArTop10Accuracy=0.7479, over 10842.93 frames. ], batch size: 22, lr: 1.82e-02 2024-08-06 09:46:22,873 INFO [trainer.py:765] (0/8) Epoch 6, batch 600, train_loss[loss=2.85, ArTop10Accuracy=0.7588, over 11493.00 frames. ], tot_loss[loss=2.919, ArTop10Accuracy=0.7482, over 11346.39 frames. ], batch size: 18, lr: 1.81e-02 2024-08-06 09:46:37,217 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.012e+02 1.339e+02 1.480e+02 1.701e+02 7.506e+02, threshold=2.959e+02, percent-clipped=1.1 2024-08-06 09:47:57,871 INFO [trainer.py:765] (0/8) Epoch 6, batch 700, train_loss[loss=2.809, ArTop10Accuracy=0.7689, over 9942.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7472, over 11515.67 frames. ], batch size: 12, lr: 1.80e-02 2024-08-06 09:49:15,955 INFO [trainer.py:765] (0/8) Epoch 6, batch 800, train_loss[loss=2.98, ArTop10Accuracy=0.7345, over 10128.00 frames. ], tot_loss[loss=2.93, ArTop10Accuracy=0.7461, over 11608.70 frames. ], batch size: 12, lr: 1.79e-02 2024-08-06 09:50:32,135 INFO [trainer.py:765] (0/8) Epoch 6, batch 900, train_loss[loss=2.967, ArTop10Accuracy=0.7375, over 12921.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7477, over 11657.07 frames. ], batch size: 27, lr: 1.78e-02 2024-08-06 09:51:47,297 INFO [trainer.py:765] (0/8) Epoch 6, batch 1000, train_loss[loss=2.928, ArTop10Accuracy=0.7449, over 12840.00 frames. ], tot_loss[loss=2.926, ArTop10Accuracy=0.747, over 11871.01 frames. ], batch size: 27, lr: 1.77e-02 2024-08-06 09:53:00,921 INFO [trainer.py:765] (0/8) Epoch 6, batch 1100, train_loss[loss=2.869, ArTop10Accuracy=0.7588, over 13548.00 frames. ], tot_loss[loss=2.93, ArTop10Accuracy=0.7462, over 11945.67 frames. ], batch size: 34, lr: 1.77e-02 2024-08-06 09:54:14,336 INFO [trainer.py:765] (0/8) Epoch 6, batch 1200, train_loss[loss=3.067, ArTop10Accuracy=0.7191, over 11925.00 frames. ], tot_loss[loss=2.927, ArTop10Accuracy=0.7467, over 11849.90 frames. ], batch size: 101, lr: 1.76e-02 2024-08-06 09:55:13,161 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 09:55:13,166 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-6.pt 2024-08-06 09:57:06,699 INFO [trainer.py:765] (0/8) Epoch 7, batch 100, train_loss[loss=3.021, ArTop10Accuracy=0.7334, over 14799.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7499, over 4751.91 frames. ], batch size: 62, lr: 1.64e-02 2024-08-06 09:58:39,426 INFO [trainer.py:765] (0/8) Epoch 7, batch 200, train_loss[loss=2.914, ArTop10Accuracy=0.747, over 13647.00 frames. 
], tot_loss[loss=2.895, ArTop10Accuracy=0.7525, over 7761.94 frames. ], batch size: 34, lr: 1.64e-02 2024-08-06 10:00:06,082 INFO [trainer.py:765] (0/8) Epoch 7, batch 300, train_loss[loss=2.962, ArTop10Accuracy=0.7351, over 14187.00 frames. ], tot_loss[loss=2.891, ArTop10Accuracy=0.7533, over 9377.67 frames. ], batch size: 44, lr: 1.63e-02 2024-08-06 10:00:40,508 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 10:00:50,245 INFO [trainer.py:811] (0/8) Epoch 7, validation: loss=2.88, ArTop10Accuracy=0.7554, over 1827537.00 frames. 2024-08-06 10:00:50,246 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29301MB 2024-08-06 10:00:50,977 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.002e+02 1.286e+02 1.429e+02 1.605e+02 1.020e+03, threshold=2.857e+02, percent-clipped=1.5 2024-08-06 10:01:49,117 INFO [trainer.py:765] (0/8) Epoch 7, batch 400, train_loss[loss=2.834, ArTop10Accuracy=0.7614, over 10248.00 frames. ], tot_loss[loss=2.894, ArTop10Accuracy=0.7525, over 10301.33 frames. ], batch size: 14, lr: 1.62e-02 2024-08-06 10:03:21,458 INFO [trainer.py:765] (0/8) Epoch 7, batch 500, train_loss[loss=2.807, ArTop10Accuracy=0.7693, over 12213.00 frames. ], tot_loss[loss=2.891, ArTop10Accuracy=0.7533, over 10851.85 frames. ], batch size: 22, lr: 1.61e-02 2024-08-06 10:04:51,882 INFO [trainer.py:765] (0/8) Epoch 7, batch 600, train_loss[loss=2.822, ArTop10Accuracy=0.7734, over 11343.00 frames. ], tot_loss[loss=2.892, ArTop10Accuracy=0.7531, over 11367.18 frames. ], batch size: 18, lr: 1.61e-02 2024-08-06 10:06:25,111 INFO [trainer.py:765] (0/8) Epoch 7, batch 700, train_loss[loss=2.937, ArTop10Accuracy=0.7484, over 10293.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7521, over 11508.82 frames. ], batch size: 12, lr: 1.60e-02 2024-08-06 10:07:46,950 INFO [trainer.py:765] (0/8) Epoch 7, batch 800, train_loss[loss=2.873, ArTop10Accuracy=0.7592, over 10146.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7523, over 11629.16 frames. ], batch size: 12, lr: 1.59e-02 2024-08-06 10:09:02,824 INFO [trainer.py:765] (0/8) Epoch 7, batch 900, train_loss[loss=2.971, ArTop10Accuracy=0.7367, over 12774.00 frames. ], tot_loss[loss=2.89, ArTop10Accuracy=0.7537, over 11679.75 frames. ], batch size: 27, lr: 1.59e-02 2024-08-06 10:10:19,635 INFO [trainer.py:765] (0/8) Epoch 7, batch 1000, train_loss[loss=2.904, ArTop10Accuracy=0.7487, over 12768.00 frames. ], tot_loss[loss=2.896, ArTop10Accuracy=0.7523, over 11877.49 frames. ], batch size: 27, lr: 1.58e-02 2024-08-06 10:11:35,207 INFO [trainer.py:765] (0/8) Epoch 7, batch 1100, train_loss[loss=2.97, ArTop10Accuracy=0.7417, over 13638.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.7512, over 11956.00 frames. ], batch size: 34, lr: 1.57e-02 2024-08-06 10:12:48,204 INFO [trainer.py:765] (0/8) Epoch 7, batch 1200, train_loss[loss=3.043, ArTop10Accuracy=0.7302, over 12201.00 frames. ], tot_loss[loss=2.901, ArTop10Accuracy=0.7514, over 11879.95 frames. ], batch size: 101, lr: 1.57e-02 2024-08-06 10:13:46,785 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 10:13:46,788 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-7.pt 2024-08-06 10:15:03,600 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.017e+02 1.283e+02 1.410e+02 1.601e+02 1.017e+03, threshold=2.820e+02, percent-clipped=0.9 2024-08-06 10:15:40,820 INFO [trainer.py:765] (0/8) Epoch 8, batch 100, train_loss[loss=2.903, ArTop10Accuracy=0.7486, over 14670.00 frames. 
], tot_loss[loss=2.889, ArTop10Accuracy=0.7534, over 4768.92 frames. ], batch size: 62, lr: 1.47e-02 2024-08-06 10:17:12,862 INFO [trainer.py:765] (0/8) Epoch 8, batch 200, train_loss[loss=2.86, ArTop10Accuracy=0.7606, over 13779.00 frames. ], tot_loss[loss=2.879, ArTop10Accuracy=0.7555, over 7751.17 frames. ], batch size: 34, lr: 1.46e-02 2024-08-06 10:18:37,898 INFO [trainer.py:765] (0/8) Epoch 8, batch 300, train_loss[loss=2.875, ArTop10Accuracy=0.7556, over 14097.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7568, over 9353.66 frames. ], batch size: 44, lr: 1.46e-02 2024-08-06 10:20:06,341 INFO [trainer.py:765] (0/8) Epoch 8, batch 400, train_loss[loss=2.701, ArTop10Accuracy=0.7881, over 10956.00 frames. ], tot_loss[loss=2.869, ArTop10Accuracy=0.7574, over 10265.29 frames. ], batch size: 15, lr: 1.45e-02 2024-08-06 10:21:32,411 INFO [trainer.py:765] (0/8) Epoch 8, batch 500, train_loss[loss=2.816, ArTop10Accuracy=0.7739, over 12225.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7587, over 10847.20 frames. ], batch size: 22, lr: 1.45e-02 2024-08-06 10:23:00,974 INFO [trainer.py:765] (0/8) Epoch 8, batch 600, train_loss[loss=2.888, ArTop10Accuracy=0.7598, over 11340.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7583, over 11377.64 frames. ], batch size: 18, lr: 1.44e-02 2024-08-06 10:24:37,787 INFO [trainer.py:765] (0/8) Epoch 8, batch 700, train_loss[loss=2.898, ArTop10Accuracy=0.7523, over 9450.00 frames. ], tot_loss[loss=2.869, ArTop10Accuracy=0.7574, over 11524.35 frames. ], batch size: 11, lr: 1.43e-02 2024-08-06 10:25:56,088 INFO [trainer.py:765] (0/8) Epoch 8, batch 800, train_loss[loss=2.921, ArTop10Accuracy=0.7523, over 10218.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7566, over 11645.69 frames. ], batch size: 12, lr: 1.43e-02 2024-08-06 10:27:12,246 INFO [trainer.py:765] (0/8) Epoch 8, batch 900, train_loss[loss=2.885, ArTop10Accuracy=0.7531, over 12756.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7585, over 11705.50 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 10:28:25,263 INFO [trainer.py:765] (0/8) Epoch 8, batch 1000, train_loss[loss=2.886, ArTop10Accuracy=0.7564, over 12960.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.7571, over 11890.41 frames. ], batch size: 27, lr: 1.42e-02 2024-08-06 10:29:07,155 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 10:29:16,830 INFO [trainer.py:811] (0/8) Epoch 8, validation: loss=2.858, ArTop10Accuracy=0.7594, over 1827537.00 frames. 2024-08-06 10:29:16,831 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29519MB 2024-08-06 10:29:17,490 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.032e+02 1.275e+02 1.390e+02 1.547e+02 3.717e+02, threshold=2.781e+02, percent-clipped=0.7 2024-08-06 10:29:51,730 INFO [trainer.py:765] (0/8) Epoch 8, batch 1100, train_loss[loss=2.87, ArTop10Accuracy=0.7579, over 13614.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7563, over 11953.41 frames. ], batch size: 34, lr: 1.41e-02 2024-08-06 10:31:05,948 INFO [trainer.py:765] (0/8) Epoch 8, batch 1200, train_loss[loss=2.996, ArTop10Accuracy=0.7309, over 12624.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7558, over 11901.69 frames. ], batch size: 103, lr: 1.40e-02 2024-08-06 10:32:05,402 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 
2024-08-06 10:32:05,407 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-8.pt 2024-08-06 10:34:01,256 INFO [trainer.py:765] (0/8) Epoch 9, batch 100, train_loss[loss=2.882, ArTop10Accuracy=0.7575, over 14805.00 frames. ], tot_loss[loss=2.858, ArTop10Accuracy=0.7592, over 4758.64 frames. ], batch size: 63, lr: 1.32e-02 2024-08-06 10:35:31,772 INFO [trainer.py:765] (0/8) Epoch 9, batch 200, train_loss[loss=2.787, ArTop10Accuracy=0.7743, over 13638.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7607, over 7747.34 frames. ], batch size: 34, lr: 1.32e-02 2024-08-06 10:36:57,926 INFO [trainer.py:765] (0/8) Epoch 9, batch 300, train_loss[loss=2.883, ArTop10Accuracy=0.7545, over 14226.00 frames. ], tot_loss[loss=2.849, ArTop10Accuracy=0.7606, over 9353.90 frames. ], batch size: 45, lr: 1.31e-02 2024-08-06 10:38:32,698 INFO [trainer.py:765] (0/8) Epoch 9, batch 400, train_loss[loss=2.729, ArTop10Accuracy=0.7845, over 10419.00 frames. ], tot_loss[loss=2.844, ArTop10Accuracy=0.762, over 10282.65 frames. ], batch size: 14, lr: 1.31e-02 2024-08-06 10:39:59,256 INFO [trainer.py:765] (0/8) Epoch 9, batch 500, train_loss[loss=2.871, ArTop10Accuracy=0.7572, over 11979.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7632, over 10858.06 frames. ], batch size: 22, lr: 1.30e-02 2024-08-06 10:41:29,690 INFO [trainer.py:765] (0/8) Epoch 9, batch 600, train_loss[loss=2.819, ArTop10Accuracy=0.7655, over 11466.00 frames. ], tot_loss[loss=2.842, ArTop10Accuracy=0.7628, over 11377.95 frames. ], batch size: 18, lr: 1.30e-02 2024-08-06 10:42:58,440 INFO [trainer.py:765] (0/8) Epoch 9, batch 700, train_loss[loss=2.632, ArTop10Accuracy=0.7985, over 10293.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7629, over 11523.81 frames. ], batch size: 12, lr: 1.29e-02 2024-08-06 10:44:02,952 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.039e+02 1.253e+02 1.352e+02 1.493e+02 7.010e+02, threshold=2.704e+02, percent-clipped=0.6 2024-08-06 10:44:19,669 INFO [trainer.py:765] (0/8) Epoch 9, batch 800, train_loss[loss=2.764, ArTop10Accuracy=0.7822, over 9246.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.762, over 11613.97 frames. ], batch size: 11, lr: 1.29e-02 2024-08-06 10:45:35,718 INFO [trainer.py:765] (0/8) Epoch 9, batch 900, train_loss[loss=2.91, ArTop10Accuracy=0.7467, over 13092.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7629, over 11694.74 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 10:46:51,271 INFO [trainer.py:765] (0/8) Epoch 9, batch 1000, train_loss[loss=2.834, ArTop10Accuracy=0.7649, over 12852.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.762, over 11888.03 frames. ], batch size: 27, lr: 1.28e-02 2024-08-06 10:48:06,247 INFO [trainer.py:765] (0/8) Epoch 9, batch 1100, train_loss[loss=2.846, ArTop10Accuracy=0.7591, over 13386.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7603, over 11952.87 frames. ], batch size: 34, lr: 1.28e-02 2024-08-06 10:49:21,054 INFO [trainer.py:765] (0/8) Epoch 9, batch 1200, train_loss[loss=2.929, ArTop10Accuracy=0.7448, over 12195.00 frames. ], tot_loss[loss=2.855, ArTop10Accuracy=0.7599, over 11873.16 frames. ], batch size: 101, lr: 1.27e-02 2024-08-06 10:50:22,708 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 10:50:22,712 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-9.pt 2024-08-06 10:52:12,325 INFO [trainer.py:765] (0/8) Epoch 10, batch 100, train_loss[loss=2.889, ArTop10Accuracy=0.7574, over 14289.00 frames. 
], tot_loss[loss=2.85, ArTop10Accuracy=0.7603, over 4750.13 frames. ], batch size: 62, lr: 1.20e-02 2024-08-06 10:53:44,584 INFO [trainer.py:765] (0/8) Epoch 10, batch 200, train_loss[loss=2.83, ArTop10Accuracy=0.7625, over 13602.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7638, over 7744.10 frames. ], batch size: 34, lr: 1.20e-02 2024-08-06 10:55:08,089 INFO [trainer.py:765] (0/8) Epoch 10, batch 300, train_loss[loss=2.874, ArTop10Accuracy=0.7542, over 13893.00 frames. ], tot_loss[loss=2.828, ArTop10Accuracy=0.7651, over 9372.24 frames. ], batch size: 44, lr: 1.19e-02 2024-08-06 10:56:41,177 INFO [trainer.py:765] (0/8) Epoch 10, batch 400, train_loss[loss=2.776, ArTop10Accuracy=0.7748, over 10425.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7658, over 10286.81 frames. ], batch size: 14, lr: 1.19e-02 2024-08-06 10:58:04,937 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 10:58:14,555 INFO [trainer.py:811] (0/8) Epoch 10, validation: loss=2.842, ArTop10Accuracy=0.7624, over 1827537.00 frames. 2024-08-06 10:58:14,556 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29519MB 2024-08-06 10:58:15,574 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.228e+02 1.320e+02 1.458e+02 6.096e+02, threshold=2.641e+02, percent-clipped=0.6 2024-08-06 10:58:15,579 INFO [trainer.py:765] (0/8) Epoch 10, batch 500, train_loss[loss=2.791, ArTop10Accuracy=0.7754, over 12168.00 frames. ], tot_loss[loss=2.824, ArTop10Accuracy=0.766, over 10844.50 frames. ], batch size: 22, lr: 1.19e-02 2024-08-06 10:59:42,817 INFO [trainer.py:765] (0/8) Epoch 10, batch 600, train_loss[loss=2.772, ArTop10Accuracy=0.7802, over 11367.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.7664, over 11362.90 frames. ], batch size: 18, lr: 1.18e-02 2024-08-06 11:01:18,109 INFO [trainer.py:765] (0/8) Epoch 10, batch 700, train_loss[loss=2.798, ArTop10Accuracy=0.769, over 10206.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7656, over 11497.81 frames. ], batch size: 12, lr: 1.18e-02 2024-08-06 11:02:36,918 INFO [trainer.py:765] (0/8) Epoch 10, batch 800, train_loss[loss=2.694, ArTop10Accuracy=0.7909, over 9990.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.764, over 11622.09 frames. ], batch size: 12, lr: 1.17e-02 2024-08-06 11:03:51,212 INFO [trainer.py:765] (0/8) Epoch 10, batch 900, train_loss[loss=2.837, ArTop10Accuracy=0.7643, over 13029.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.765, over 11669.36 frames. ], batch size: 27, lr: 1.17e-02 2024-08-06 11:05:06,352 INFO [trainer.py:765] (0/8) Epoch 10, batch 1000, train_loss[loss=2.861, ArTop10Accuracy=0.7555, over 13305.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.7652, over 11865.57 frames. ], batch size: 28, lr: 1.17e-02 2024-08-06 11:06:21,724 INFO [trainer.py:765] (0/8) Epoch 10, batch 1100, train_loss[loss=2.84, ArTop10Accuracy=0.7659, over 13539.00 frames. ], tot_loss[loss=2.834, ArTop10Accuracy=0.7642, over 11938.55 frames. ], batch size: 34, lr: 1.16e-02 2024-08-06 11:07:34,772 INFO [trainer.py:765] (0/8) Epoch 10, batch 1200, train_loss[loss=2.951, ArTop10Accuracy=0.7414, over 12210.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7635, over 11860.87 frames. ], batch size: 101, lr: 1.16e-02 2024-08-06 11:08:33,817 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 
2024-08-06 11:08:33,820 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-10.pt 2024-08-06 11:10:29,953 INFO [trainer.py:765] (0/8) Epoch 11, batch 100, train_loss[loss=2.853, ArTop10Accuracy=0.7606, over 14718.00 frames. ], tot_loss[loss=2.822, ArTop10Accuracy=0.766, over 4759.47 frames. ], batch size: 62, lr: 1.10e-02 2024-08-06 11:12:04,673 INFO [trainer.py:765] (0/8) Epoch 11, batch 200, train_loss[loss=2.845, ArTop10Accuracy=0.7641, over 14085.00 frames. ], tot_loss[loss=2.814, ArTop10Accuracy=0.7672, over 7737.29 frames. ], batch size: 35, lr: 1.10e-02 2024-08-06 11:12:22,823 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 9.884e+01 1.240e+02 1.333e+02 1.457e+02 6.939e+02, threshold=2.667e+02, percent-clipped=0.1 2024-08-06 11:13:31,549 INFO [trainer.py:765] (0/8) Epoch 11, batch 300, train_loss[loss=2.914, ArTop10Accuracy=0.749, over 14385.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7683, over 9360.87 frames. ], batch size: 44, lr: 1.09e-02 2024-08-06 11:15:03,268 INFO [trainer.py:765] (0/8) Epoch 11, batch 400, train_loss[loss=2.742, ArTop10Accuracy=0.7831, over 10377.00 frames. ], tot_loss[loss=2.807, ArTop10Accuracy=0.769, over 10296.01 frames. ], batch size: 14, lr: 1.09e-02 2024-08-06 11:16:29,637 INFO [trainer.py:765] (0/8) Epoch 11, batch 500, train_loss[loss=2.801, ArTop10Accuracy=0.7716, over 12243.00 frames. ], tot_loss[loss=2.802, ArTop10Accuracy=0.7699, over 10869.51 frames. ], batch size: 22, lr: 1.09e-02 2024-08-06 11:18:00,516 INFO [trainer.py:765] (0/8) Epoch 11, batch 600, train_loss[loss=2.718, ArTop10Accuracy=0.7889, over 11457.00 frames. ], tot_loss[loss=2.804, ArTop10Accuracy=0.7698, over 11377.43 frames. ], batch size: 18, lr: 1.08e-02 2024-08-06 11:19:34,512 INFO [trainer.py:765] (0/8) Epoch 11, batch 700, train_loss[loss=2.702, ArTop10Accuracy=0.7909, over 10167.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7684, over 11520.97 frames. ], batch size: 12, lr: 1.08e-02 2024-08-06 11:20:55,482 INFO [trainer.py:765] (0/8) Epoch 11, batch 800, train_loss[loss=2.676, ArTop10Accuracy=0.795, over 10086.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.7681, over 11629.35 frames. ], batch size: 12, lr: 1.07e-02 2024-08-06 11:22:13,704 INFO [trainer.py:765] (0/8) Epoch 11, batch 900, train_loss[loss=2.802, ArTop10Accuracy=0.771, over 12939.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7685, over 11682.50 frames. ], batch size: 27, lr: 1.07e-02 2024-08-06 11:23:31,797 INFO [trainer.py:765] (0/8) Epoch 11, batch 1000, train_loss[loss=2.76, ArTop10Accuracy=0.7754, over 12987.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.7675, over 11877.90 frames. ], batch size: 27, lr: 1.07e-02 2024-08-06 11:24:46,901 INFO [trainer.py:765] (0/8) Epoch 11, batch 1100, train_loss[loss=2.779, ArTop10Accuracy=0.7775, over 13578.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7666, over 11962.65 frames. ], batch size: 34, lr: 1.06e-02 2024-08-06 11:26:00,733 INFO [trainer.py:765] (0/8) Epoch 11, batch 1200, train_loss[loss=2.934, ArTop10Accuracy=0.7458, over 12288.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7664, over 11874.75 frames. ], batch size: 103, lr: 1.06e-02 2024-08-06 11:26:15,845 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 11:26:25,556 INFO [trainer.py:811] (0/8) Epoch 11, validation: loss=2.831, ArTop10Accuracy=0.7643, over 1827537.00 frames. 
2024-08-06 11:26:25,557 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29519MB 2024-08-06 11:26:26,185 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.251e+02 1.335e+02 1.441e+02 2.942e+02, threshold=2.669e+02, percent-clipped=0.1 2024-08-06 11:27:09,747 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 11:27:09,754 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-11.pt 2024-08-06 11:29:03,450 INFO [trainer.py:765] (0/8) Epoch 12, batch 100, train_loss[loss=2.881, ArTop10Accuracy=0.7544, over 14667.00 frames. ], tot_loss[loss=2.797, ArTop10Accuracy=0.7704, over 4747.48 frames. ], batch size: 62, lr: 1.01e-02 2024-08-06 11:30:30,673 INFO [trainer.py:765] (0/8) Epoch 12, batch 200, train_loss[loss=2.785, ArTop10Accuracy=0.7731, over 13518.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7705, over 7756.10 frames. ], batch size: 34, lr: 1.01e-02 2024-08-06 11:31:57,654 INFO [trainer.py:765] (0/8) Epoch 12, batch 300, train_loss[loss=2.832, ArTop10Accuracy=0.7641, over 14247.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7719, over 9392.76 frames. ], batch size: 44, lr: 1.01e-02 2024-08-06 11:33:30,737 INFO [trainer.py:765] (0/8) Epoch 12, batch 400, train_loss[loss=2.739, ArTop10Accuracy=0.7777, over 10179.00 frames. ], tot_loss[loss=2.794, ArTop10Accuracy=0.7714, over 10292.52 frames. ], batch size: 14, lr: 1.00e-02 2024-08-06 11:34:55,734 INFO [trainer.py:765] (0/8) Epoch 12, batch 500, train_loss[loss=2.801, ArTop10Accuracy=0.7709, over 12150.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.7728, over 10850.93 frames. ], batch size: 22, lr: 1.00e-02 2024-08-06 11:36:29,361 INFO [trainer.py:765] (0/8) Epoch 12, batch 600, train_loss[loss=2.741, ArTop10Accuracy=0.7815, over 11487.00 frames. ], tot_loss[loss=2.79, ArTop10Accuracy=0.7724, over 11362.82 frames. ], batch size: 18, lr: 9.97e-03 2024-08-06 11:38:00,343 INFO [trainer.py:765] (0/8) Epoch 12, batch 700, train_loss[loss=2.741, ArTop10Accuracy=0.7811, over 10062.00 frames. ], tot_loss[loss=2.796, ArTop10Accuracy=0.7713, over 11517.08 frames. ], batch size: 12, lr: 9.93e-03 2024-08-06 11:39:23,610 INFO [trainer.py:765] (0/8) Epoch 12, batch 800, train_loss[loss=2.759, ArTop10Accuracy=0.7732, over 10080.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7705, over 11636.73 frames. ], batch size: 12, lr: 9.90e-03 2024-08-06 11:40:39,888 INFO [trainer.py:765] (0/8) Epoch 12, batch 900, train_loss[loss=2.823, ArTop10Accuracy=0.7693, over 12876.00 frames. ], tot_loss[loss=2.79, ArTop10Accuracy=0.7724, over 11678.76 frames. ], batch size: 27, lr: 9.87e-03 2024-08-06 11:41:13,995 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.041e+02 1.248e+02 1.348e+02 1.459e+02 5.540e+02, threshold=2.695e+02, percent-clipped=0.3 2024-08-06 11:41:56,188 INFO [trainer.py:765] (0/8) Epoch 12, batch 1000, train_loss[loss=2.811, ArTop10Accuracy=0.7647, over 12993.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.7713, over 11886.62 frames. ], batch size: 27, lr: 9.85e-03 2024-08-06 11:43:14,319 INFO [trainer.py:765] (0/8) Epoch 12, batch 1100, train_loss[loss=2.781, ArTop10Accuracy=0.7739, over 13596.00 frames. ], tot_loss[loss=2.799, ArTop10Accuracy=0.7707, over 11956.72 frames. ], batch size: 34, lr: 9.82e-03 2024-08-06 11:44:26,155 INFO [trainer.py:765] (0/8) Epoch 12, batch 1200, train_loss[loss=2.9, ArTop10Accuracy=0.7534, over 12429.00 frames. ], tot_loss[loss=2.803, ArTop10Accuracy=0.7699, over 11861.85 frames. 
], batch size: 103, lr: 9.79e-03 2024-08-06 11:45:26,924 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 11:45:26,927 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-12.pt 2024-08-06 11:47:26,603 INFO [trainer.py:765] (0/8) Epoch 13, batch 100, train_loss[loss=2.873, ArTop10Accuracy=0.7586, over 14676.00 frames. ], tot_loss[loss=2.786, ArTop10Accuracy=0.7726, over 4752.95 frames. ], batch size: 63, lr: 9.37e-03 2024-08-06 11:48:54,779 INFO [trainer.py:765] (0/8) Epoch 13, batch 200, train_loss[loss=2.687, ArTop10Accuracy=0.7937, over 13491.00 frames. ], tot_loss[loss=2.78, ArTop10Accuracy=0.7742, over 7726.95 frames. ], batch size: 34, lr: 9.34e-03 2024-08-06 11:50:20,515 INFO [trainer.py:765] (0/8) Epoch 13, batch 300, train_loss[loss=2.866, ArTop10Accuracy=0.7566, over 14184.00 frames. ], tot_loss[loss=2.777, ArTop10Accuracy=0.7748, over 9347.37 frames. ], batch size: 44, lr: 9.31e-03 2024-08-06 11:51:48,764 INFO [trainer.py:765] (0/8) Epoch 13, batch 400, train_loss[loss=2.725, ArTop10Accuracy=0.7881, over 10398.00 frames. ], tot_loss[loss=2.771, ArTop10Accuracy=0.7757, over 10274.62 frames. ], batch size: 14, lr: 9.28e-03 2024-08-06 11:53:13,408 INFO [trainer.py:765] (0/8) Epoch 13, batch 500, train_loss[loss=2.662, ArTop10Accuracy=0.7992, over 12588.00 frames. ], tot_loss[loss=2.77, ArTop10Accuracy=0.7759, over 10848.23 frames. ], batch size: 23, lr: 9.26e-03 2024-08-06 11:54:52,223 INFO [trainer.py:765] (0/8) Epoch 13, batch 600, train_loss[loss=2.762, ArTop10Accuracy=0.7758, over 11412.00 frames. ], tot_loss[loss=2.773, ArTop10Accuracy=0.7756, over 11379.70 frames. ], batch size: 18, lr: 9.23e-03 2024-08-06 11:55:47,082 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 11:55:56,835 INFO [trainer.py:811] (0/8) Epoch 13, validation: loss=2.824, ArTop10Accuracy=0.7662, over 1827537.00 frames. 2024-08-06 11:55:56,835 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29519MB 2024-08-06 11:55:57,712 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.064e+02 1.255e+02 1.343e+02 1.452e+02 4.888e+02, threshold=2.687e+02, percent-clipped=0.1 2024-08-06 11:56:28,465 INFO [trainer.py:765] (0/8) Epoch 13, batch 700, train_loss[loss=2.707, ArTop10Accuracy=0.7891, over 10317.00 frames. ], tot_loss[loss=2.776, ArTop10Accuracy=0.7749, over 11514.39 frames. ], batch size: 12, lr: 9.20e-03 2024-08-06 11:57:46,684 INFO [trainer.py:765] (0/8) Epoch 13, batch 800, train_loss[loss=2.67, ArTop10Accuracy=0.7936, over 9534.00 frames. ], tot_loss[loss=2.78, ArTop10Accuracy=0.7743, over 11614.14 frames. ], batch size: 11, lr: 9.18e-03 2024-08-06 11:59:03,289 INFO [trainer.py:765] (0/8) Epoch 13, batch 900, train_loss[loss=2.81, ArTop10Accuracy=0.7719, over 13218.00 frames. ], tot_loss[loss=2.777, ArTop10Accuracy=0.775, over 11658.17 frames. ], batch size: 28, lr: 9.15e-03 2024-08-06 12:00:19,174 INFO [trainer.py:765] (0/8) Epoch 13, batch 1000, train_loss[loss=2.737, ArTop10Accuracy=0.7798, over 13002.00 frames. ], tot_loss[loss=2.781, ArTop10Accuracy=0.7743, over 11880.17 frames. ], batch size: 27, lr: 9.13e-03 2024-08-06 12:01:34,883 INFO [trainer.py:765] (0/8) Epoch 13, batch 1100, train_loss[loss=2.79, ArTop10Accuracy=0.7723, over 13695.00 frames. ], tot_loss[loss=2.789, ArTop10Accuracy=0.7728, over 11953.89 frames. ], batch size: 34, lr: 9.10e-03 2024-08-06 12:02:48,662 INFO [trainer.py:765] (0/8) Epoch 13, batch 1200, train_loss[loss=2.97, ArTop10Accuracy=0.7374, over 12612.00 frames. 
], tot_loss[loss=2.79, ArTop10Accuracy=0.7723, over 11864.76 frames. ], batch size: 101, lr: 9.08e-03 2024-08-06 12:03:47,909 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 12:03:47,912 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-13.pt 2024-08-06 12:05:45,336 INFO [trainer.py:765] (0/8) Epoch 14, batch 100, train_loss[loss=2.841, ArTop10Accuracy=0.7652, over 14853.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.7759, over 4768.63 frames. ], batch size: 64, lr: 8.71e-03 2024-08-06 12:07:16,605 INFO [trainer.py:765] (0/8) Epoch 14, batch 200, train_loss[loss=2.748, ArTop10Accuracy=0.7779, over 13905.00 frames. ], tot_loss[loss=2.765, ArTop10Accuracy=0.7762, over 7750.37 frames. ], batch size: 35, lr: 8.69e-03 2024-08-06 12:08:44,312 INFO [trainer.py:765] (0/8) Epoch 14, batch 300, train_loss[loss=2.778, ArTop10Accuracy=0.772, over 14355.00 frames. ], tot_loss[loss=2.764, ArTop10Accuracy=0.7769, over 9379.42 frames. ], batch size: 44, lr: 8.66e-03 2024-08-06 12:10:01,132 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.072e+02 1.266e+02 1.374e+02 1.483e+02 6.480e+02, threshold=2.748e+02, percent-clipped=0.2 2024-08-06 12:10:10,226 INFO [trainer.py:765] (0/8) Epoch 14, batch 400, train_loss[loss=2.741, ArTop10Accuracy=0.7836, over 10218.00 frames. ], tot_loss[loss=2.766, ArTop10Accuracy=0.7765, over 10280.49 frames. ], batch size: 14, lr: 8.64e-03 2024-08-06 12:11:36,150 INFO [trainer.py:765] (0/8) Epoch 14, batch 500, train_loss[loss=2.676, ArTop10Accuracy=0.7933, over 12192.00 frames. ], tot_loss[loss=2.759, ArTop10Accuracy=0.7781, over 10852.25 frames. ], batch size: 22, lr: 8.62e-03 2024-08-06 12:13:05,993 INFO [trainer.py:765] (0/8) Epoch 14, batch 600, train_loss[loss=2.752, ArTop10Accuracy=0.7771, over 11457.00 frames. ], tot_loss[loss=2.763, ArTop10Accuracy=0.7771, over 11369.19 frames. ], batch size: 18, lr: 8.59e-03 2024-08-06 12:14:38,553 INFO [trainer.py:765] (0/8) Epoch 14, batch 700, train_loss[loss=2.705, ArTop10Accuracy=0.7961, over 10176.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.7761, over 11536.24 frames. ], batch size: 12, lr: 8.57e-03 2024-08-06 12:15:58,070 INFO [trainer.py:765] (0/8) Epoch 14, batch 800, train_loss[loss=2.689, ArTop10Accuracy=0.7909, over 10209.00 frames. ], tot_loss[loss=2.772, ArTop10Accuracy=0.7753, over 11651.71 frames. ], batch size: 12, lr: 8.55e-03 2024-08-06 12:17:12,865 INFO [trainer.py:765] (0/8) Epoch 14, batch 900, train_loss[loss=2.732, ArTop10Accuracy=0.7826, over 12996.00 frames. ], tot_loss[loss=2.768, ArTop10Accuracy=0.7763, over 11690.20 frames. ], batch size: 27, lr: 8.52e-03 2024-08-06 12:18:29,613 INFO [trainer.py:765] (0/8) Epoch 14, batch 1000, train_loss[loss=2.803, ArTop10Accuracy=0.7661, over 12777.00 frames. ], tot_loss[loss=2.775, ArTop10Accuracy=0.7751, over 11882.08 frames. ], batch size: 27, lr: 8.50e-03 2024-08-06 12:19:45,376 INFO [trainer.py:765] (0/8) Epoch 14, batch 1100, train_loss[loss=2.777, ArTop10Accuracy=0.7773, over 13431.00 frames. ], tot_loss[loss=2.781, ArTop10Accuracy=0.7741, over 11942.94 frames. ], batch size: 34, lr: 8.48e-03 2024-08-06 12:20:59,278 INFO [trainer.py:765] (0/8) Epoch 14, batch 1200, train_loss[loss=2.906, ArTop10Accuracy=0.75, over 12942.00 frames. ], tot_loss[loss=2.779, ArTop10Accuracy=0.7745, over 11861.52 frames. ], batch size: 102, lr: 8.46e-03 2024-08-06 12:21:58,346 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 
2024-08-06 12:21:58,348 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-14.pt 2024-08-06 12:23:51,960 INFO [trainer.py:765] (0/8) Epoch 15, batch 100, train_loss[loss=2.836, ArTop10Accuracy=0.7586, over 14583.00 frames. ], tot_loss[loss=2.768, ArTop10Accuracy=0.7759, over 4768.34 frames. ], batch size: 62, lr: 8.14e-03 2024-08-06 12:24:00,597 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 12:24:10,290 INFO [trainer.py:811] (0/8) Epoch 15, validation: loss=2.819, ArTop10Accuracy=0.7675, over 1827537.00 frames. 2024-08-06 12:24:10,291 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29519MB 2024-08-06 12:24:11,094 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.080e+02 1.284e+02 1.371e+02 1.488e+02 4.667e+02, threshold=2.743e+02, percent-clipped=0.2 2024-08-06 12:25:29,990 INFO [trainer.py:765] (0/8) Epoch 15, batch 200, train_loss[loss=2.826, ArTop10Accuracy=0.7627, over 13767.00 frames. ], tot_loss[loss=2.757, ArTop10Accuracy=0.7783, over 7773.81 frames. ], batch size: 34, lr: 8.12e-03 2024-08-06 12:26:58,695 INFO [trainer.py:765] (0/8) Epoch 15, batch 300, train_loss[loss=2.819, ArTop10Accuracy=0.7639, over 13935.00 frames. ], tot_loss[loss=2.753, ArTop10Accuracy=0.7791, over 9378.32 frames. ], batch size: 44, lr: 8.09e-03 2024-08-06 12:28:28,536 INFO [trainer.py:765] (0/8) Epoch 15, batch 400, train_loss[loss=2.684, ArTop10Accuracy=0.7938, over 10281.00 frames. ], tot_loss[loss=2.751, ArTop10Accuracy=0.7796, over 10275.42 frames. ], batch size: 14, lr: 8.07e-03 2024-08-06 12:29:54,033 INFO [trainer.py:765] (0/8) Epoch 15, batch 500, train_loss[loss=2.691, ArTop10Accuracy=0.7942, over 12264.00 frames. ], tot_loss[loss=2.746, ArTop10Accuracy=0.7804, over 10831.07 frames. ], batch size: 22, lr: 8.05e-03 2024-08-06 12:31:23,293 INFO [trainer.py:765] (0/8) Epoch 15, batch 600, train_loss[loss=2.735, ArTop10Accuracy=0.7826, over 11409.00 frames. ], tot_loss[loss=2.753, ArTop10Accuracy=0.7791, over 11367.63 frames. ], batch size: 18, lr: 8.03e-03 2024-08-06 12:32:53,176 INFO [trainer.py:765] (0/8) Epoch 15, batch 700, train_loss[loss=2.686, ArTop10Accuracy=0.7965, over 9513.00 frames. ], tot_loss[loss=2.757, ArTop10Accuracy=0.7784, over 11512.36 frames. ], batch size: 11, lr: 8.01e-03 2024-08-06 12:34:18,254 INFO [trainer.py:765] (0/8) Epoch 15, batch 800, train_loss[loss=2.696, ArTop10Accuracy=0.7923, over 9231.00 frames. ], tot_loss[loss=2.759, ArTop10Accuracy=0.7779, over 11623.54 frames. ], batch size: 11, lr: 7.99e-03 2024-08-06 12:35:34,727 INFO [trainer.py:765] (0/8) Epoch 15, batch 900, train_loss[loss=2.797, ArTop10Accuracy=0.7713, over 12987.00 frames. ], tot_loss[loss=2.756, ArTop10Accuracy=0.7785, over 11662.82 frames. ], batch size: 27, lr: 7.97e-03 2024-08-06 12:36:50,540 INFO [trainer.py:765] (0/8) Epoch 15, batch 1000, train_loss[loss=2.785, ArTop10Accuracy=0.7808, over 12819.00 frames. ], tot_loss[loss=2.76, ArTop10Accuracy=0.7779, over 11884.84 frames. ], batch size: 27, lr: 7.95e-03 2024-08-06 12:38:05,181 INFO [trainer.py:765] (0/8) Epoch 15, batch 1100, train_loss[loss=2.821, ArTop10Accuracy=0.7663, over 13737.00 frames. ], tot_loss[loss=2.766, ArTop10Accuracy=0.7766, over 11951.41 frames. 
], batch size: 34, lr: 7.93e-03 2024-08-06 12:38:12,841 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.080e+02 1.293e+02 1.379e+02 1.467e+02 2.824e+02, threshold=2.759e+02, percent-clipped=0.1 2024-08-06 12:39:18,789 INFO [trainer.py:765] (0/8) Epoch 15, batch 1200, train_loss[loss=2.888, ArTop10Accuracy=0.7535, over 12417.00 frames. ], tot_loss[loss=2.769, ArTop10Accuracy=0.776, over 11865.99 frames. ], batch size: 101, lr: 7.91e-03 2024-08-06 12:40:18,830 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 12:40:18,833 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-15.pt 2024-08-06 12:42:17,620 INFO [trainer.py:765] (0/8) Epoch 16, batch 100, train_loss[loss=2.796, ArTop10Accuracy=0.7693, over 14427.00 frames. ], tot_loss[loss=2.744, ArTop10Accuracy=0.7806, over 4772.64 frames. ], batch size: 62, lr: 7.63e-03 2024-08-06 12:43:49,565 INFO [trainer.py:765] (0/8) Epoch 16, batch 200, train_loss[loss=2.744, ArTop10Accuracy=0.7763, over 13896.00 frames. ], tot_loss[loss=2.744, ArTop10Accuracy=0.7806, over 7760.92 frames. ], batch size: 35, lr: 7.61e-03 2024-08-06 12:45:18,501 INFO [trainer.py:765] (0/8) Epoch 16, batch 300, train_loss[loss=2.8, ArTop10Accuracy=0.7674, over 14070.00 frames. ], tot_loss[loss=2.742, ArTop10Accuracy=0.7808, over 9372.95 frames. ], batch size: 44, lr: 7.59e-03 2024-08-06 12:46:45,208 INFO [trainer.py:765] (0/8) Epoch 16, batch 400, train_loss[loss=2.716, ArTop10Accuracy=0.7876, over 10245.00 frames. ], tot_loss[loss=2.743, ArTop10Accuracy=0.7806, over 10296.61 frames. ], batch size: 14, lr: 7.58e-03 2024-08-06 12:48:16,312 INFO [trainer.py:765] (0/8) Epoch 16, batch 500, train_loss[loss=2.686, ArTop10Accuracy=0.7917, over 12087.00 frames. ], tot_loss[loss=2.738, ArTop10Accuracy=0.7818, over 10853.59 frames. ], batch size: 22, lr: 7.56e-03 2024-08-06 12:49:46,641 INFO [trainer.py:765] (0/8) Epoch 16, batch 600, train_loss[loss=2.69, ArTop10Accuracy=0.792, over 11271.00 frames. ], tot_loss[loss=2.741, ArTop10Accuracy=0.7811, over 11366.38 frames. ], batch size: 18, lr: 7.54e-03 2024-08-06 12:51:23,680 INFO [trainer.py:765] (0/8) Epoch 16, batch 700, train_loss[loss=2.612, ArTop10Accuracy=0.8047, over 9414.00 frames. ], tot_loss[loss=2.743, ArTop10Accuracy=0.7808, over 11501.99 frames. ], batch size: 11, lr: 7.52e-03 2024-08-06 12:52:43,500 INFO [trainer.py:765] (0/8) Epoch 16, batch 800, train_loss[loss=2.757, ArTop10Accuracy=0.7792, over 9558.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7797, over 11628.70 frames. ], batch size: 11, lr: 7.51e-03 2024-08-06 12:53:06,014 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-20000.pt 2024-08-06 12:53:08,969 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 12:53:15,494 INFO [trainer.py:811] (0/8) Epoch 16, validation: loss=2.816, ArTop10Accuracy=0.7678, over 1827537.00 frames. 2024-08-06 12:53:15,495 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29524MB 2024-08-06 12:53:16,187 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.112e+02 1.291e+02 1.391e+02 1.487e+02 3.459e+02, threshold=2.783e+02, percent-clipped=0.1 2024-08-06 12:54:06,480 INFO [trainer.py:765] (0/8) Epoch 16, batch 900, train_loss[loss=2.719, ArTop10Accuracy=0.7879, over 12855.00 frames. ], tot_loss[loss=2.748, ArTop10Accuracy=0.7799, over 11666.65 frames. 
2024-08-06 12:55:19,791 INFO [trainer.py:765] (0/8) Epoch 16, batch 1000, train_loss[loss=2.721, ArTop10Accuracy=0.7857, over 12909.00 frames. ], tot_loss[loss=2.75, ArTop10Accuracy=0.78, over 11872.94 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 12:56:33,165 INFO [trainer.py:765] (0/8) Epoch 16, batch 1100, train_loss[loss=2.78, ArTop10Accuracy=0.7767, over 13479.00 frames. ], tot_loss[loss=2.758, ArTop10Accuracy=0.7784, over 11947.93 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 12:57:48,485 INFO [trainer.py:765] (0/8) Epoch 16, batch 1200, train_loss[loss=2.835, ArTop10Accuracy=0.7596, over 12075.00 frames. ], tot_loss[loss=2.756, ArTop10Accuracy=0.7786, over 11864.13 frames. ], batch size: 103, lr: 7.44e-03
2024-08-06 12:58:48,289 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 12:58:48,292 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-16.pt
2024-08-06 13:00:47,900 INFO [trainer.py:765] (0/8) Epoch 17, batch 100, train_loss[loss=2.789, ArTop10Accuracy=0.7719, over 14058.00 frames. ], tot_loss[loss=2.741, ArTop10Accuracy=0.7811, over 4756.29 frames. ], batch size: 62, lr: 7.18e-03
2024-08-06 13:02:19,302 INFO [trainer.py:765] (0/8) Epoch 17, batch 200, train_loss[loss=2.781, ArTop10Accuracy=0.7758, over 14010.00 frames. ], tot_loss[loss=2.737, ArTop10Accuracy=0.7818, over 7771.17 frames. ], batch size: 35, lr: 7.17e-03
2024-08-06 13:03:45,518 INFO [trainer.py:765] (0/8) Epoch 17, batch 300, train_loss[loss=2.776, ArTop10Accuracy=0.7769, over 14058.00 frames. ], tot_loss[loss=2.73, ArTop10Accuracy=0.7832, over 9383.89 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 13:05:21,760 INFO [trainer.py:765] (0/8) Epoch 17, batch 400, train_loss[loss=2.661, ArTop10Accuracy=0.8014, over 10203.00 frames. ], tot_loss[loss=2.725, ArTop10Accuracy=0.7843, over 10293.21 frames. ], batch size: 14, lr: 7.14e-03
2024-08-06 13:06:47,021 INFO [trainer.py:765] (0/8) Epoch 17, batch 500, train_loss[loss=2.668, ArTop10Accuracy=0.7993, over 11934.00 frames. ], tot_loss[loss=2.723, ArTop10Accuracy=0.7846, over 10850.62 frames. ], batch size: 22, lr: 7.12e-03
2024-08-06 13:07:39,880 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.140e+02 1.293e+02 1.386e+02 1.488e+02 3.253e+02, threshold=2.772e+02, percent-clipped=0.1
2024-08-06 13:08:22,689 INFO [trainer.py:765] (0/8) Epoch 17, batch 600, train_loss[loss=2.689, ArTop10Accuracy=0.7944, over 11556.00 frames. ], tot_loss[loss=2.725, ArTop10Accuracy=0.7845, over 11380.73 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 13:09:54,835 INFO [trainer.py:765] (0/8) Epoch 17, batch 700, train_loss[loss=2.51, ArTop10Accuracy=0.8276, over 10056.00 frames. ], tot_loss[loss=2.732, ArTop10Accuracy=0.7833, over 11512.80 frames. ], batch size: 12, lr: 7.09e-03
2024-08-06 13:11:19,481 INFO [trainer.py:765] (0/8) Epoch 17, batch 800, train_loss[loss=2.601, ArTop10Accuracy=0.8115, over 10008.00 frames. ], tot_loss[loss=2.733, ArTop10Accuracy=0.7831, over 11634.34 frames. ], batch size: 12, lr: 7.07e-03
2024-08-06 13:12:35,670 INFO [trainer.py:765] (0/8) Epoch 17, batch 900, train_loss[loss=2.754, ArTop10Accuracy=0.7803, over 13182.00 frames. ], tot_loss[loss=2.731, ArTop10Accuracy=0.7836, over 11691.70 frames. ], batch size: 28, lr: 7.06e-03
2024-08-06 13:13:53,063 INFO [trainer.py:765] (0/8) Epoch 17, batch 1000, train_loss[loss=2.715, ArTop10Accuracy=0.79, over 13344.00 frames. ], tot_loss[loss=2.737, ArTop10Accuracy=0.7822, over 11909.14 frames. ], batch size: 28, lr: 7.04e-03
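Each trainer.py:765 line pairs an instantaneous train_loss for the current batch with tot_loss, a running frame-weighted average whose "over N frames" weight grows from roughly 4.8k frames at batch 100 toward about 12k later in each epoch, which is consistent with an exponentially decayed accumulator rather than a plain cumulative sum. The sketch below reproduces that kind of statistic; the decay constant and the bookkeeping details are assumptions, not a copy of the recipe's metrics tracker.

```python
class RunningLoss:
    """Frame-weighted, exponentially decayed running loss.

    Illustrative sketch: `decay_batches` is an assumed time constant chosen
    only to mimic how the logged frame count saturates within an epoch.
    """

    def __init__(self, decay_batches: int = 200):
        self.decay = 1.0 - 1.0 / decay_batches
        self.scale = 1.0 / decay_batches
        self.frames = 0.0       # decayed frame count ("over N frames" in the log)
        self.loss_sum = 0.0     # decayed, frame-weighted sum of losses

    def update(self, batch_loss: float, batch_frames: float) -> None:
        self.frames = self.frames * self.decay + batch_frames * self.scale
        self.loss_sum = self.loss_sum * self.decay + batch_loss * batch_frames * self.scale

    @property
    def loss(self) -> float:
        return self.loss_sum / max(self.frames, 1.0)
```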
2024-08-06 13:15:08,483 INFO [trainer.py:765] (0/8) Epoch 17, batch 1100, train_loss[loss=2.693, ArTop10Accuracy=0.7931, over 13545.00 frames. ], tot_loss[loss=2.744, ArTop10Accuracy=0.7808, over 11965.37 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 13:16:22,389 INFO [trainer.py:765] (0/8) Epoch 17, batch 1200, train_loss[loss=2.875, ArTop10Accuracy=0.7577, over 12714.00 frames. ], tot_loss[loss=2.744, ArTop10Accuracy=0.7807, over 11871.85 frames. ], batch size: 101, lr: 7.01e-03
2024-08-06 13:17:21,130 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 13:17:21,134 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-17.pt
2024-08-06 13:19:15,995 INFO [trainer.py:765] (0/8) Epoch 18, batch 100, train_loss[loss=2.79, ArTop10Accuracy=0.7737, over 14730.00 frames. ], tot_loss[loss=2.738, ArTop10Accuracy=0.7815, over 4768.08 frames. ], batch size: 62, lr: 6.78e-03
2024-08-06 13:20:46,597 INFO [trainer.py:765] (0/8) Epoch 18, batch 200, train_loss[loss=2.71, ArTop10Accuracy=0.7899, over 13821.00 frames. ], tot_loss[loss=2.73, ArTop10Accuracy=0.7832, over 7767.37 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 13:21:55,105 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 13:22:04,751 INFO [trainer.py:811] (0/8) Epoch 18, validation: loss=2.817, ArTop10Accuracy=0.768, over 1827537.00 frames.
2024-08-06 13:22:04,752 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29524MB
2024-08-06 13:22:05,474 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.131e+02 1.323e+02 1.409e+02 1.514e+02 3.209e+02, threshold=2.818e+02, percent-clipped=0.1
2024-08-06 13:22:26,581 INFO [trainer.py:765] (0/8) Epoch 18, batch 300, train_loss[loss=2.789, ArTop10Accuracy=0.7743, over 14439.00 frames. ], tot_loss[loss=2.721, ArTop10Accuracy=0.7848, over 9396.72 frames. ], batch size: 45, lr: 6.76e-03
2024-08-06 13:23:57,928 INFO [trainer.py:765] (0/8) Epoch 18, batch 400, train_loss[loss=2.672, ArTop10Accuracy=0.7928, over 10332.00 frames. ], tot_loss[loss=2.718, ArTop10Accuracy=0.7856, over 10300.70 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 13:25:34,012 INFO [trainer.py:765] (0/8) Epoch 18, batch 500, train_loss[loss=2.645, ArTop10Accuracy=0.8047, over 12105.00 frames. ], tot_loss[loss=2.714, ArTop10Accuracy=0.7864, over 10847.39 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 13:27:00,632 INFO [trainer.py:765] (0/8) Epoch 18, batch 600, train_loss[loss=2.638, ArTop10Accuracy=0.7969, over 12132.00 frames. ], tot_loss[loss=2.716, ArTop10Accuracy=0.7859, over 11382.46 frames. ], batch size: 19, lr: 6.71e-03
2024-08-06 13:28:33,583 INFO [trainer.py:765] (0/8) Epoch 18, batch 700, train_loss[loss=2.602, ArTop10Accuracy=0.8091, over 10143.00 frames. ], tot_loss[loss=2.724, ArTop10Accuracy=0.7847, over 11529.41 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 13:29:54,986 INFO [trainer.py:765] (0/8) Epoch 18, batch 800, train_loss[loss=2.684, ArTop10Accuracy=0.7945, over 10206.00 frames. ], tot_loss[loss=2.727, ArTop10Accuracy=0.7842, over 11648.40 frames. ], batch size: 12, lr: 6.68e-03
2024-08-06 13:31:12,518 INFO [trainer.py:765] (0/8) Epoch 18, batch 900, train_loss[loss=2.679, ArTop10Accuracy=0.7971, over 12792.00 frames. ], tot_loss[loss=2.721, ArTop10Accuracy=0.7852, over 11695.39 frames. ], batch size: 27, lr: 6.67e-03
2024-08-06 13:32:26,550 INFO [trainer.py:765] (0/8) Epoch 18, batch 1000, train_loss[loss=2.67, ArTop10Accuracy=0.7925, over 12756.00 frames. ], tot_loss[loss=2.727, ArTop10Accuracy=0.7841, over 11888.08 frames. ], batch size: 27, lr: 6.66e-03
2024-08-06 13:33:41,496 INFO [trainer.py:765] (0/8) Epoch 18, batch 1100, train_loss[loss=2.722, ArTop10Accuracy=0.7837, over 13665.00 frames. ], tot_loss[loss=2.732, ArTop10Accuracy=0.7832, over 11958.37 frames. ], batch size: 34, lr: 6.64e-03
2024-08-06 13:34:54,675 INFO [trainer.py:765] (0/8) Epoch 18, batch 1200, train_loss[loss=2.85, ArTop10Accuracy=0.7681, over 12753.00 frames. ], tot_loss[loss=2.731, ArTop10Accuracy=0.7834, over 11867.53 frames. ], batch size: 101, lr: 6.63e-03
2024-08-06 13:35:51,064 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.124e+02 1.340e+02 1.433e+02 1.533e+02 2.444e+02, threshold=2.867e+02, percent-clipped=0.0
2024-08-06 13:35:54,972 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 13:35:54,974 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-18.pt
2024-08-06 13:37:48,624 INFO [trainer.py:765] (0/8) Epoch 19, batch 100, train_loss[loss=2.798, ArTop10Accuracy=0.7718, over 14607.00 frames. ], tot_loss[loss=2.719, ArTop10Accuracy=0.7848, over 4757.47 frames. ], batch size: 62, lr: 6.43e-03
2024-08-06 13:39:23,256 INFO [trainer.py:765] (0/8) Epoch 19, batch 200, train_loss[loss=2.751, ArTop10Accuracy=0.7775, over 13698.00 frames. ], tot_loss[loss=2.715, ArTop10Accuracy=0.7858, over 7745.97 frames. ], batch size: 34, lr: 6.41e-03
2024-08-06 13:40:48,358 INFO [trainer.py:765] (0/8) Epoch 19, batch 300, train_loss[loss=2.768, ArTop10Accuracy=0.7791, over 14661.00 frames. ], tot_loss[loss=2.708, ArTop10Accuracy=0.7874, over 9366.48 frames. ], batch size: 45, lr: 6.40e-03
2024-08-06 13:42:21,068 INFO [trainer.py:765] (0/8) Epoch 19, batch 400, train_loss[loss=2.615, ArTop10Accuracy=0.8009, over 10503.00 frames. ], tot_loss[loss=2.706, ArTop10Accuracy=0.7879, over 10283.24 frames. ], batch size: 14, lr: 6.39e-03
2024-08-06 13:43:44,955 INFO [trainer.py:765] (0/8) Epoch 19, batch 500, train_loss[loss=2.719, ArTop10Accuracy=0.7864, over 12033.00 frames. ], tot_loss[loss=2.704, ArTop10Accuracy=0.7883, over 10843.55 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 13:45:16,681 INFO [trainer.py:765] (0/8) Epoch 19, batch 600, train_loss[loss=2.701, ArTop10Accuracy=0.7897, over 11457.00 frames. ], tot_loss[loss=2.706, ArTop10Accuracy=0.788, over 11364.68 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 13:46:48,323 INFO [trainer.py:765] (0/8) Epoch 19, batch 700, train_loss[loss=2.563, ArTop10Accuracy=0.8169, over 10290.00 frames. ], tot_loss[loss=2.709, ArTop10Accuracy=0.7874, over 11523.58 frames. ], batch size: 12, lr: 6.35e-03
2024-08-06 13:48:11,883 INFO [trainer.py:765] (0/8) Epoch 19, batch 800, train_loss[loss=2.58, ArTop10Accuracy=0.8112, over 10350.00 frames. ], tot_loss[loss=2.713, ArTop10Accuracy=0.7865, over 11661.89 frames. ], batch size: 12, lr: 6.34e-03
2024-08-06 13:49:27,256 INFO [trainer.py:765] (0/8) Epoch 19, batch 900, train_loss[loss=2.645, ArTop10Accuracy=0.8003, over 13215.00 frames. ], tot_loss[loss=2.708, ArTop10Accuracy=0.7877, over 11704.49 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 13:50:40,654 INFO [trainer.py:803] (0/8) Computing validation loss
2024-08-06 13:50:50,535 INFO [trainer.py:811] (0/8) Epoch 19, validation: loss=2.818, ArTop10Accuracy=0.7679, over 1827537.00 frames.
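Every validation entry reports its loss "over 1827537.00 frames": the metric is a frame-weighted average over the entire dev set, which is why the denominator is identical at every validation pass. Below is a minimal sketch of that aggregation; the (summed_loss, num_frames) contract of `loss_fn` is an assumption for illustration, not the recipe's actual API.

```python
import torch

@torch.no_grad()
def validation_loss(model, dev_loader, loss_fn) -> float:
    """Frame-weighted average loss over the whole dev set.

    Assumes loss_fn(model, batch) returns (summed_loss, num_frames) for one
    batch; this contract is illustrative only.
    """
    model.eval()
    total_loss, total_frames = 0.0, 0.0
    for batch in dev_loader:
        summed_loss, num_frames = loss_fn(model, batch)
        total_loss += float(summed_loss)
        total_frames += float(num_frames)
    model.train()
    return total_loss / max(total_frames, 1.0)
```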
2024-08-06 13:50:50,535 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 29524MB
2024-08-06 13:50:51,491 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.161e+02 1.371e+02 1.455e+02 1.550e+02 3.697e+02, threshold=2.909e+02, percent-clipped=0.2
2024-08-06 13:50:52,917 INFO [trainer.py:765] (0/8) Epoch 19, batch 1000, train_loss[loss=2.755, ArTop10Accuracy=0.7868, over 13050.00 frames. ], tot_loss[loss=2.713, ArTop10Accuracy=0.7866, over 11906.01 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 13:52:08,264 INFO [trainer.py:765] (0/8) Epoch 19, batch 1100, train_loss[loss=2.738, ArTop10Accuracy=0.7811, over 13623.00 frames. ], tot_loss[loss=2.72, ArTop10Accuracy=0.7853, over 11973.95 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 13:53:22,314 INFO [trainer.py:765] (0/8) Epoch 19, batch 1200, train_loss[loss=2.835, ArTop10Accuracy=0.7578, over 12549.00 frames. ], tot_loss[loss=2.723, ArTop10Accuracy=0.7847, over 11876.88 frames. ], batch size: 103, lr: 6.28e-03
2024-08-06 13:54:21,954 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 13:54:21,958 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-19.pt
2024-08-06 13:56:12,902 INFO [trainer.py:765] (0/8) Epoch 20, batch 100, train_loss[loss=2.747, ArTop10Accuracy=0.7778, over 14466.00 frames. ], tot_loss[loss=2.71, ArTop10Accuracy=0.7868, over 4752.63 frames. ], batch size: 63, lr: 6.10e-03
2024-08-06 13:57:42,495 INFO [trainer.py:765] (0/8) Epoch 20, batch 200, train_loss[loss=2.695, ArTop10Accuracy=0.7915, over 13728.00 frames. ], tot_loss[loss=2.708, ArTop10Accuracy=0.7873, over 7731.34 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 13:59:15,430 INFO [trainer.py:765] (0/8) Epoch 20, batch 300, train_loss[loss=2.759, ArTop10Accuracy=0.775, over 14556.00 frames. ], tot_loss[loss=2.706, ArTop10Accuracy=0.7876, over 9388.49 frames. ], batch size: 45, lr: 6.08e-03
2024-08-06 14:00:44,358 INFO [trainer.py:765] (0/8) Epoch 20, batch 400, train_loss[loss=2.515, ArTop10Accuracy=0.8197, over 10503.00 frames. ], tot_loss[loss=2.699, ArTop10Accuracy=0.7887, over 10287.72 frames. ], batch size: 14, lr: 6.07e-03
2024-08-06 14:02:14,855 INFO [trainer.py:765] (0/8) Epoch 20, batch 500, train_loss[loss=2.665, ArTop10Accuracy=0.7981, over 12078.00 frames. ], tot_loss[loss=2.693, ArTop10Accuracy=0.7902, over 10843.66 frames. ], batch size: 22, lr: 6.06e-03
2024-08-06 14:03:40,853 INFO [trainer.py:765] (0/8) Epoch 20, batch 600, train_loss[loss=2.729, ArTop10Accuracy=0.7816, over 11331.00 frames. ], tot_loss[loss=2.695, ArTop10Accuracy=0.7897, over 11359.53 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 14:05:13,864 INFO [trainer.py:765] (0/8) Epoch 20, batch 700, train_loss[loss=2.575, ArTop10Accuracy=0.8131, over 10071.00 frames. ], tot_loss[loss=2.7, ArTop10Accuracy=0.789, over 11524.60 frames. ], batch size: 12, lr: 6.03e-03
2024-08-06 14:05:30,792 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.180e+02 1.365e+02 1.456e+02 1.550e+02 3.525e+02, threshold=2.913e+02, percent-clipped=0.1
2024-08-06 14:06:34,509 INFO [trainer.py:765] (0/8) Epoch 20, batch 800, train_loss[loss=2.737, ArTop10Accuracy=0.7809, over 10116.00 frames. ], tot_loss[loss=2.704, ArTop10Accuracy=0.7881, over 11642.44 frames. ], batch size: 12, lr: 6.02e-03
2024-08-06 14:07:50,944 INFO [trainer.py:765] (0/8) Epoch 20, batch 900, train_loss[loss=2.648, ArTop10Accuracy=0.796, over 12945.00 frames. ], tot_loss[loss=2.701, ArTop10Accuracy=0.7887, over 11672.81 frames. ], batch size: 27, lr: 6.01e-03
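The learning rate printed at the end of each line decays gently from batch to batch and takes a larger step down at epoch boundaries (for example 6.28e-03 at the end of epoch 19 versus 6.10e-03 at the start of epoch 20), which is the shape of an Eden-style schedule that discounts both the global batch index and the epoch number. The sketch below only reproduces that general shape; the exponents and the lr_batches/lr_epochs constants are illustrative guesses, not the values used in this run.

```python
def eden_like_lr(base_lr: float, batch: int, epoch: float,
                 lr_batches: float = 5000.0, lr_epochs: float = 4.0) -> float:
    """Eden-style decay in both the global batch index and the epoch number.

    Illustrative only: the constants and the -0.25 exponents are assumptions
    chosen to mimic the gradual within-epoch decay plus the extra step-down
    at epoch boundaries seen in the log.
    """
    batch_factor = ((batch ** 2 + lr_batches ** 2) / lr_batches ** 2) ** -0.25
    epoch_factor = ((epoch ** 2 + lr_epochs ** 2) / lr_epochs ** 2) ** -0.25
    return base_lr * batch_factor * epoch_factor
```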
2024-08-06 14:09:07,174 INFO [trainer.py:765] (0/8) Epoch 20, batch 1000, train_loss[loss=2.709, ArTop10Accuracy=0.7859, over 13047.00 frames. ], tot_loss[loss=2.708, ArTop10Accuracy=0.7876, over 11885.74 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 14:10:21,209 INFO [trainer.py:765] (0/8) Epoch 20, batch 1100, train_loss[loss=2.764, ArTop10Accuracy=0.7796, over 13680.00 frames. ], tot_loss[loss=2.716, ArTop10Accuracy=0.7862, over 11935.78 frames. ], batch size: 34, lr: 5.99e-03
2024-08-06 14:11:37,813 INFO [trainer.py:765] (0/8) Epoch 20, batch 1200, train_loss[loss=2.803, ArTop10Accuracy=0.7718, over 11997.00 frames. ], tot_loss[loss=2.718, ArTop10Accuracy=0.7858, over 11870.84 frames. ], batch size: 101, lr: 5.98e-03
2024-08-06 14:12:37,148 INFO [trainer.py:650] (0/8) Reaches end of dataloader.
2024-08-06 14:12:37,151 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-20.pt
2024-08-06 14:12:43,011 INFO [trainer.py:1069] (0/8) Done!
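Training ends after epoch 20 with the final checkpoint at exp/valle/epoch-20.pt. To inspect convergence from a log in this format, the per-batch trainer.py:765 lines can be scraped with a regular expression; a small sketch follows (the log path is a placeholder, and the field selection is just one reasonable choice).

```python
import re

# Extract (epoch, batch, tot_loss, ArTop10Accuracy, lr) from trainer.py:765 lines.
PATTERN = re.compile(
    r"Epoch (\d+), batch (\d+),.*?"
    r"tot_loss\[loss=([\d.]+), ArTop10Accuracy=([\d.]+),.*?"
    r"lr: ([\d.e+-]+)",
    re.DOTALL,  # tolerate entries that wrap across lines
)

def parse_log(path: str = "exp/valle/log-train.txt"):  # placeholder path
    with open(path) as f:
        text = f.read()
    records = []
    for m in PATTERN.finditer(text):
        records.append((int(m.group(1)),      # epoch
                        int(m.group(2)),      # batch
                        float(m.group(3)),    # running tot_loss
                        float(m.group(4)),    # running ArTop10Accuracy
                        float(m.group(5))))   # learning rate
    return records
```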