2024-08-06 14:23:41,767 INFO [trainer.py:870] (0/8) Training started 2024-08-06 14:23:41,773 INFO [trainer.py:889] (0/8) Device: cuda:0 2024-08-06 14:23:41,773 INFO [trainer.py:890] (0/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': None, 'icefall-git-sha1': None, 'icefall-git-date': None, 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6867463', 'IP address': '0.104.202.7'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 40, 'start_epoch': 100, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 100000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 2, 'dtype': 'float32', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 2, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 160, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000} 2024-08-06 14:23:41,773 INFO [trainer.py:892] (0/8) About to create model 2024-08-06 14:23:42,559 INFO [trainer.py:899] (0/8) Number of model parameters: 367386628 2024-08-06 14:23:42,559 INFO [checkpoint.py:112] (0/8) Loading checkpoint from exp/valle/epoch-99.pt 2024-08-06 14:23:47,526 INFO [trainer.py:914] (0/8) Using DDP 2024-08-06 14:23:49,643 INFO [datamodule.py:427] (0/8) About to get train cuts 2024-08-06 14:23:49,644 INFO [datamodule.py:434] (0/8) About to get dev cuts 2024-08-06 14:23:49,646 INFO [datamodule.py:292] (0/8) Disable SpecAugment 2024-08-06 14:23:49,646 INFO [datamodule.py:294] (0/8) About to create train dataset 2024-08-06 14:23:49,646 INFO [datamodule.py:323] (0/8) Using DynamicBucketingSampler 2024-08-06 14:23:50,267 INFO [datamodule.py:344] (0/8) About to create train dataloader 2024-08-06 14:23:50,267 INFO [datamodule.py:367] (0/8) About to create dev dataset 2024-08-06 14:23:50,599 INFO [datamodule.py:388] (0/8) About to create dev dataloader 2024-08-06 14:24:38,248 INFO [trainer.py:765] (0/8) Epoch 1, batch 100, train_loss[loss=105, NarTop10Accuracy=0.02097, over 7647.00 frames. ], tot_loss[loss=73.82, NarTop10Accuracy=0.04726, over 2372.89 frames. ], batch size: 32, lr: 2.25e-02 2024-08-06 14:25:07,518 INFO [trainer.py:765] (0/8) Epoch 1, batch 200, train_loss[loss=142.8, NarTop10Accuracy=0.01102, over 6753.00 frames. ], tot_loss[loss=97.51, NarTop10Accuracy=0.0428, over 3846.60 frames. ], batch size: 17, lr: 3.00e-02 2024-08-06 14:25:37,110 INFO [trainer.py:765] (0/8) Epoch 1, batch 300, train_loss[loss=106.3, NarTop10Accuracy=0.01929, over 6993.00 frames. ], tot_loss[loss=85.06, NarTop10Accuracy=0.04279, over 4664.29 frames. ], batch size: 22, lr: 3.00e-02 2024-08-06 14:26:07,483 INFO [trainer.py:765] (0/8) Epoch 1, batch 400, train_loss[loss=51.43, NarTop10Accuracy=0.01936, over 5139.00 frames. ], tot_loss[loss=67.66, NarTop10Accuracy=0.0466, over 5130.21 frames. ], batch size: 7, lr: 3.00e-02 2024-08-06 14:26:35,358 INFO [trainer.py:765] (0/8) Epoch 1, batch 500, train_loss[loss=14.94, NarTop10Accuracy=0.0248, over 6060.00 frames. ], tot_loss[loss=49, NarTop10Accuracy=0.04913, over 5394.70 frames. ], batch size: 11, lr: 2.99e-02 2024-08-06 14:27:04,001 INFO [trainer.py:765] (0/8) Epoch 1, batch 600, train_loss[loss=6.167, NarTop10Accuracy=0.1951, over 5670.00 frames. ], tot_loss[loss=33.42, NarTop10Accuracy=0.0545, over 5662.20 frames. ], batch size: 9, lr: 2.99e-02 2024-08-06 14:27:39,491 INFO [trainer.py:765] (0/8) Epoch 1, batch 700, train_loss[loss=6.837, NarTop10Accuracy=0.09458, over 5040.00 frames. ], tot_loss[loss=23.4, NarTop10Accuracy=0.06406, over 5731.17 frames. ], batch size: 6, lr: 2.99e-02 2024-08-06 14:28:08,833 INFO [trainer.py:765] (0/8) Epoch 1, batch 800, train_loss[loss=6.401, NarTop10Accuracy=0.1431, over 5061.00 frames. ], tot_loss[loss=17.15, NarTop10Accuracy=0.08487, over 5790.72 frames. ], batch size: 6, lr: 2.98e-02 2024-08-06 14:28:36,759 INFO [trainer.py:765] (0/8) Epoch 1, batch 900, train_loss[loss=5.758, NarTop10Accuracy=0.1671, over 6312.00 frames. ], tot_loss[loss=12.78, NarTop10Accuracy=0.1136, over 5810.92 frames. ], batch size: 13, lr: 2.98e-02 2024-08-06 14:29:12,588 INFO [trainer.py:765] (0/8) Epoch 1, batch 1000, train_loss[loss=5.851, NarTop10Accuracy=0.1673, over 6138.00 frames. ], tot_loss[loss=10.09, NarTop10Accuracy=0.1359, over 5910.81 frames. ], batch size: 13, lr: 2.97e-02 2024-08-06 14:29:42,826 INFO [trainer.py:765] (0/8) Epoch 1, batch 1100, train_loss[loss=5.707, NarTop10Accuracy=0.1984, over 6663.00 frames. ], tot_loss[loss=8.416, NarTop10Accuracy=0.1532, over 5940.25 frames. ], batch size: 17, lr: 2.96e-02 2024-08-06 14:30:11,470 INFO [trainer.py:765] (0/8) Epoch 1, batch 1200, train_loss[loss=5.887, NarTop10Accuracy=0.1764, over 7245.00 frames. ], tot_loss[loss=7.35, NarTop10Accuracy=0.1708, over 5937.84 frames. ], batch size: 31, lr: 2.96e-02 2024-08-06 14:30:48,752 INFO [trainer.py:765] (0/8) Epoch 1, batch 1300, train_loss[loss=5.249, NarTop10Accuracy=0.2838, over 5001.00 frames. ], tot_loss[loss=6.686, NarTop10Accuracy=0.1861, over 5990.56 frames. ], batch size: 6, lr: 2.95e-02 2024-08-06 14:31:18,145 INFO [trainer.py:765] (0/8) Epoch 1, batch 1400, train_loss[loss=5.694, NarTop10Accuracy=0.191, over 6123.00 frames. ], tot_loss[loss=6.256, NarTop10Accuracy=0.1972, over 6011.67 frames. ], batch size: 11, lr: 2.94e-02 2024-08-06 14:31:46,027 INFO [trainer.py:765] (0/8) Epoch 1, batch 1500, train_loss[loss=5.649, NarTop10Accuracy=0.206, over 6378.00 frames. ], tot_loss[loss=5.971, NarTop10Accuracy=0.2093, over 5947.78 frames. ], batch size: 51, lr: 2.94e-02 2024-08-06 14:32:13,693 INFO [trainer.py:765] (0/8) Epoch 1, batch 1600, train_loss[loss=5.52, NarTop10Accuracy=0.226, over 7074.00 frames. ], tot_loss[loss=5.79, NarTop10Accuracy=0.2179, over 5913.02 frames. ], batch size: 22, lr: 2.93e-02 2024-08-06 14:32:40,199 INFO [trainer.py:765] (0/8) Epoch 1, batch 1700, train_loss[loss=5.463, NarTop10Accuracy=0.2397, over 6135.00 frames. ], tot_loss[loss=5.669, NarTop10Accuracy=0.2252, over 5903.26 frames. ], batch size: 13, lr: 2.92e-02 2024-08-06 14:33:06,500 INFO [trainer.py:765] (0/8) Epoch 1, batch 1800, train_loss[loss=5.624, NarTop10Accuracy=0.1997, over 7122.00 frames. ], tot_loss[loss=5.588, NarTop10Accuracy=0.2304, over 5955.21 frames. ], batch size: 23, lr: 2.91e-02 2024-08-06 14:33:32,626 INFO [trainer.py:765] (0/8) Epoch 1, batch 1900, train_loss[loss=5.713, NarTop10Accuracy=0.1862, over 6579.00 frames. ], tot_loss[loss=5.513, NarTop10Accuracy=0.2396, over 6011.46 frames. ], batch size: 51, lr: 2.90e-02 2024-08-06 14:33:58,015 INFO [trainer.py:765] (0/8) Epoch 1, batch 2000, train_loss[loss=5.458, NarTop10Accuracy=0.2503, over 5856.00 frames. ], tot_loss[loss=5.446, NarTop10Accuracy=0.2494, over 6003.31 frames. ], batch size: 51, lr: 2.89e-02 2024-08-06 14:33:58,017 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 14:34:06,103 INFO [trainer.py:811] (0/8) Epoch 1, validation: loss=5.397, NarTop10Accuracy=0.2581, over 1905321.00 frames. 2024-08-06 14:34:06,104 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 26917MB 2024-08-06 14:34:06,612 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 4.749e+01 2.278e+02 7.300e+02 1.664e+04 7.177e+05, threshold=1.460e+03, percent-clipped=0.0 2024-08-06 14:34:32,061 INFO [trainer.py:765] (0/8) Epoch 1, batch 2100, train_loss[loss=5.12, NarTop10Accuracy=0.3081, over 4809.00 frames. ], tot_loss[loss=5.388, NarTop10Accuracy=0.2592, over 5970.78 frames. ], batch size: 5, lr: 2.88e-02 2024-08-06 14:34:57,303 INFO [trainer.py:765] (0/8) Epoch 1, batch 2200, train_loss[loss=5.504, NarTop10Accuracy=0.2395, over 7437.00 frames. ], tot_loss[loss=5.36, NarTop10Accuracy=0.2634, over 6011.58 frames. ], batch size: 33, lr: 2.87e-02 2024-08-06 14:35:22,455 INFO [trainer.py:765] (0/8) Epoch 1, batch 2300, train_loss[loss=5.201, NarTop10Accuracy=0.2931, over 5613.00 frames. ], tot_loss[loss=5.339, NarTop10Accuracy=0.2667, over 6023.24 frames. ], batch size: 9, lr: 2.86e-02 2024-08-06 14:35:46,815 INFO [trainer.py:765] (0/8) Epoch 1, batch 2400, train_loss[loss=5.374, NarTop10Accuracy=0.2526, over 5205.00 frames. ], tot_loss[loss=5.282, NarTop10Accuracy=0.2775, over 5777.18 frames. ], batch size: 7, lr: 2.85e-02 2024-08-06 14:36:10,408 INFO [trainer.py:765] (0/8) Epoch 1, batch 2500, train_loss[loss=5.143, NarTop10Accuracy=0.3044, over 5031.00 frames. ], tot_loss[loss=5.217, NarTop10Accuracy=0.2889, over 5500.73 frames. ], batch size: 7, lr: 2.84e-02 2024-08-06 14:36:31,000 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 14:36:31,003 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-1.pt 2024-08-06 14:37:29,671 INFO [trainer.py:765] (0/8) Epoch 2, batch 100, train_loss[loss=4.991, NarTop10Accuracy=0.3392, over 7191.00 frames. ], tot_loss[loss=5.184, NarTop10Accuracy=0.2963, over 2375.63 frames. ], batch size: 31, lr: 2.77e-02 2024-08-06 14:38:10,016 INFO [trainer.py:765] (0/8) Epoch 2, batch 200, train_loss[loss=5.163, NarTop10Accuracy=0.3069, over 6723.00 frames. ], tot_loss[loss=5.16, NarTop10Accuracy=0.2996, over 3850.00 frames. ], batch size: 17, lr: 2.76e-02 2024-08-06 14:38:38,298 INFO [trainer.py:765] (0/8) Epoch 2, batch 300, train_loss[loss=5.1, NarTop10Accuracy=0.3142, over 6855.00 frames. ], tot_loss[loss=5.131, NarTop10Accuracy=0.3048, over 4655.37 frames. ], batch size: 22, lr: 2.75e-02 2024-08-06 14:39:07,000 INFO [trainer.py:765] (0/8) Epoch 2, batch 400, train_loss[loss=4.828, NarTop10Accuracy=0.3566, over 5016.00 frames. ], tot_loss[loss=5.113, NarTop10Accuracy=0.3074, over 5119.47 frames. ], batch size: 7, lr: 2.74e-02 2024-08-06 14:39:46,120 INFO [trainer.py:765] (0/8) Epoch 2, batch 500, train_loss[loss=4.949, NarTop10Accuracy=0.3396, over 6144.00 frames. ], tot_loss[loss=5.07, NarTop10Accuracy=0.316, over 5389.35 frames. ], batch size: 11, lr: 2.73e-02 2024-08-06 14:40:15,084 INFO [trainer.py:765] (0/8) Epoch 2, batch 600, train_loss[loss=4.951, NarTop10Accuracy=0.3419, over 5760.00 frames. ], tot_loss[loss=5.046, NarTop10Accuracy=0.3206, over 5665.67 frames. ], batch size: 9, lr: 2.71e-02 2024-08-06 14:40:44,592 INFO [trainer.py:765] (0/8) Epoch 2, batch 700, train_loss[loss=5.12, NarTop10Accuracy=0.3019, over 5169.00 frames. ], tot_loss[loss=5.033, NarTop10Accuracy=0.3227, over 5715.02 frames. ], batch size: 6, lr: 2.70e-02 2024-08-06 14:41:24,516 INFO [trainer.py:765] (0/8) Epoch 2, batch 800, train_loss[loss=4.982, NarTop10Accuracy=0.3311, over 4239.00 frames. ], tot_loss[loss=5.019, NarTop10Accuracy=0.3247, over 5768.80 frames. ], batch size: 5, lr: 2.69e-02 2024-08-06 14:41:54,406 INFO [trainer.py:765] (0/8) Epoch 2, batch 900, train_loss[loss=4.866, NarTop10Accuracy=0.3503, over 6240.00 frames. ], tot_loss[loss=4.981, NarTop10Accuracy=0.3323, over 5789.80 frames. ], batch size: 13, lr: 2.68e-02 2024-08-06 14:42:23,903 INFO [trainer.py:765] (0/8) Epoch 2, batch 1000, train_loss[loss=4.773, NarTop10Accuracy=0.3753, over 6576.00 frames. ], tot_loss[loss=4.949, NarTop10Accuracy=0.3387, over 5883.22 frames. ], batch size: 14, lr: 2.66e-02 2024-08-06 14:42:56,256 INFO [trainer.py:765] (0/8) Epoch 2, batch 1100, train_loss[loss=4.971, NarTop10Accuracy=0.3313, over 6870.00 frames. ], tot_loss[loss=4.929, NarTop10Accuracy=0.3423, over 5921.81 frames. ], batch size: 17, lr: 2.65e-02 2024-08-06 14:43:35,189 INFO [trainer.py:765] (0/8) Epoch 2, batch 1200, train_loss[loss=4.731, NarTop10Accuracy=0.3814, over 7320.00 frames. ], tot_loss[loss=4.91, NarTop10Accuracy=0.3457, over 5918.38 frames. ], batch size: 31, lr: 2.64e-02 2024-08-06 14:44:04,347 INFO [trainer.py:765] (0/8) Epoch 2, batch 1300, train_loss[loss=4.799, NarTop10Accuracy=0.369, over 5079.00 frames. ], tot_loss[loss=4.866, NarTop10Accuracy=0.3539, over 5986.90 frames. ], batch size: 6, lr: 2.63e-02 2024-08-06 14:44:33,728 INFO [trainer.py:765] (0/8) Epoch 2, batch 1400, train_loss[loss=4.98, NarTop10Accuracy=0.3216, over 6153.00 frames. ], tot_loss[loss=4.844, NarTop10Accuracy=0.3582, over 6020.35 frames. ], batch size: 11, lr: 2.61e-02 2024-08-06 14:44:40,444 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 14:44:48,506 INFO [trainer.py:811] (0/8) Epoch 2, validation: loss=4.808, NarTop10Accuracy=0.3642, over 1905321.00 frames. 2024-08-06 14:44:48,506 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 27188MB 2024-08-06 14:44:49,204 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 6.328e+01 1.178e+02 1.410e+02 1.789e+02 6.269e+02, threshold=2.821e+02, percent-clipped=0.0 2024-08-06 14:45:09,806 INFO [trainer.py:765] (0/8) Epoch 2, batch 1500, train_loss[loss=4.809, NarTop10Accuracy=0.3668, over 6102.00 frames. ], tot_loss[loss=4.824, NarTop10Accuracy=0.3621, over 5958.65 frames. ], batch size: 51, lr: 2.60e-02 2024-08-06 14:45:37,659 INFO [trainer.py:765] (0/8) Epoch 2, batch 1600, train_loss[loss=4.768, NarTop10Accuracy=0.3801, over 7035.00 frames. ], tot_loss[loss=4.804, NarTop10Accuracy=0.3657, over 5933.77 frames. ], batch size: 22, lr: 2.59e-02 2024-08-06 14:46:04,368 INFO [trainer.py:765] (0/8) Epoch 2, batch 1700, train_loss[loss=4.818, NarTop10Accuracy=0.3555, over 6576.00 frames. ], tot_loss[loss=4.792, NarTop10Accuracy=0.368, over 5925.42 frames. ], batch size: 14, lr: 2.58e-02 2024-08-06 14:46:31,034 INFO [trainer.py:765] (0/8) Epoch 2, batch 1800, train_loss[loss=4.731, NarTop10Accuracy=0.3781, over 6921.00 frames. ], tot_loss[loss=4.772, NarTop10Accuracy=0.372, over 5993.33 frames. ], batch size: 22, lr: 2.56e-02 2024-08-06 14:46:57,532 INFO [trainer.py:765] (0/8) Epoch 2, batch 1900, train_loss[loss=4.704, NarTop10Accuracy=0.3845, over 5661.00 frames. ], tot_loss[loss=4.756, NarTop10Accuracy=0.3749, over 6019.38 frames. ], batch size: 50, lr: 2.55e-02 2024-08-06 14:47:23,234 INFO [trainer.py:765] (0/8) Epoch 2, batch 2000, train_loss[loss=4.906, NarTop10Accuracy=0.3549, over 6351.00 frames. ], tot_loss[loss=4.73, NarTop10Accuracy=0.3797, over 5991.27 frames. ], batch size: 53, lr: 2.54e-02 2024-08-06 14:47:48,589 INFO [trainer.py:765] (0/8) Epoch 2, batch 2100, train_loss[loss=4.848, NarTop10Accuracy=0.3518, over 3993.00 frames. ], tot_loss[loss=4.719, NarTop10Accuracy=0.3816, over 5979.39 frames. ], batch size: 4, lr: 2.53e-02 2024-08-06 14:48:13,765 INFO [trainer.py:765] (0/8) Epoch 2, batch 2200, train_loss[loss=4.635, NarTop10Accuracy=0.3985, over 7368.00 frames. ], tot_loss[loss=4.681, NarTop10Accuracy=0.389, over 6007.30 frames. ], batch size: 31, lr: 2.51e-02 2024-08-06 14:48:38,951 INFO [trainer.py:765] (0/8) Epoch 2, batch 2300, train_loss[loss=4.958, NarTop10Accuracy=0.326, over 5823.00 frames. ], tot_loss[loss=4.687, NarTop10Accuracy=0.3878, over 6021.27 frames. ], batch size: 9, lr: 2.50e-02 2024-08-06 14:49:03,320 INFO [trainer.py:765] (0/8) Epoch 2, batch 2400, train_loss[loss=4.474, NarTop10Accuracy=0.4173, over 5052.00 frames. ], tot_loss[loss=4.647, NarTop10Accuracy=0.3955, over 5773.42 frames. ], batch size: 7, lr: 2.49e-02 2024-08-06 14:49:26,867 INFO [trainer.py:765] (0/8) Epoch 2, batch 2500, train_loss[loss=4.64, NarTop10Accuracy=0.3929, over 5148.00 frames. ], tot_loss[loss=4.611, NarTop10Accuracy=0.4024, over 5463.74 frames. ], batch size: 7, lr: 2.48e-02 2024-08-06 14:49:46,775 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 14:49:46,779 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-2.pt 2024-08-06 14:50:51,117 INFO [trainer.py:765] (0/8) Epoch 3, batch 100, train_loss[loss=4.81, NarTop10Accuracy=0.3602, over 7140.00 frames. ], tot_loss[loss=4.586, NarTop10Accuracy=0.4079, over 2365.43 frames. ], batch size: 31, lr: 2.36e-02 2024-08-06 14:51:20,388 INFO [trainer.py:765] (0/8) Epoch 3, batch 200, train_loss[loss=4.805, NarTop10Accuracy=0.357, over 6777.00 frames. ], tot_loss[loss=4.547, NarTop10Accuracy=0.4157, over 3852.19 frames. ], batch size: 17, lr: 2.34e-02 2024-08-06 14:51:50,954 INFO [trainer.py:765] (0/8) Epoch 3, batch 300, train_loss[loss=4.713, NarTop10Accuracy=0.3716, over 7197.00 frames. ], tot_loss[loss=4.516, NarTop10Accuracy=0.4214, over 4636.61 frames. ], batch size: 22, lr: 2.33e-02 2024-08-06 14:52:32,359 INFO [trainer.py:765] (0/8) Epoch 3, batch 400, train_loss[loss=4.327, NarTop10Accuracy=0.462, over 5748.00 frames. ], tot_loss[loss=4.499, NarTop10Accuracy=0.4246, over 5104.02 frames. ], batch size: 8, lr: 2.32e-02 2024-08-06 14:53:00,680 INFO [trainer.py:765] (0/8) Epoch 3, batch 500, train_loss[loss=4.365, NarTop10Accuracy=0.4534, over 6012.00 frames. ], tot_loss[loss=4.485, NarTop10Accuracy=0.4271, over 5391.19 frames. ], batch size: 11, lr: 2.31e-02 2024-08-06 14:53:29,551 INFO [trainer.py:765] (0/8) Epoch 3, batch 600, train_loss[loss=4.103, NarTop10Accuracy=0.507, over 5727.00 frames. ], tot_loss[loss=4.472, NarTop10Accuracy=0.43, over 5655.49 frames. ], batch size: 9, lr: 2.30e-02 2024-08-06 14:54:12,466 INFO [trainer.py:765] (0/8) Epoch 3, batch 700, train_loss[loss=4.268, NarTop10Accuracy=0.4737, over 5190.00 frames. ], tot_loss[loss=4.449, NarTop10Accuracy=0.4346, over 5749.35 frames. ], batch size: 6, lr: 2.29e-02 2024-08-06 14:54:44,785 INFO [trainer.py:765] (0/8) Epoch 3, batch 800, train_loss[loss=4.224, NarTop10Accuracy=0.4837, over 5067.00 frames. ], tot_loss[loss=4.417, NarTop10Accuracy=0.4408, over 5814.17 frames. ], batch size: 6, lr: 2.28e-02 2024-08-06 14:54:58,684 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 14:55:06,655 INFO [trainer.py:811] (0/8) Epoch 3, validation: loss=4.276, NarTop10Accuracy=0.4689, over 1905321.00 frames. 2024-08-06 14:55:06,656 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 27314MB 2024-08-06 14:55:07,183 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 8.443e+01 1.396e+02 1.639e+02 2.017e+02 7.124e+02, threshold=3.277e+02, percent-clipped=4.5 2024-08-06 14:55:21,052 INFO [trainer.py:765] (0/8) Epoch 3, batch 900, train_loss[loss=4.033, NarTop10Accuracy=0.5184, over 6273.00 frames. ], tot_loss[loss=4.39, NarTop10Accuracy=0.4463, over 5836.51 frames. ], batch size: 13, lr: 2.26e-02 2024-08-06 14:56:04,958 INFO [trainer.py:765] (0/8) Epoch 3, batch 1000, train_loss[loss=4.229, NarTop10Accuracy=0.4771, over 6246.00 frames. ], tot_loss[loss=4.373, NarTop10Accuracy=0.4494, over 5922.26 frames. ], batch size: 13, lr: 2.25e-02 2024-08-06 14:56:37,301 INFO [trainer.py:765] (0/8) Epoch 3, batch 1100, train_loss[loss=4.378, NarTop10Accuracy=0.4348, over 6687.00 frames. ], tot_loss[loss=4.344, NarTop10Accuracy=0.4548, over 5939.32 frames. ], batch size: 17, lr: 2.24e-02 2024-08-06 14:57:06,378 INFO [trainer.py:765] (0/8) Epoch 3, batch 1200, train_loss[loss=4.296, NarTop10Accuracy=0.4554, over 7257.00 frames. ], tot_loss[loss=4.325, NarTop10Accuracy=0.4585, over 5932.72 frames. ], batch size: 31, lr: 2.23e-02 2024-08-06 14:57:51,631 INFO [trainer.py:765] (0/8) Epoch 3, batch 1300, train_loss[loss=4.371, NarTop10Accuracy=0.4529, over 4245.00 frames. ], tot_loss[loss=4.304, NarTop10Accuracy=0.4628, over 5964.61 frames. ], batch size: 5, lr: 2.22e-02 2024-08-06 14:58:22,900 INFO [trainer.py:765] (0/8) Epoch 3, batch 1400, train_loss[loss=4.044, NarTop10Accuracy=0.5134, over 6153.00 frames. ], tot_loss[loss=4.294, NarTop10Accuracy=0.4645, over 6009.45 frames. ], batch size: 11, lr: 2.21e-02 2024-08-06 14:58:50,856 INFO [trainer.py:765] (0/8) Epoch 3, batch 1500, train_loss[loss=4.359, NarTop10Accuracy=0.4528, over 6273.00 frames. ], tot_loss[loss=4.276, NarTop10Accuracy=0.4683, over 5963.27 frames. ], batch size: 50, lr: 2.20e-02 2024-08-06 14:59:18,715 INFO [trainer.py:765] (0/8) Epoch 3, batch 1600, train_loss[loss=3.953, NarTop10Accuracy=0.5381, over 7149.00 frames. ], tot_loss[loss=4.26, NarTop10Accuracy=0.4716, over 5956.55 frames. ], batch size: 22, lr: 2.19e-02 2024-08-06 14:59:45,952 INFO [trainer.py:765] (0/8) Epoch 3, batch 1700, train_loss[loss=4.181, NarTop10Accuracy=0.4818, over 6693.00 frames. ], tot_loss[loss=4.232, NarTop10Accuracy=0.4769, over 5934.67 frames. ], batch size: 14, lr: 2.18e-02 2024-08-06 15:00:12,498 INFO [trainer.py:765] (0/8) Epoch 3, batch 1800, train_loss[loss=3.962, NarTop10Accuracy=0.5352, over 7137.00 frames. ], tot_loss[loss=4.213, NarTop10Accuracy=0.4808, over 5982.10 frames. ], batch size: 22, lr: 2.17e-02 2024-08-06 15:00:38,949 INFO [trainer.py:765] (0/8) Epoch 3, batch 1900, train_loss[loss=4.646, NarTop10Accuracy=0.3923, over 6207.00 frames. ], tot_loss[loss=4.199, NarTop10Accuracy=0.4836, over 6025.95 frames. ], batch size: 51, lr: 2.16e-02 2024-08-06 15:01:04,606 INFO [trainer.py:765] (0/8) Epoch 3, batch 2000, train_loss[loss=4.487, NarTop10Accuracy=0.4217, over 6312.00 frames. ], tot_loss[loss=4.163, NarTop10Accuracy=0.4908, over 5977.51 frames. ], batch size: 50, lr: 2.15e-02 2024-08-06 15:01:29,899 INFO [trainer.py:765] (0/8) Epoch 3, batch 2100, train_loss[loss=3.958, NarTop10Accuracy=0.5327, over 3945.00 frames. ], tot_loss[loss=4.14, NarTop10Accuracy=0.4955, over 5948.52 frames. ], batch size: 4, lr: 2.14e-02 2024-08-06 15:01:55,182 INFO [trainer.py:765] (0/8) Epoch 3, batch 2200, train_loss[loss=3.954, NarTop10Accuracy=0.5321, over 7194.00 frames. ], tot_loss[loss=4.115, NarTop10Accuracy=0.5012, over 6007.35 frames. ], batch size: 31, lr: 2.13e-02 2024-08-06 15:02:20,410 INFO [trainer.py:765] (0/8) Epoch 3, batch 2300, train_loss[loss=4.411, NarTop10Accuracy=0.441, over 5682.00 frames. ], tot_loss[loss=4.127, NarTop10Accuracy=0.4986, over 6011.85 frames. ], batch size: 9, lr: 2.12e-02 2024-08-06 15:02:44,663 INFO [trainer.py:765] (0/8) Epoch 3, batch 2400, train_loss[loss=4.354, NarTop10Accuracy=0.4518, over 5019.00 frames. ], tot_loss[loss=4.102, NarTop10Accuracy=0.5037, over 5774.34 frames. ], batch size: 7, lr: 2.11e-02 2024-08-06 15:03:08,235 INFO [trainer.py:765] (0/8) Epoch 3, batch 2500, train_loss[loss=3.895, NarTop10Accuracy=0.5494, over 5250.00 frames. ], tot_loss[loss=4.048, NarTop10Accuracy=0.5146, over 5475.66 frames. ], batch size: 7, lr: 2.10e-02 2024-08-06 15:03:28,391 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 15:03:28,393 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-3.pt 2024-08-06 15:04:28,130 INFO [trainer.py:765] (0/8) Epoch 4, batch 100, train_loss[loss=3.959, NarTop10Accuracy=0.5323, over 7410.00 frames. ], tot_loss[loss=4.031, NarTop10Accuracy=0.5179, over 2368.88 frames. ], batch size: 31, lr: 1.97e-02 2024-08-06 15:04:59,842 INFO [trainer.py:765] (0/8) Epoch 4, batch 200, train_loss[loss=3.704, NarTop10Accuracy=0.5801, over 6750.00 frames. ], tot_loss[loss=4.01, NarTop10Accuracy=0.5229, over 3861.12 frames. ], batch size: 17, lr: 1.96e-02 2024-08-06 15:05:27,509 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 15:05:35,694 INFO [trainer.py:811] (0/8) Epoch 4, validation: loss=3.804, NarTop10Accuracy=0.5644, over 1905321.00 frames. 2024-08-06 15:05:35,695 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 27314MB 2024-08-06 15:05:36,238 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.166e+02 1.765e+02 1.975e+02 2.270e+02 5.852e+02, threshold=3.949e+02, percent-clipped=2.8 2024-08-06 15:05:43,889 INFO [trainer.py:765] (0/8) Epoch 4, batch 300, train_loss[loss=3.767, NarTop10Accuracy=0.5713, over 7185.00 frames. ], tot_loss[loss=3.998, NarTop10Accuracy=0.5254, over 4661.91 frames. ], batch size: 22, lr: 1.95e-02 2024-08-06 15:06:16,124 INFO [trainer.py:765] (0/8) Epoch 4, batch 400, train_loss[loss=3.578, NarTop10Accuracy=0.6158, over 5229.00 frames. ], tot_loss[loss=4.008, NarTop10Accuracy=0.5231, over 5111.58 frames. ], batch size: 7, lr: 1.94e-02 2024-08-06 15:06:46,473 INFO [trainer.py:765] (0/8) Epoch 4, batch 500, train_loss[loss=4.22, NarTop10Accuracy=0.4842, over 6174.00 frames. ], tot_loss[loss=3.989, NarTop10Accuracy=0.5267, over 5380.80 frames. ], batch size: 11, lr: 1.93e-02 2024-08-06 15:07:23,817 INFO [trainer.py:765] (0/8) Epoch 4, batch 600, train_loss[loss=3.705, NarTop10Accuracy=0.588, over 5838.00 frames. ], tot_loss[loss=3.981, NarTop10Accuracy=0.528, over 5658.01 frames. ], batch size: 9, lr: 1.93e-02 2024-08-06 15:07:59,001 INFO [trainer.py:765] (0/8) Epoch 4, batch 700, train_loss[loss=4.371, NarTop10Accuracy=0.4459, over 4305.00 frames. ], tot_loss[loss=3.976, NarTop10Accuracy=0.5293, over 5741.18 frames. ], batch size: 5, lr: 1.92e-02 2024-08-06 15:08:32,429 INFO [trainer.py:765] (0/8) Epoch 4, batch 800, train_loss[loss=3.669, NarTop10Accuracy=0.5992, over 5055.00 frames. ], tot_loss[loss=3.962, NarTop10Accuracy=0.5322, over 5790.70 frames. ], batch size: 6, lr: 1.91e-02 2024-08-06 15:09:10,690 INFO [trainer.py:765] (0/8) Epoch 4, batch 900, train_loss[loss=3.559, NarTop10Accuracy=0.611, over 6246.00 frames. ], tot_loss[loss=3.925, NarTop10Accuracy=0.5396, over 5798.61 frames. ], batch size: 13, lr: 1.90e-02 2024-08-06 15:09:46,076 INFO [trainer.py:765] (0/8) Epoch 4, batch 1000, train_loss[loss=3.57, NarTop10Accuracy=0.6103, over 6225.00 frames. ], tot_loss[loss=3.911, NarTop10Accuracy=0.542, over 5902.03 frames. ], batch size: 13, lr: 1.89e-02 2024-08-06 15:10:18,140 INFO [trainer.py:765] (0/8) Epoch 4, batch 1100, train_loss[loss=3.842, NarTop10Accuracy=0.5547, over 6840.00 frames. ], tot_loss[loss=3.902, NarTop10Accuracy=0.5441, over 5938.36 frames. ], batch size: 17, lr: 1.88e-02 2024-08-06 15:10:55,075 INFO [trainer.py:765] (0/8) Epoch 4, batch 1200, train_loss[loss=4.37, NarTop10Accuracy=0.4399, over 7392.00 frames. ], tot_loss[loss=3.898, NarTop10Accuracy=0.545, over 5943.45 frames. ], batch size: 31, lr: 1.88e-02 2024-08-06 15:11:32,074 INFO [trainer.py:765] (0/8) Epoch 4, batch 1300, train_loss[loss=3.712, NarTop10Accuracy=0.5762, over 5076.00 frames. ], tot_loss[loss=3.859, NarTop10Accuracy=0.5525, over 6002.17 frames. ], batch size: 6, lr: 1.87e-02 2024-08-06 15:12:05,688 INFO [trainer.py:765] (0/8) Epoch 4, batch 1400, train_loss[loss=3.787, NarTop10Accuracy=0.5743, over 6069.00 frames. ], tot_loss[loss=3.859, NarTop10Accuracy=0.5528, over 6028.65 frames. ], batch size: 11, lr: 1.86e-02 2024-08-06 15:12:33,695 INFO [trainer.py:765] (0/8) Epoch 4, batch 1500, train_loss[loss=3.836, NarTop10Accuracy=0.5589, over 6168.00 frames. ], tot_loss[loss=3.862, NarTop10Accuracy=0.5522, over 5971.10 frames. ], batch size: 50, lr: 1.85e-02 2024-08-06 15:13:01,510 INFO [trainer.py:765] (0/8) Epoch 4, batch 1600, train_loss[loss=3.801, NarTop10Accuracy=0.5672, over 6996.00 frames. ], tot_loss[loss=3.856, NarTop10Accuracy=0.5534, over 5936.92 frames. ], batch size: 22, lr: 1.84e-02 2024-08-06 15:13:28,133 INFO [trainer.py:765] (0/8) Epoch 4, batch 1700, train_loss[loss=3.699, NarTop10Accuracy=0.583, over 6189.00 frames. ], tot_loss[loss=3.826, NarTop10Accuracy=0.5597, over 5930.29 frames. ], batch size: 13, lr: 1.84e-02 2024-08-06 15:13:54,557 INFO [trainer.py:765] (0/8) Epoch 4, batch 1800, train_loss[loss=3.77, NarTop10Accuracy=0.5676, over 7167.00 frames. ], tot_loss[loss=3.824, NarTop10Accuracy=0.56, over 5975.82 frames. ], batch size: 23, lr: 1.83e-02 2024-08-06 15:14:20,998 INFO [trainer.py:765] (0/8) Epoch 4, batch 1900, train_loss[loss=3.773, NarTop10Accuracy=0.573, over 6150.00 frames. ], tot_loss[loss=3.848, NarTop10Accuracy=0.5557, over 6012.83 frames. ], batch size: 50, lr: 1.82e-02 2024-08-06 15:14:46,672 INFO [trainer.py:765] (0/8) Epoch 4, batch 2000, train_loss[loss=3.771, NarTop10Accuracy=0.582, over 6834.00 frames. ], tot_loss[loss=3.828, NarTop10Accuracy=0.5596, over 6004.56 frames. ], batch size: 53, lr: 1.81e-02 2024-08-06 15:15:11,859 INFO [trainer.py:765] (0/8) Epoch 4, batch 2100, train_loss[loss=3.572, NarTop10Accuracy=0.603, over 4863.00 frames. ], tot_loss[loss=3.81, NarTop10Accuracy=0.5633, over 5990.80 frames. ], batch size: 5, lr: 1.81e-02 2024-08-06 15:15:37,089 INFO [trainer.py:765] (0/8) Epoch 4, batch 2200, train_loss[loss=3.668, NarTop10Accuracy=0.597, over 7176.00 frames. ], tot_loss[loss=3.804, NarTop10Accuracy=0.5641, over 6034.65 frames. ], batch size: 31, lr: 1.80e-02 2024-08-06 15:15:55,090 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 15:16:03,242 INFO [trainer.py:811] (0/8) Epoch 4, validation: loss=3.665, NarTop10Accuracy=0.5912, over 1905321.00 frames. 2024-08-06 15:16:03,243 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 27314MB 2024-08-06 15:16:03,741 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.414e+02 1.889e+02 2.096e+02 2.369e+02 1.168e+03, threshold=4.192e+02, percent-clipped=1.7 2024-08-06 15:16:10,347 INFO [trainer.py:765] (0/8) Epoch 4, batch 2300, train_loss[loss=3.624, NarTop10Accuracy=0.6049, over 5622.00 frames. ], tot_loss[loss=3.806, NarTop10Accuracy=0.5637, over 6042.25 frames. ], batch size: 9, lr: 1.79e-02 2024-08-06 15:16:34,840 INFO [trainer.py:765] (0/8) Epoch 4, batch 2400, train_loss[loss=3.616, NarTop10Accuracy=0.6004, over 5166.00 frames. ], tot_loss[loss=3.772, NarTop10Accuracy=0.5705, over 5778.66 frames. ], batch size: 7, lr: 1.79e-02 2024-08-06 15:16:58,535 INFO [trainer.py:765] (0/8) Epoch 4, batch 2500, train_loss[loss=3.387, NarTop10Accuracy=0.6682, over 5262.00 frames. ], tot_loss[loss=3.762, NarTop10Accuracy=0.5725, over 5475.32 frames. ], batch size: 7, lr: 1.78e-02 2024-08-06 15:17:18,131 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 15:17:18,134 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-4.pt 2024-08-06 15:18:24,101 INFO [trainer.py:765] (0/8) Epoch 5, batch 100, train_loss[loss=3.599, NarTop10Accuracy=0.6115, over 7323.00 frames. ], tot_loss[loss=3.78, NarTop10Accuracy=0.5691, over 2362.71 frames. ], batch size: 31, lr: 1.66e-02 2024-08-06 15:18:59,676 INFO [trainer.py:765] (0/8) Epoch 5, batch 200, train_loss[loss=4.161, NarTop10Accuracy=0.4899, over 6672.00 frames. ], tot_loss[loss=3.759, NarTop10Accuracy=0.5735, over 3852.17 frames. ], batch size: 17, lr: 1.65e-02 2024-08-06 15:19:32,888 INFO [trainer.py:765] (0/8) Epoch 5, batch 300, train_loss[loss=3.961, NarTop10Accuracy=0.5252, over 7227.00 frames. ], tot_loss[loss=3.732, NarTop10Accuracy=0.5786, over 4659.95 frames. ], batch size: 22, lr: 1.65e-02 2024-08-06 15:20:01,656 INFO [trainer.py:765] (0/8) Epoch 5, batch 400, train_loss[loss=3.658, NarTop10Accuracy=0.5945, over 5106.00 frames. ], tot_loss[loss=3.719, NarTop10Accuracy=0.5811, over 5116.71 frames. ], batch size: 7, lr: 1.64e-02 2024-08-06 15:20:38,298 INFO [trainer.py:765] (0/8) Epoch 5, batch 500, train_loss[loss=3.926, NarTop10Accuracy=0.5349, over 6000.00 frames. ], tot_loss[loss=3.738, NarTop10Accuracy=0.5774, over 5391.54 frames. ], batch size: 11, lr: 1.63e-02 2024-08-06 15:21:13,711 INFO [trainer.py:765] (0/8) Epoch 5, batch 600, train_loss[loss=4.088, NarTop10Accuracy=0.5148, over 5781.00 frames. ], tot_loss[loss=3.721, NarTop10Accuracy=0.5809, over 5647.77 frames. ], batch size: 9, lr: 1.63e-02 2024-08-06 15:21:45,881 INFO [trainer.py:765] (0/8) Epoch 5, batch 700, train_loss[loss=3.485, NarTop10Accuracy=0.6278, over 4236.00 frames. ], tot_loss[loss=3.719, NarTop10Accuracy=0.5813, over 5728.32 frames. ], batch size: 5, lr: 1.62e-02 2024-08-06 15:22:24,499 INFO [trainer.py:765] (0/8) Epoch 5, batch 800, train_loss[loss=3.909, NarTop10Accuracy=0.5329, over 5073.00 frames. ], tot_loss[loss=3.71, NarTop10Accuracy=0.583, over 5793.58 frames. ], batch size: 6, lr: 1.62e-02 2024-08-06 15:22:56,783 INFO [trainer.py:765] (0/8) Epoch 5, batch 900, train_loss[loss=3.67, NarTop10Accuracy=0.5935, over 6273.00 frames. ], tot_loss[loss=3.698, NarTop10Accuracy=0.5849, over 5795.50 frames. ], batch size: 13, lr: 1.61e-02 2024-08-06 15:23:31,914 INFO [trainer.py:765] (0/8) Epoch 5, batch 1000, train_loss[loss=3.487, NarTop10Accuracy=0.6365, over 6570.00 frames. ], tot_loss[loss=3.688, NarTop10Accuracy=0.5874, over 5896.10 frames. ], batch size: 14, lr: 1.60e-02 2024-08-06 15:24:09,571 INFO [trainer.py:765] (0/8) Epoch 5, batch 1100, train_loss[loss=3.508, NarTop10Accuracy=0.6337, over 6819.00 frames. ], tot_loss[loss=3.679, NarTop10Accuracy=0.5894, over 5921.13 frames. ], batch size: 17, lr: 1.60e-02 2024-08-06 15:24:44,529 INFO [trainer.py:765] (0/8) Epoch 5, batch 1200, train_loss[loss=3.53, NarTop10Accuracy=0.6194, over 7293.00 frames. ], tot_loss[loss=3.679, NarTop10Accuracy=0.5894, over 5900.74 frames. ], batch size: 31, lr: 1.59e-02 2024-08-06 15:25:19,380 INFO [trainer.py:765] (0/8) Epoch 5, batch 1300, train_loss[loss=3.732, NarTop10Accuracy=0.5818, over 4248.00 frames. ], tot_loss[loss=3.668, NarTop10Accuracy=0.5916, over 5978.22 frames. ], batch size: 5, lr: 1.59e-02 2024-08-06 15:25:51,694 INFO [trainer.py:765] (0/8) Epoch 5, batch 1400, train_loss[loss=3.866, NarTop10Accuracy=0.5558, over 6120.00 frames. ], tot_loss[loss=3.671, NarTop10Accuracy=0.5912, over 6034.20 frames. ], batch size: 11, lr: 1.58e-02 2024-08-06 15:26:26,195 INFO [trainer.py:765] (0/8) Epoch 5, batch 1500, train_loss[loss=3.609, NarTop10Accuracy=0.612, over 6519.00 frames. ], tot_loss[loss=3.666, NarTop10Accuracy=0.5921, over 5971.11 frames. ], batch size: 51, lr: 1.58e-02 2024-08-06 15:26:54,130 INFO [trainer.py:765] (0/8) Epoch 5, batch 1600, train_loss[loss=3.463, NarTop10Accuracy=0.6322, over 7092.00 frames. ], tot_loss[loss=3.674, NarTop10Accuracy=0.5902, over 5953.33 frames. ], batch size: 22, lr: 1.57e-02 2024-08-06 15:27:19,604 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 15:27:27,821 INFO [trainer.py:811] (0/8) Epoch 5, validation: loss=3.552, NarTop10Accuracy=0.6147, over 1905321.00 frames. 2024-08-06 15:27:27,822 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 27314MB 2024-08-06 15:27:28,341 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.340e+02 1.756e+02 1.962e+02 2.205e+02 5.880e+02, threshold=3.924e+02, percent-clipped=0.8 2024-08-06 15:27:29,131 INFO [trainer.py:765] (0/8) Epoch 5, batch 1700, train_loss[loss=3.708, NarTop10Accuracy=0.5883, over 6198.00 frames. ], tot_loss[loss=3.67, NarTop10Accuracy=0.5909, over 5927.21 frames. ], batch size: 13, lr: 1.56e-02 2024-08-06 15:27:55,653 INFO [trainer.py:765] (0/8) Epoch 5, batch 1800, train_loss[loss=3.878, NarTop10Accuracy=0.5434, over 7143.00 frames. ], tot_loss[loss=3.666, NarTop10Accuracy=0.592, over 5997.54 frames. ], batch size: 22, lr: 1.56e-02 2024-08-06 15:28:22,172 INFO [trainer.py:765] (0/8) Epoch 5, batch 1900, train_loss[loss=3.628, NarTop10Accuracy=0.602, over 6159.00 frames. ], tot_loss[loss=3.667, NarTop10Accuracy=0.5915, over 6029.41 frames. ], batch size: 50, lr: 1.55e-02 2024-08-06 15:28:47,894 INFO [trainer.py:765] (0/8) Epoch 5, batch 2000, train_loss[loss=3.735, NarTop10Accuracy=0.5803, over 6147.00 frames. ], tot_loss[loss=3.67, NarTop10Accuracy=0.5906, over 6005.61 frames. ], batch size: 50, lr: 1.55e-02 2024-08-06 15:29:13,770 INFO [trainer.py:765] (0/8) Epoch 5, batch 2100, train_loss[loss=3.37, NarTop10Accuracy=0.6461, over 3960.00 frames. ], tot_loss[loss=3.679, NarTop10Accuracy=0.5881, over 5986.90 frames. ], batch size: 4, lr: 1.54e-02 2024-08-06 15:29:39,177 INFO [trainer.py:765] (0/8) Epoch 5, batch 2200, train_loss[loss=4.029, NarTop10Accuracy=0.5076, over 7449.00 frames. ], tot_loss[loss=3.662, NarTop10Accuracy=0.5918, over 6022.85 frames. ], batch size: 31, lr: 1.54e-02 2024-08-06 15:30:04,430 INFO [trainer.py:765] (0/8) Epoch 5, batch 2300, train_loss[loss=3.39, NarTop10Accuracy=0.6471, over 5805.00 frames. ], tot_loss[loss=3.669, NarTop10Accuracy=0.5903, over 6037.63 frames. ], batch size: 9, lr: 1.53e-02 2024-08-06 15:30:28,862 INFO [trainer.py:765] (0/8) Epoch 5, batch 2400, train_loss[loss=3.389, NarTop10Accuracy=0.6427, over 4998.00 frames. ], tot_loss[loss=3.645, NarTop10Accuracy=0.5954, over 5796.16 frames. ], batch size: 7, lr: 1.53e-02 2024-08-06 15:30:52,503 INFO [trainer.py:765] (0/8) Epoch 5, batch 2500, train_loss[loss=3.313, NarTop10Accuracy=0.6675, over 5151.00 frames. ], tot_loss[loss=3.61, NarTop10Accuracy=0.6026, over 5476.85 frames. ], batch size: 7, lr: 1.52e-02 2024-08-06 15:31:12,425 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 15:31:12,429 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-5.pt 2024-08-06 15:32:14,415 INFO [trainer.py:765] (0/8) Epoch 6, batch 100, train_loss[loss=3.445, NarTop10Accuracy=0.6468, over 7026.00 frames. ], tot_loss[loss=3.635, NarTop10Accuracy=0.5987, over 2359.54 frames. ], batch size: 31, lr: 1.42e-02 2024-08-06 15:32:46,015 INFO [trainer.py:765] (0/8) Epoch 6, batch 200, train_loss[loss=3.978, NarTop10Accuracy=0.5324, over 6699.00 frames. ], tot_loss[loss=3.612, NarTop10Accuracy=0.6033, over 3855.40 frames. ], batch size: 17, lr: 1.42e-02 2024-08-06 15:33:21,242 INFO [trainer.py:765] (0/8) Epoch 6, batch 300, train_loss[loss=3.462, NarTop10Accuracy=0.636, over 7227.00 frames. ], tot_loss[loss=3.599, NarTop10Accuracy=0.6053, over 4676.93 frames. ], batch size: 22, lr: 1.41e-02 2024-08-06 15:33:56,035 INFO [trainer.py:765] (0/8) Epoch 6, batch 400, train_loss[loss=3.425, NarTop10Accuracy=0.6412, over 5157.00 frames. ], tot_loss[loss=3.591, NarTop10Accuracy=0.6072, over 5124.41 frames. ], batch size: 7, lr: 1.41e-02 2024-08-06 15:34:26,759 INFO [trainer.py:765] (0/8) Epoch 6, batch 500, train_loss[loss=3.283, NarTop10Accuracy=0.6653, over 6084.00 frames. ], tot_loss[loss=3.577, NarTop10Accuracy=0.6105, over 5395.95 frames. ], batch size: 11, lr: 1.40e-02 2024-08-06 15:35:01,458 INFO [trainer.py:765] (0/8) Epoch 6, batch 600, train_loss[loss=3.319, NarTop10Accuracy=0.6589, over 5730.00 frames. ], tot_loss[loss=3.58, NarTop10Accuracy=0.6094, over 5676.42 frames. ], batch size: 9, lr: 1.40e-02 2024-08-06 15:35:32,734 INFO [trainer.py:765] (0/8) Epoch 6, batch 700, train_loss[loss=3.368, NarTop10Accuracy=0.6528, over 5175.00 frames. ], tot_loss[loss=3.584, NarTop10Accuracy=0.6083, over 5747.23 frames. ], batch size: 6, lr: 1.39e-02 2024-08-06 15:36:06,844 INFO [trainer.py:765] (0/8) Epoch 6, batch 800, train_loss[loss=3.853, NarTop10Accuracy=0.5519, over 4419.00 frames. ], tot_loss[loss=3.593, NarTop10Accuracy=0.6062, over 5801.03 frames. ], batch size: 5, lr: 1.39e-02 2024-08-06 15:36:40,384 INFO [trainer.py:765] (0/8) Epoch 6, batch 900, train_loss[loss=3.935, NarTop10Accuracy=0.5378, over 6195.00 frames. ], tot_loss[loss=3.578, NarTop10Accuracy=0.6093, over 5808.51 frames. ], batch size: 13, lr: 1.38e-02 2024-08-06 15:37:15,272 INFO [trainer.py:765] (0/8) Epoch 6, batch 1000, train_loss[loss=3.544, NarTop10Accuracy=0.6245, over 6663.00 frames. ], tot_loss[loss=3.594, NarTop10Accuracy=0.6058, over 5893.57 frames. ], batch size: 14, lr: 1.38e-02 2024-08-06 15:37:50,508 INFO [trainer.py:765] (0/8) Epoch 6, batch 1100, train_loss[loss=3.513, NarTop10Accuracy=0.6372, over 6816.00 frames. ], tot_loss[loss=3.592, NarTop10Accuracy=0.6063, over 5931.53 frames. ], batch size: 17, lr: 1.38e-02 2024-08-06 15:37:55,828 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 15:38:04,436 INFO [trainer.py:811] (0/8) Epoch 6, validation: loss=3.421, NarTop10Accuracy=0.6418, over 1905321.00 frames. 2024-08-06 15:38:04,437 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 27314MB 2024-08-06 15:38:04,966 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.415e+02 1.809e+02 1.991e+02 2.234e+02 5.215e+02, threshold=3.983e+02, percent-clipped=0.5 2024-08-06 15:38:36,168 INFO [trainer.py:765] (0/8) Epoch 6, batch 1200, train_loss[loss=3.433, NarTop10Accuracy=0.6435, over 7284.00 frames. ], tot_loss[loss=3.58, NarTop10Accuracy=0.6087, over 5911.47 frames. ], batch size: 31, lr: 1.37e-02 2024-08-06 15:39:08,243 INFO [trainer.py:765] (0/8) Epoch 6, batch 1300, train_loss[loss=3.347, NarTop10Accuracy=0.6542, over 5049.00 frames. ], tot_loss[loss=3.576, NarTop10Accuracy=0.6101, over 5985.53 frames. ], batch size: 6, lr: 1.37e-02 2024-08-06 15:39:44,070 INFO [trainer.py:765] (0/8) Epoch 6, batch 1400, train_loss[loss=3.413, NarTop10Accuracy=0.6514, over 6036.00 frames. ], tot_loss[loss=3.574, NarTop10Accuracy=0.6109, over 6020.78 frames. ], batch size: 11, lr: 1.36e-02 2024-08-06 15:40:15,383 INFO [trainer.py:765] (0/8) Epoch 6, batch 1500, train_loss[loss=3.959, NarTop10Accuracy=0.5284, over 6717.00 frames. ], tot_loss[loss=3.567, NarTop10Accuracy=0.6118, over 5951.86 frames. ], batch size: 52, lr: 1.36e-02 2024-08-06 15:40:43,106 INFO [trainer.py:765] (0/8) Epoch 6, batch 1600, train_loss[loss=3.439, NarTop10Accuracy=0.6464, over 7218.00 frames. ], tot_loss[loss=3.569, NarTop10Accuracy=0.6115, over 5930.18 frames. ], batch size: 23, lr: 1.35e-02 2024-08-06 15:41:09,789 INFO [trainer.py:765] (0/8) Epoch 6, batch 1700, train_loss[loss=3.48, NarTop10Accuracy=0.6384, over 6372.00 frames. ], tot_loss[loss=3.557, NarTop10Accuracy=0.6137, over 5915.95 frames. ], batch size: 13, lr: 1.35e-02 2024-08-06 15:41:36,317 INFO [trainer.py:765] (0/8) Epoch 6, batch 1800, train_loss[loss=3.425, NarTop10Accuracy=0.6426, over 7140.00 frames. ], tot_loss[loss=3.565, NarTop10Accuracy=0.6119, over 5985.41 frames. ], batch size: 22, lr: 1.35e-02 2024-08-06 15:42:02,720 INFO [trainer.py:765] (0/8) Epoch 6, batch 1900, train_loss[loss=3.792, NarTop10Accuracy=0.5725, over 5724.00 frames. ], tot_loss[loss=3.581, NarTop10Accuracy=0.6086, over 6025.36 frames. ], batch size: 50, lr: 1.34e-02 2024-08-06 15:42:28,319 INFO [trainer.py:765] (0/8) Epoch 6, batch 2000, train_loss[loss=3.491, NarTop10Accuracy=0.6278, over 6216.00 frames. ], tot_loss[loss=3.575, NarTop10Accuracy=0.6098, over 5984.80 frames. ], batch size: 50, lr: 1.34e-02 2024-08-06 15:42:53,669 INFO [trainer.py:765] (0/8) Epoch 6, batch 2100, train_loss[loss=3.37, NarTop10Accuracy=0.6529, over 3882.00 frames. ], tot_loss[loss=3.561, NarTop10Accuracy=0.6129, over 5964.57 frames. ], batch size: 4, lr: 1.33e-02 2024-08-06 15:43:18,977 INFO [trainer.py:765] (0/8) Epoch 6, batch 2200, train_loss[loss=3.825, NarTop10Accuracy=0.5623, over 7215.00 frames. ], tot_loss[loss=3.568, NarTop10Accuracy=0.6114, over 6018.32 frames. ], batch size: 31, lr: 1.33e-02 2024-08-06 15:43:44,105 INFO [trainer.py:765] (0/8) Epoch 6, batch 2300, train_loss[loss=3.368, NarTop10Accuracy=0.6542, over 5808.00 frames. ], tot_loss[loss=3.573, NarTop10Accuracy=0.6107, over 6028.53 frames. ], batch size: 9, lr: 1.33e-02 2024-08-06 15:44:08,620 INFO [trainer.py:765] (0/8) Epoch 6, batch 2400, train_loss[loss=3.253, NarTop10Accuracy=0.6769, over 5124.00 frames. ], tot_loss[loss=3.548, NarTop10Accuracy=0.6157, over 5771.82 frames. ], batch size: 7, lr: 1.32e-02 2024-08-06 15:44:32,132 INFO [trainer.py:765] (0/8) Epoch 6, batch 2500, train_loss[loss=3.543, NarTop10Accuracy=0.6223, over 5070.00 frames. ], tot_loss[loss=3.529, NarTop10Accuracy=0.6191, over 5490.16 frames. ], batch size: 7, lr: 1.32e-02 2024-08-06 15:44:51,940 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 15:44:51,944 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-6.pt 2024-08-06 15:45:58,043 INFO [trainer.py:765] (0/8) Epoch 7, batch 100, train_loss[loss=3.413, NarTop10Accuracy=0.641, over 7431.00 frames. ], tot_loss[loss=3.537, NarTop10Accuracy=0.6181, over 2371.15 frames. ], batch size: 31, lr: 1.24e-02 2024-08-06 15:46:33,615 INFO [trainer.py:765] (0/8) Epoch 7, batch 200, train_loss[loss=3.476, NarTop10Accuracy=0.639, over 6834.00 frames. ], tot_loss[loss=3.53, NarTop10Accuracy=0.6191, over 3873.64 frames. ], batch size: 17, lr: 1.23e-02 2024-08-06 15:47:03,247 INFO [trainer.py:765] (0/8) Epoch 7, batch 300, train_loss[loss=3.754, NarTop10Accuracy=0.5664, over 7251.00 frames. ], tot_loss[loss=3.543, NarTop10Accuracy=0.6165, over 4672.98 frames. ], batch size: 22, lr: 1.23e-02 2024-08-06 15:47:34,496 INFO [trainer.py:765] (0/8) Epoch 7, batch 400, train_loss[loss=3.298, NarTop10Accuracy=0.6589, over 5085.00 frames. ], tot_loss[loss=3.529, NarTop10Accuracy=0.6193, over 5111.55 frames. ], batch size: 7, lr: 1.23e-02 2024-08-06 15:48:13,731 INFO [trainer.py:765] (0/8) Epoch 7, batch 500, train_loss[loss=3.562, NarTop10Accuracy=0.6082, over 6084.00 frames. ], tot_loss[loss=3.523, NarTop10Accuracy=0.6206, over 5392.41 frames. ], batch size: 11, lr: 1.22e-02 2024-08-06 15:48:26,370 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 15:48:34,533 INFO [trainer.py:811] (0/8) Epoch 7, validation: loss=3.326, NarTop10Accuracy=0.6612, over 1905321.00 frames. 2024-08-06 15:48:34,534 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 15:48:35,078 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.860e+02 2.018e+02 2.241e+02 5.111e+02, threshold=4.035e+02, percent-clipped=0.3 2024-08-06 15:48:52,721 INFO [trainer.py:765] (0/8) Epoch 7, batch 600, train_loss[loss=3.153, NarTop10Accuracy=0.6996, over 5727.00 frames. ], tot_loss[loss=3.522, NarTop10Accuracy=0.6207, over 5661.55 frames. ], batch size: 9, lr: 1.22e-02 2024-08-06 15:49:24,913 INFO [trainer.py:765] (0/8) Epoch 7, batch 700, train_loss[loss=3.569, NarTop10Accuracy=0.594, over 5202.00 frames. ], tot_loss[loss=3.514, NarTop10Accuracy=0.6223, over 5735.66 frames. ], batch size: 6, lr: 1.21e-02 2024-08-06 15:50:04,382 INFO [trainer.py:765] (0/8) Epoch 7, batch 800, train_loss[loss=3.363, NarTop10Accuracy=0.6648, over 5118.00 frames. ], tot_loss[loss=3.5, NarTop10Accuracy=0.6256, over 5800.32 frames. ], batch size: 6, lr: 1.21e-02 2024-08-06 15:50:34,550 INFO [trainer.py:765] (0/8) Epoch 7, batch 900, train_loss[loss=3.277, NarTop10Accuracy=0.6702, over 6195.00 frames. ], tot_loss[loss=3.494, NarTop10Accuracy=0.6265, over 5822.83 frames. ], batch size: 13, lr: 1.21e-02 2024-08-06 15:51:07,156 INFO [trainer.py:765] (0/8) Epoch 7, batch 1000, train_loss[loss=3.263, NarTop10Accuracy=0.6716, over 6372.00 frames. ], tot_loss[loss=3.49, NarTop10Accuracy=0.6275, over 5912.82 frames. ], batch size: 13, lr: 1.20e-02 2024-08-06 15:51:51,760 INFO [trainer.py:765] (0/8) Epoch 7, batch 1100, train_loss[loss=3.373, NarTop10Accuracy=0.6528, over 6822.00 frames. ], tot_loss[loss=3.492, NarTop10Accuracy=0.627, over 5933.00 frames. ], batch size: 17, lr: 1.20e-02 2024-08-06 15:52:22,700 INFO [trainer.py:765] (0/8) Epoch 7, batch 1200, train_loss[loss=3.368, NarTop10Accuracy=0.6553, over 7146.00 frames. ], tot_loss[loss=3.485, NarTop10Accuracy=0.6282, over 5921.73 frames. ], batch size: 31, lr: 1.20e-02 2024-08-06 15:52:52,008 INFO [trainer.py:765] (0/8) Epoch 7, batch 1300, train_loss[loss=3.264, NarTop10Accuracy=0.6692, over 4341.00 frames. ], tot_loss[loss=3.488, NarTop10Accuracy=0.6275, over 5977.49 frames. ], batch size: 5, lr: 1.19e-02 2024-08-06 15:53:33,843 INFO [trainer.py:765] (0/8) Epoch 7, batch 1400, train_loss[loss=3.409, NarTop10Accuracy=0.6467, over 6012.00 frames. ], tot_loss[loss=3.494, NarTop10Accuracy=0.6261, over 6021.77 frames. ], batch size: 11, lr: 1.19e-02 2024-08-06 15:54:04,600 INFO [trainer.py:765] (0/8) Epoch 7, batch 1500, train_loss[loss=3.784, NarTop10Accuracy=0.569, over 5850.00 frames. ], tot_loss[loss=3.475, NarTop10Accuracy=0.6302, over 5942.10 frames. ], batch size: 50, lr: 1.19e-02 2024-08-06 15:54:32,386 INFO [trainer.py:765] (0/8) Epoch 7, batch 1600, train_loss[loss=3.642, NarTop10Accuracy=0.5948, over 7005.00 frames. ], tot_loss[loss=3.486, NarTop10Accuracy=0.6279, over 5930.30 frames. ], batch size: 22, lr: 1.19e-02 2024-08-06 15:54:59,056 INFO [trainer.py:765] (0/8) Epoch 7, batch 1700, train_loss[loss=3.617, NarTop10Accuracy=0.5998, over 6645.00 frames. ], tot_loss[loss=3.497, NarTop10Accuracy=0.6251, over 5939.91 frames. ], batch size: 14, lr: 1.18e-02 2024-08-06 15:55:25,513 INFO [trainer.py:765] (0/8) Epoch 7, batch 1800, train_loss[loss=3.732, NarTop10Accuracy=0.5672, over 7101.00 frames. ], tot_loss[loss=3.488, NarTop10Accuracy=0.6275, over 5988.30 frames. ], batch size: 22, lr: 1.18e-02 2024-08-06 15:55:52,083 INFO [trainer.py:765] (0/8) Epoch 7, batch 1900, train_loss[loss=3.473, NarTop10Accuracy=0.6351, over 6111.00 frames. ], tot_loss[loss=3.507, NarTop10Accuracy=0.6236, over 6026.83 frames. ], batch size: 50, lr: 1.18e-02 2024-08-06 15:56:17,592 INFO [trainer.py:765] (0/8) Epoch 7, batch 2000, train_loss[loss=3.707, NarTop10Accuracy=0.5843, over 6405.00 frames. ], tot_loss[loss=3.505, NarTop10Accuracy=0.6242, over 6001.32 frames. ], batch size: 51, lr: 1.17e-02 2024-08-06 15:56:42,857 INFO [trainer.py:765] (0/8) Epoch 7, batch 2100, train_loss[loss=3.425, NarTop10Accuracy=0.6054, over 4068.00 frames. ], tot_loss[loss=3.486, NarTop10Accuracy=0.6281, over 5966.88 frames. ], batch size: 4, lr: 1.17e-02 2024-08-06 15:57:08,080 INFO [trainer.py:765] (0/8) Epoch 7, batch 2200, train_loss[loss=3.504, NarTop10Accuracy=0.6331, over 7044.00 frames. ], tot_loss[loss=3.504, NarTop10Accuracy=0.6244, over 6015.87 frames. ], batch size: 31, lr: 1.17e-02 2024-08-06 15:57:33,179 INFO [trainer.py:765] (0/8) Epoch 7, batch 2300, train_loss[loss=3.3, NarTop10Accuracy=0.6648, over 5808.00 frames. ], tot_loss[loss=3.506, NarTop10Accuracy=0.624, over 6029.06 frames. ], batch size: 9, lr: 1.16e-02 2024-08-06 15:57:57,620 INFO [trainer.py:765] (0/8) Epoch 7, batch 2400, train_loss[loss=3.372, NarTop10Accuracy=0.6556, over 5229.00 frames. ], tot_loss[loss=3.492, NarTop10Accuracy=0.6266, over 5775.86 frames. ], batch size: 7, lr: 1.16e-02 2024-08-06 15:58:21,089 INFO [trainer.py:765] (0/8) Epoch 7, batch 2500, train_loss[loss=3.687, NarTop10Accuracy=0.5931, over 5097.00 frames. ], tot_loss[loss=3.467, NarTop10Accuracy=0.6317, over 5461.86 frames. ], batch size: 7, lr: 1.16e-02 2024-08-06 15:58:31,566 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 15:58:39,769 INFO [trainer.py:811] (0/8) Epoch 7, validation: loss=3.381, NarTop10Accuracy=0.6488, over 1905321.00 frames. 2024-08-06 15:58:39,770 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 15:58:40,220 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.831e+02 1.996e+02 2.207e+02 5.229e+02, threshold=3.992e+02, percent-clipped=0.2 2024-08-06 15:58:49,186 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 15:58:49,190 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-7.pt 2024-08-06 15:59:52,877 INFO [trainer.py:765] (0/8) Epoch 8, batch 100, train_loss[loss=3.513, NarTop10Accuracy=0.6199, over 7377.00 frames. ], tot_loss[loss=3.454, NarTop10Accuracy=0.6352, over 2361.58 frames. ], batch size: 31, lr: 1.09e-02 2024-08-06 16:00:27,881 INFO [trainer.py:765] (0/8) Epoch 8, batch 200, train_loss[loss=3.34, NarTop10Accuracy=0.6549, over 6918.00 frames. ], tot_loss[loss=3.476, NarTop10Accuracy=0.6304, over 3845.46 frames. ], batch size: 17, lr: 1.09e-02 2024-08-06 16:00:58,563 INFO [trainer.py:765] (0/8) Epoch 8, batch 300, train_loss[loss=3.351, NarTop10Accuracy=0.6574, over 7149.00 frames. ], tot_loss[loss=3.469, NarTop10Accuracy=0.632, over 4653.78 frames. ], batch size: 22, lr: 1.08e-02 2024-08-06 16:01:29,760 INFO [trainer.py:765] (0/8) Epoch 8, batch 400, train_loss[loss=3.701, NarTop10Accuracy=0.5839, over 5175.00 frames. ], tot_loss[loss=3.47, NarTop10Accuracy=0.6315, over 5101.42 frames. ], batch size: 7, lr: 1.08e-02 2024-08-06 16:02:04,066 INFO [trainer.py:765] (0/8) Epoch 8, batch 500, train_loss[loss=3.821, NarTop10Accuracy=0.5507, over 6174.00 frames. ], tot_loss[loss=3.457, NarTop10Accuracy=0.6338, over 5379.20 frames. ], batch size: 11, lr: 1.08e-02 2024-08-06 16:02:41,836 INFO [trainer.py:765] (0/8) Epoch 8, batch 600, train_loss[loss=3.156, NarTop10Accuracy=0.7004, over 5844.00 frames. ], tot_loss[loss=3.475, NarTop10Accuracy=0.6302, over 5642.59 frames. ], batch size: 9, lr: 1.08e-02 2024-08-06 16:03:11,500 INFO [trainer.py:765] (0/8) Epoch 8, batch 700, train_loss[loss=3.648, NarTop10Accuracy=0.5958, over 5007.00 frames. ], tot_loss[loss=3.48, NarTop10Accuracy=0.6291, over 5714.94 frames. ], batch size: 6, lr: 1.07e-02 2024-08-06 16:03:50,084 INFO [trainer.py:765] (0/8) Epoch 8, batch 800, train_loss[loss=3.511, NarTop10Accuracy=0.6198, over 5040.00 frames. ], tot_loss[loss=3.47, NarTop10Accuracy=0.6308, over 5763.16 frames. ], batch size: 6, lr: 1.07e-02 2024-08-06 16:04:27,588 INFO [trainer.py:765] (0/8) Epoch 8, batch 900, train_loss[loss=3.278, NarTop10Accuracy=0.665, over 6663.00 frames. ], tot_loss[loss=3.449, NarTop10Accuracy=0.635, over 5778.07 frames. ], batch size: 14, lr: 1.07e-02 2024-08-06 16:04:57,466 INFO [trainer.py:765] (0/8) Epoch 8, batch 1000, train_loss[loss=3.609, NarTop10Accuracy=0.5924, over 6159.00 frames. ], tot_loss[loss=3.444, NarTop10Accuracy=0.6367, over 5881.94 frames. ], batch size: 13, lr: 1.07e-02 2024-08-06 16:05:37,294 INFO [trainer.py:765] (0/8) Epoch 8, batch 1100, train_loss[loss=3.623, NarTop10Accuracy=0.591, over 7125.00 frames. ], tot_loss[loss=3.443, NarTop10Accuracy=0.637, over 5905.93 frames. ], batch size: 18, lr: 1.06e-02 2024-08-06 16:06:15,859 INFO [trainer.py:765] (0/8) Epoch 8, batch 1200, train_loss[loss=3.447, NarTop10Accuracy=0.643, over 7335.00 frames. ], tot_loss[loss=3.456, NarTop10Accuracy=0.6341, over 5915.91 frames. ], batch size: 31, lr: 1.06e-02 2024-08-06 16:06:45,187 INFO [trainer.py:765] (0/8) Epoch 8, batch 1300, train_loss[loss=3.228, NarTop10Accuracy=0.6784, over 5172.00 frames. ], tot_loss[loss=3.441, NarTop10Accuracy=0.6372, over 5990.31 frames. ], batch size: 6, lr: 1.06e-02 2024-08-06 16:07:24,236 INFO [trainer.py:765] (0/8) Epoch 8, batch 1400, train_loss[loss=3.525, NarTop10Accuracy=0.6169, over 5997.00 frames. ], tot_loss[loss=3.447, NarTop10Accuracy=0.6362, over 6004.63 frames. ], batch size: 11, lr: 1.05e-02 2024-08-06 16:07:52,169 INFO [trainer.py:765] (0/8) Epoch 8, batch 1500, train_loss[loss=3.456, NarTop10Accuracy=0.6435, over 6444.00 frames. ], tot_loss[loss=3.43, NarTop10Accuracy=0.6397, over 5934.45 frames. ], batch size: 53, lr: 1.05e-02 2024-08-06 16:08:19,949 INFO [trainer.py:765] (0/8) Epoch 8, batch 1600, train_loss[loss=3.239, NarTop10Accuracy=0.6812, over 7329.00 frames. ], tot_loss[loss=3.426, NarTop10Accuracy=0.6401, over 5924.60 frames. ], batch size: 23, lr: 1.05e-02 2024-08-06 16:08:46,618 INFO [trainer.py:765] (0/8) Epoch 8, batch 1700, train_loss[loss=3.416, NarTop10Accuracy=0.6469, over 6309.00 frames. ], tot_loss[loss=3.426, NarTop10Accuracy=0.6404, over 5913.43 frames. ], batch size: 13, lr: 1.05e-02 2024-08-06 16:09:13,106 INFO [trainer.py:765] (0/8) Epoch 8, batch 1800, train_loss[loss=3.352, NarTop10Accuracy=0.6577, over 7119.00 frames. ], tot_loss[loss=3.424, NarTop10Accuracy=0.6411, over 5984.33 frames. ], batch size: 22, lr: 1.04e-02 2024-08-06 16:09:39,636 INFO [trainer.py:765] (0/8) Epoch 8, batch 1900, train_loss[loss=3.709, NarTop10Accuracy=0.5917, over 6039.00 frames. ], tot_loss[loss=3.415, NarTop10Accuracy=0.6431, over 6018.01 frames. ], batch size: 50, lr: 1.04e-02 2024-08-06 16:09:56,940 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 16:10:04,970 INFO [trainer.py:811] (0/8) Epoch 8, validation: loss=3.282, NarTop10Accuracy=0.6699, over 1905321.00 frames. 2024-08-06 16:10:04,970 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 16:10:05,470 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.814e+02 1.981e+02 2.158e+02 5.862e+02, threshold=3.962e+02, percent-clipped=0.1 2024-08-06 16:10:13,204 INFO [trainer.py:765] (0/8) Epoch 8, batch 2000, train_loss[loss=4.042, NarTop10Accuracy=0.5142, over 6165.00 frames. ], tot_loss[loss=3.424, NarTop10Accuracy=0.641, over 5991.40 frames. ], batch size: 50, lr: 1.04e-02 2024-08-06 16:10:38,515 INFO [trainer.py:765] (0/8) Epoch 8, batch 2100, train_loss[loss=3.329, NarTop10Accuracy=0.6686, over 4071.00 frames. ], tot_loss[loss=3.418, NarTop10Accuracy=0.642, over 5961.37 frames. ], batch size: 4, lr: 1.04e-02 2024-08-06 16:11:03,747 INFO [trainer.py:765] (0/8) Epoch 8, batch 2200, train_loss[loss=3.54, NarTop10Accuracy=0.6168, over 7482.00 frames. ], tot_loss[loss=3.422, NarTop10Accuracy=0.6413, over 6012.01 frames. ], batch size: 31, lr: 1.04e-02 2024-08-06 16:11:28,905 INFO [trainer.py:765] (0/8) Epoch 8, batch 2300, train_loss[loss=3.738, NarTop10Accuracy=0.5715, over 5664.00 frames. ], tot_loss[loss=3.441, NarTop10Accuracy=0.6372, over 6024.93 frames. ], batch size: 9, lr: 1.03e-02 2024-08-06 16:11:53,093 INFO [trainer.py:765] (0/8) Epoch 8, batch 2400, train_loss[loss=3.36, NarTop10Accuracy=0.6562, over 5196.00 frames. ], tot_loss[loss=3.423, NarTop10Accuracy=0.6411, over 5789.77 frames. ], batch size: 7, lr: 1.03e-02 2024-08-06 16:12:16,446 INFO [trainer.py:765] (0/8) Epoch 8, batch 2500, train_loss[loss=3.414, NarTop10Accuracy=0.6395, over 5142.00 frames. ], tot_loss[loss=3.408, NarTop10Accuracy=0.6435, over 5462.32 frames. ], batch size: 7, lr: 1.03e-02 2024-08-06 16:12:36,907 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 16:12:36,910 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-8.pt 2024-08-06 16:13:37,515 INFO [trainer.py:765] (0/8) Epoch 9, batch 100, train_loss[loss=3.186, NarTop10Accuracy=0.6898, over 7053.00 frames. ], tot_loss[loss=3.362, NarTop10Accuracy=0.6546, over 2363.40 frames. ], batch size: 31, lr: 9.72e-03 2024-08-06 16:14:14,441 INFO [trainer.py:765] (0/8) Epoch 9, batch 200, train_loss[loss=3.613, NarTop10Accuracy=0.5957, over 6750.00 frames. ], tot_loss[loss=3.366, NarTop10Accuracy=0.6536, over 3850.63 frames. ], batch size: 17, lr: 9.70e-03 2024-08-06 16:14:44,509 INFO [trainer.py:765] (0/8) Epoch 9, batch 300, train_loss[loss=3.297, NarTop10Accuracy=0.6602, over 6960.00 frames. ], tot_loss[loss=3.372, NarTop10Accuracy=0.652, over 4650.87 frames. ], batch size: 22, lr: 9.68e-03 2024-08-06 16:15:14,915 INFO [trainer.py:765] (0/8) Epoch 9, batch 400, train_loss[loss=2.992, NarTop10Accuracy=0.7159, over 5001.00 frames. ], tot_loss[loss=3.355, NarTop10Accuracy=0.6555, over 5114.48 frames. ], batch size: 7, lr: 9.65e-03 2024-08-06 16:15:50,337 INFO [trainer.py:765] (0/8) Epoch 9, batch 500, train_loss[loss=3.289, NarTop10Accuracy=0.6662, over 6246.00 frames. ], tot_loss[loss=3.349, NarTop10Accuracy=0.6565, over 5384.78 frames. ], batch size: 11, lr: 9.63e-03 2024-08-06 16:16:23,973 INFO [trainer.py:765] (0/8) Epoch 9, batch 600, train_loss[loss=3.544, NarTop10Accuracy=0.6154, over 5769.00 frames. ], tot_loss[loss=3.335, NarTop10Accuracy=0.6597, over 5647.44 frames. ], batch size: 9, lr: 9.61e-03 2024-08-06 16:16:57,146 INFO [trainer.py:765] (0/8) Epoch 9, batch 700, train_loss[loss=3.122, NarTop10Accuracy=0.6896, over 5103.00 frames. ], tot_loss[loss=3.354, NarTop10Accuracy=0.6557, over 5708.11 frames. ], batch size: 6, lr: 9.59e-03 2024-08-06 16:17:32,053 INFO [trainer.py:765] (0/8) Epoch 9, batch 800, train_loss[loss=3.171, NarTop10Accuracy=0.6844, over 4950.00 frames. ], tot_loss[loss=3.383, NarTop10Accuracy=0.6493, over 5787.75 frames. ], batch size: 6, lr: 9.57e-03 2024-08-06 16:18:07,816 INFO [trainer.py:765] (0/8) Epoch 9, batch 900, train_loss[loss=3.301, NarTop10Accuracy=0.6699, over 6228.00 frames. ], tot_loss[loss=3.382, NarTop10Accuracy=0.6495, over 5814.37 frames. ], batch size: 13, lr: 9.55e-03 2024-08-06 16:18:39,345 INFO [trainer.py:765] (0/8) Epoch 9, batch 1000, train_loss[loss=3.087, NarTop10Accuracy=0.7181, over 6645.00 frames. ], tot_loss[loss=3.398, NarTop10Accuracy=0.6466, over 5916.90 frames. ], batch size: 14, lr: 9.53e-03 2024-08-06 16:19:15,383 INFO [trainer.py:765] (0/8) Epoch 9, batch 1100, train_loss[loss=3.438, NarTop10Accuracy=0.6364, over 6729.00 frames. ], tot_loss[loss=3.403, NarTop10Accuracy=0.6455, over 5965.39 frames. ], batch size: 17, lr: 9.50e-03 2024-08-06 16:19:53,878 INFO [trainer.py:765] (0/8) Epoch 9, batch 1200, train_loss[loss=3.835, NarTop10Accuracy=0.5599, over 7203.00 frames. ], tot_loss[loss=3.412, NarTop10Accuracy=0.6435, over 5952.35 frames. ], batch size: 31, lr: 9.48e-03 2024-08-06 16:20:24,907 INFO [trainer.py:765] (0/8) Epoch 9, batch 1300, train_loss[loss=3.219, NarTop10Accuracy=0.685, over 4956.00 frames. ], tot_loss[loss=3.412, NarTop10Accuracy=0.6432, over 6013.00 frames. ], batch size: 6, lr: 9.46e-03 2024-08-06 16:20:56,581 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 16:21:04,483 INFO [trainer.py:811] (0/8) Epoch 9, validation: loss=3.266, NarTop10Accuracy=0.6725, over 1905321.00 frames. 2024-08-06 16:21:04,484 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 16:21:05,036 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.808e+02 1.967e+02 2.142e+02 6.126e+02, threshold=3.935e+02, percent-clipped=0.5 2024-08-06 16:21:06,692 INFO [trainer.py:765] (0/8) Epoch 9, batch 1400, train_loss[loss=3.524, NarTop10Accuracy=0.6184, over 6204.00 frames. ], tot_loss[loss=3.414, NarTop10Accuracy=0.643, over 6024.80 frames. ], batch size: 11, lr: 9.44e-03 2024-08-06 16:21:38,896 INFO [trainer.py:765] (0/8) Epoch 9, batch 1500, train_loss[loss=3.36, NarTop10Accuracy=0.6545, over 5874.00 frames. ], tot_loss[loss=3.388, NarTop10Accuracy=0.6482, over 5962.18 frames. ], batch size: 50, lr: 9.42e-03 2024-08-06 16:22:06,721 INFO [trainer.py:765] (0/8) Epoch 9, batch 1600, train_loss[loss=3.352, NarTop10Accuracy=0.6546, over 7119.00 frames. ], tot_loss[loss=3.377, NarTop10Accuracy=0.6505, over 5946.11 frames. ], batch size: 22, lr: 9.40e-03 2024-08-06 16:22:33,470 INFO [trainer.py:765] (0/8) Epoch 9, batch 1700, train_loss[loss=3.434, NarTop10Accuracy=0.6378, over 6819.00 frames. ], tot_loss[loss=3.391, NarTop10Accuracy=0.6475, over 5926.66 frames. ], batch size: 14, lr: 9.38e-03 2024-08-06 16:23:00,064 INFO [trainer.py:765] (0/8) Epoch 9, batch 1800, train_loss[loss=3.115, NarTop10Accuracy=0.6969, over 7035.00 frames. ], tot_loss[loss=3.38, NarTop10Accuracy=0.6496, over 5986.04 frames. ], batch size: 22, lr: 9.36e-03 2024-08-06 16:23:26,783 INFO [trainer.py:765] (0/8) Epoch 9, batch 1900, train_loss[loss=3.387, NarTop10Accuracy=0.6559, over 6510.00 frames. ], tot_loss[loss=3.383, NarTop10Accuracy=0.6493, over 6033.34 frames. ], batch size: 50, lr: 9.34e-03 2024-08-06 16:23:52,486 INFO [trainer.py:765] (0/8) Epoch 9, batch 2000, train_loss[loss=3.921, NarTop10Accuracy=0.5345, over 6045.00 frames. ], tot_loss[loss=3.386, NarTop10Accuracy=0.6486, over 5998.57 frames. ], batch size: 50, lr: 9.32e-03 2024-08-06 16:24:17,963 INFO [trainer.py:765] (0/8) Epoch 9, batch 2100, train_loss[loss=3.191, NarTop10Accuracy=0.7008, over 4929.00 frames. ], tot_loss[loss=3.382, NarTop10Accuracy=0.6489, over 5976.14 frames. ], batch size: 5, lr: 9.30e-03 2024-08-06 16:24:43,422 INFO [trainer.py:765] (0/8) Epoch 9, batch 2200, train_loss[loss=3.671, NarTop10Accuracy=0.5969, over 7083.00 frames. ], tot_loss[loss=3.386, NarTop10Accuracy=0.6484, over 6005.88 frames. ], batch size: 31, lr: 9.28e-03 2024-08-06 16:25:08,721 INFO [trainer.py:765] (0/8) Epoch 9, batch 2300, train_loss[loss=3.428, NarTop10Accuracy=0.6407, over 5724.00 frames. ], tot_loss[loss=3.403, NarTop10Accuracy=0.6452, over 6014.34 frames. ], batch size: 9, lr: 9.26e-03 2024-08-06 16:25:33,164 INFO [trainer.py:765] (0/8) Epoch 9, batch 2400, train_loss[loss=3.146, NarTop10Accuracy=0.7004, over 5085.00 frames. ], tot_loss[loss=3.395, NarTop10Accuracy=0.6468, over 5766.76 frames. ], batch size: 7, lr: 9.25e-03 2024-08-06 16:25:56,769 INFO [trainer.py:765] (0/8) Epoch 9, batch 2500, train_loss[loss=3.107, NarTop10Accuracy=0.6997, over 5046.00 frames. ], tot_loss[loss=3.364, NarTop10Accuracy=0.6523, over 5464.86 frames. ], batch size: 7, lr: 9.23e-03 2024-08-06 16:26:16,445 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 16:26:16,447 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-9.pt 2024-08-06 16:27:19,584 INFO [trainer.py:765] (0/8) Epoch 10, batch 100, train_loss[loss=3.221, NarTop10Accuracy=0.6773, over 7293.00 frames. ], tot_loss[loss=3.374, NarTop10Accuracy=0.6512, over 2363.26 frames. ], batch size: 31, lr: 8.76e-03 2024-08-06 16:27:52,628 INFO [trainer.py:765] (0/8) Epoch 10, batch 200, train_loss[loss=3.053, NarTop10Accuracy=0.7157, over 6903.00 frames. ], tot_loss[loss=3.357, NarTop10Accuracy=0.6546, over 3859.16 frames. ], batch size: 17, lr: 8.74e-03 2024-08-06 16:28:23,057 INFO [trainer.py:765] (0/8) Epoch 10, batch 300, train_loss[loss=3.104, NarTop10Accuracy=0.7048, over 6993.00 frames. ], tot_loss[loss=3.359, NarTop10Accuracy=0.6543, over 4659.82 frames. ], batch size: 22, lr: 8.72e-03 2024-08-06 16:28:59,200 INFO [trainer.py:765] (0/8) Epoch 10, batch 400, train_loss[loss=3.181, NarTop10Accuracy=0.6848, over 5217.00 frames. ], tot_loss[loss=3.343, NarTop10Accuracy=0.6571, over 5091.27 frames. ], batch size: 7, lr: 8.71e-03 2024-08-06 16:29:29,218 INFO [trainer.py:765] (0/8) Epoch 10, batch 500, train_loss[loss=3.105, NarTop10Accuracy=0.7084, over 6024.00 frames. ], tot_loss[loss=3.341, NarTop10Accuracy=0.6577, over 5366.72 frames. ], batch size: 11, lr: 8.69e-03 2024-08-06 16:30:02,765 INFO [trainer.py:765] (0/8) Epoch 10, batch 600, train_loss[loss=3.286, NarTop10Accuracy=0.665, over 5859.00 frames. ], tot_loss[loss=3.343, NarTop10Accuracy=0.6575, over 5648.06 frames. ], batch size: 9, lr: 8.67e-03 2024-08-06 16:30:34,265 INFO [trainer.py:765] (0/8) Epoch 10, batch 700, train_loss[loss=3.361, NarTop10Accuracy=0.6508, over 4338.00 frames. ], tot_loss[loss=3.346, NarTop10Accuracy=0.6568, over 5703.47 frames. ], batch size: 5, lr: 8.65e-03 2024-08-06 16:31:09,843 INFO [trainer.py:765] (0/8) Epoch 10, batch 800, train_loss[loss=3.43, NarTop10Accuracy=0.6314, over 5013.00 frames. ], tot_loss[loss=3.352, NarTop10Accuracy=0.6549, over 5771.98 frames. ], batch size: 6, lr: 8.64e-03 2024-08-06 16:31:16,258 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 16:31:24,565 INFO [trainer.py:811] (0/8) Epoch 10, validation: loss=3.184, NarTop10Accuracy=0.6898, over 1905321.00 frames. 2024-08-06 16:31:24,566 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 16:31:25,155 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 1.851e+02 2.012e+02 2.196e+02 4.599e+02, threshold=4.024e+02, percent-clipped=0.1 2024-08-06 16:31:50,345 INFO [trainer.py:765] (0/8) Epoch 10, batch 900, train_loss[loss=3.155, NarTop10Accuracy=0.6927, over 6762.00 frames. ], tot_loss[loss=3.328, NarTop10Accuracy=0.66, over 5783.23 frames. ], batch size: 14, lr: 8.62e-03 2024-08-06 16:32:28,589 INFO [trainer.py:765] (0/8) Epoch 10, batch 1000, train_loss[loss=2.988, NarTop10Accuracy=0.7299, over 6312.00 frames. ], tot_loss[loss=3.334, NarTop10Accuracy=0.6592, over 5890.91 frames. ], batch size: 13, lr: 8.60e-03 2024-08-06 16:33:06,376 INFO [trainer.py:765] (0/8) Epoch 10, batch 1100, train_loss[loss=3.091, NarTop10Accuracy=0.7033, over 7053.00 frames. ], tot_loss[loss=3.348, NarTop10Accuracy=0.6563, over 5917.42 frames. ], batch size: 17, lr: 8.59e-03 2024-08-06 16:33:40,960 INFO [trainer.py:765] (0/8) Epoch 10, batch 1200, train_loss[loss=3.318, NarTop10Accuracy=0.6628, over 7338.00 frames. ], tot_loss[loss=3.343, NarTop10Accuracy=0.6572, over 5926.24 frames. ], batch size: 31, lr: 8.57e-03 2024-08-06 16:34:16,170 INFO [trainer.py:765] (0/8) Epoch 10, batch 1300, train_loss[loss=3.183, NarTop10Accuracy=0.6867, over 5103.00 frames. ], tot_loss[loss=3.34, NarTop10Accuracy=0.6574, over 5993.91 frames. ], batch size: 6, lr: 8.55e-03 2024-08-06 16:34:51,201 INFO [trainer.py:765] (0/8) Epoch 10, batch 1400, train_loss[loss=3.445, NarTop10Accuracy=0.637, over 6027.00 frames. ], tot_loss[loss=3.363, NarTop10Accuracy=0.6524, over 6014.34 frames. ], batch size: 11, lr: 8.54e-03 2024-08-06 16:35:22,159 INFO [trainer.py:765] (0/8) Epoch 10, batch 1500, train_loss[loss=3.524, NarTop10Accuracy=0.6145, over 6420.00 frames. ], tot_loss[loss=3.341, NarTop10Accuracy=0.657, over 5958.20 frames. ], batch size: 52, lr: 8.52e-03 2024-08-06 16:35:50,136 INFO [trainer.py:765] (0/8) Epoch 10, batch 1600, train_loss[loss=3.666, NarTop10Accuracy=0.5903, over 7020.00 frames. ], tot_loss[loss=3.329, NarTop10Accuracy=0.6598, over 5942.72 frames. ], batch size: 22, lr: 8.50e-03 2024-08-06 16:36:16,976 INFO [trainer.py:765] (0/8) Epoch 10, batch 1700, train_loss[loss=3.42, NarTop10Accuracy=0.6432, over 6741.00 frames. ], tot_loss[loss=3.338, NarTop10Accuracy=0.6579, over 5910.28 frames. ], batch size: 14, lr: 8.49e-03 2024-08-06 16:36:43,647 INFO [trainer.py:765] (0/8) Epoch 10, batch 1800, train_loss[loss=3.13, NarTop10Accuracy=0.7, over 6837.00 frames. ], tot_loss[loss=3.334, NarTop10Accuracy=0.6591, over 5979.15 frames. ], batch size: 22, lr: 8.47e-03 2024-08-06 16:37:10,290 INFO [trainer.py:765] (0/8) Epoch 10, batch 1900, train_loss[loss=3.23, NarTop10Accuracy=0.6844, over 5991.00 frames. ], tot_loss[loss=3.326, NarTop10Accuracy=0.6607, over 6008.35 frames. ], batch size: 50, lr: 8.45e-03 2024-08-06 16:37:36,089 INFO [trainer.py:765] (0/8) Epoch 10, batch 2000, train_loss[loss=3.242, NarTop10Accuracy=0.6785, over 5847.00 frames. ], tot_loss[loss=3.322, NarTop10Accuracy=0.6614, over 5986.47 frames. ], batch size: 51, lr: 8.44e-03 2024-08-06 16:38:01,649 INFO [trainer.py:765] (0/8) Epoch 10, batch 2100, train_loss[loss=3.566, NarTop10Accuracy=0.6161, over 4899.00 frames. ], tot_loss[loss=3.334, NarTop10Accuracy=0.6588, over 5953.88 frames. ], batch size: 5, lr: 8.42e-03 2024-08-06 16:38:27,120 INFO [trainer.py:765] (0/8) Epoch 10, batch 2200, train_loss[loss=3.707, NarTop10Accuracy=0.5842, over 7170.00 frames. ], tot_loss[loss=3.342, NarTop10Accuracy=0.6571, over 5995.01 frames. ], batch size: 31, lr: 8.41e-03 2024-08-06 16:38:52,447 INFO [trainer.py:765] (0/8) Epoch 10, batch 2300, train_loss[loss=3.056, NarTop10Accuracy=0.7226, over 5511.00 frames. ], tot_loss[loss=3.349, NarTop10Accuracy=0.6558, over 6023.48 frames. ], batch size: 9, lr: 8.39e-03 2024-08-06 16:39:17,005 INFO [trainer.py:765] (0/8) Epoch 10, batch 2400, train_loss[loss=3.225, NarTop10Accuracy=0.6735, over 5259.00 frames. ], tot_loss[loss=3.319, NarTop10Accuracy=0.6617, over 5785.01 frames. ], batch size: 7, lr: 8.37e-03 2024-08-06 16:39:40,801 INFO [trainer.py:765] (0/8) Epoch 10, batch 2500, train_loss[loss=3.552, NarTop10Accuracy=0.6136, over 5343.00 frames. ], tot_loss[loss=3.294, NarTop10Accuracy=0.6665, over 5470.92 frames. ], batch size: 7, lr: 8.36e-03 2024-08-06 16:40:00,398 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 16:40:00,401 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-10.pt 2024-08-06 16:41:06,235 INFO [trainer.py:765] (0/8) Epoch 11, batch 100, train_loss[loss=3.592, NarTop10Accuracy=0.6059, over 7398.00 frames. ], tot_loss[loss=3.348, NarTop10Accuracy=0.6552, over 2366.23 frames. ], batch size: 31, lr: 7.97e-03 2024-08-06 16:41:39,022 INFO [trainer.py:765] (0/8) Epoch 11, batch 200, train_loss[loss=3.713, NarTop10Accuracy=0.588, over 6708.00 frames. ], tot_loss[loss=3.328, NarTop10Accuracy=0.66, over 3869.78 frames. ], batch size: 17, lr: 7.95e-03 2024-08-06 16:41:53,191 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 16:42:01,355 INFO [trainer.py:811] (0/8) Epoch 11, validation: loss=3.116, NarTop10Accuracy=0.7034, over 1905321.00 frames. 2024-08-06 16:42:01,356 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 16:42:01,879 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.889e+02 2.046e+02 2.249e+02 5.417e+02, threshold=4.093e+02, percent-clipped=0.2 2024-08-06 16:42:17,975 INFO [trainer.py:765] (0/8) Epoch 11, batch 300, train_loss[loss=3.077, NarTop10Accuracy=0.7129, over 7062.00 frames. ], tot_loss[loss=3.297, NarTop10Accuracy=0.6666, over 4665.36 frames. ], batch size: 22, lr: 7.94e-03 2024-08-06 16:42:55,154 INFO [trainer.py:765] (0/8) Epoch 11, batch 400, train_loss[loss=3.412, NarTop10Accuracy=0.6424, over 5142.00 frames. ], tot_loss[loss=3.292, NarTop10Accuracy=0.6675, over 5103.01 frames. ], batch size: 7, lr: 7.92e-03 2024-08-06 16:43:25,719 INFO [trainer.py:765] (0/8) Epoch 11, batch 500, train_loss[loss=3.137, NarTop10Accuracy=0.6985, over 6051.00 frames. ], tot_loss[loss=3.291, NarTop10Accuracy=0.6678, over 5375.64 frames. ], batch size: 11, lr: 7.91e-03 2024-08-06 16:44:02,242 INFO [trainer.py:765] (0/8) Epoch 11, batch 600, train_loss[loss=3.426, NarTop10Accuracy=0.6382, over 5781.00 frames. ], tot_loss[loss=3.295, NarTop10Accuracy=0.6671, over 5636.60 frames. ], batch size: 9, lr: 7.89e-03 2024-08-06 16:44:35,716 INFO [trainer.py:765] (0/8) Epoch 11, batch 700, train_loss[loss=3.323, NarTop10Accuracy=0.6548, over 4233.00 frames. ], tot_loss[loss=3.295, NarTop10Accuracy=0.6674, over 5709.51 frames. ], batch size: 5, lr: 7.88e-03 2024-08-06 16:45:10,468 INFO [trainer.py:765] (0/8) Epoch 11, batch 800, train_loss[loss=2.846, NarTop10Accuracy=0.7582, over 5133.00 frames. ], tot_loss[loss=3.309, NarTop10Accuracy=0.6646, over 5759.99 frames. ], batch size: 6, lr: 7.86e-03 2024-08-06 16:45:46,457 INFO [trainer.py:765] (0/8) Epoch 11, batch 900, train_loss[loss=3.623, NarTop10Accuracy=0.5958, over 6300.00 frames. ], tot_loss[loss=3.305, NarTop10Accuracy=0.6647, over 5797.17 frames. ], batch size: 13, lr: 7.85e-03 2024-08-06 16:46:20,312 INFO [trainer.py:765] (0/8) Epoch 11, batch 1000, train_loss[loss=3.297, NarTop10Accuracy=0.654, over 6336.00 frames. ], tot_loss[loss=3.3, NarTop10Accuracy=0.6655, over 5900.57 frames. ], batch size: 13, lr: 7.84e-03 2024-08-06 16:46:53,456 INFO [trainer.py:765] (0/8) Epoch 11, batch 1100, train_loss[loss=2.974, NarTop10Accuracy=0.7241, over 6807.00 frames. ], tot_loss[loss=3.293, NarTop10Accuracy=0.667, over 5927.17 frames. ], batch size: 17, lr: 7.82e-03 2024-08-06 16:47:33,030 INFO [trainer.py:765] (0/8) Epoch 11, batch 1200, train_loss[loss=3.401, NarTop10Accuracy=0.6514, over 7329.00 frames. ], tot_loss[loss=3.297, NarTop10Accuracy=0.6659, over 5929.21 frames. ], batch size: 31, lr: 7.81e-03 2024-08-06 16:48:06,482 INFO [trainer.py:765] (0/8) Epoch 11, batch 1300, train_loss[loss=2.863, NarTop10Accuracy=0.7491, over 4989.00 frames. ], tot_loss[loss=3.307, NarTop10Accuracy=0.6638, over 5996.78 frames. ], batch size: 6, lr: 7.79e-03 2024-08-06 16:48:41,353 INFO [trainer.py:765] (0/8) Epoch 11, batch 1400, train_loss[loss=3.548, NarTop10Accuracy=0.6162, over 6045.00 frames. ], tot_loss[loss=3.329, NarTop10Accuracy=0.6594, over 6020.48 frames. ], batch size: 11, lr: 7.78e-03 2024-08-06 16:49:09,345 INFO [trainer.py:765] (0/8) Epoch 11, batch 1500, train_loss[loss=3.266, NarTop10Accuracy=0.676, over 5991.00 frames. ], tot_loss[loss=3.334, NarTop10Accuracy=0.6586, over 5956.98 frames. ], batch size: 50, lr: 7.77e-03 2024-08-06 16:49:37,103 INFO [trainer.py:765] (0/8) Epoch 11, batch 1600, train_loss[loss=3.256, NarTop10Accuracy=0.6747, over 7215.00 frames. ], tot_loss[loss=3.316, NarTop10Accuracy=0.6623, over 5921.96 frames. ], batch size: 22, lr: 7.75e-03 2024-08-06 16:50:03,792 INFO [trainer.py:765] (0/8) Epoch 11, batch 1700, train_loss[loss=3.309, NarTop10Accuracy=0.6605, over 6237.00 frames. ], tot_loss[loss=3.303, NarTop10Accuracy=0.6648, over 5924.89 frames. ], batch size: 13, lr: 7.74e-03 2024-08-06 16:50:30,353 INFO [trainer.py:765] (0/8) Epoch 11, batch 1800, train_loss[loss=3.462, NarTop10Accuracy=0.6467, over 7080.00 frames. ], tot_loss[loss=3.321, NarTop10Accuracy=0.6616, over 5992.79 frames. ], batch size: 22, lr: 7.72e-03 2024-08-06 16:50:56,821 INFO [trainer.py:765] (0/8) Epoch 11, batch 1900, train_loss[loss=3.72, NarTop10Accuracy=0.5857, over 6552.00 frames. ], tot_loss[loss=3.322, NarTop10Accuracy=0.6614, over 6034.18 frames. ], batch size: 51, lr: 7.71e-03 2024-08-06 16:51:22,405 INFO [trainer.py:765] (0/8) Epoch 11, batch 2000, train_loss[loss=3.864, NarTop10Accuracy=0.5522, over 6198.00 frames. ], tot_loss[loss=3.315, NarTop10Accuracy=0.6626, over 5998.34 frames. ], batch size: 50, lr: 7.70e-03 2024-08-06 16:51:47,794 INFO [trainer.py:765] (0/8) Epoch 11, batch 2100, train_loss[loss=2.945, NarTop10Accuracy=0.7423, over 3966.00 frames. ], tot_loss[loss=3.308, NarTop10Accuracy=0.6638, over 5949.04 frames. ], batch size: 4, lr: 7.68e-03 2024-08-06 16:52:13,118 INFO [trainer.py:765] (0/8) Epoch 11, batch 2200, train_loss[loss=3.305, NarTop10Accuracy=0.6592, over 7518.00 frames. ], tot_loss[loss=3.304, NarTop10Accuracy=0.6649, over 5990.45 frames. ], batch size: 31, lr: 7.67e-03 2024-08-06 16:52:23,899 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 16:52:32,079 INFO [trainer.py:811] (0/8) Epoch 11, validation: loss=3.101, NarTop10Accuracy=0.7058, over 1905321.00 frames. 2024-08-06 16:52:32,080 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28825MB 2024-08-06 16:52:32,594 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.920e+02 2.088e+02 2.244e+02 3.599e+02, threshold=4.177e+02, percent-clipped=0.0 2024-08-06 16:52:46,445 INFO [trainer.py:765] (0/8) Epoch 11, batch 2300, train_loss[loss=3.246, NarTop10Accuracy=0.6813, over 5652.00 frames. ], tot_loss[loss=3.317, NarTop10Accuracy=0.6622, over 6000.54 frames. ], batch size: 9, lr: 7.66e-03 2024-08-06 16:53:10,887 INFO [trainer.py:765] (0/8) Epoch 11, batch 2400, train_loss[loss=3.466, NarTop10Accuracy=0.6391, over 4950.00 frames. ], tot_loss[loss=3.307, NarTop10Accuracy=0.6646, over 5769.98 frames. ], batch size: 7, lr: 7.64e-03 2024-08-06 16:53:34,373 INFO [trainer.py:765] (0/8) Epoch 11, batch 2500, train_loss[loss=3.549, NarTop10Accuracy=0.6083, over 5103.00 frames. ], tot_loss[loss=3.297, NarTop10Accuracy=0.6659, over 5477.71 frames. ], batch size: 7, lr: 7.63e-03 2024-08-06 16:53:54,213 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 16:53:54,216 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-11.pt 2024-08-06 16:54:58,526 INFO [trainer.py:765] (0/8) Epoch 12, batch 100, train_loss[loss=3.69, NarTop10Accuracy=0.5761, over 7134.00 frames. ], tot_loss[loss=3.298, NarTop10Accuracy=0.6663, over 2352.14 frames. ], batch size: 31, lr: 7.30e-03 2024-08-06 16:55:32,432 INFO [trainer.py:765] (0/8) Epoch 12, batch 200, train_loss[loss=3.1, NarTop10Accuracy=0.7153, over 6738.00 frames. ], tot_loss[loss=3.265, NarTop10Accuracy=0.6737, over 3859.40 frames. ], batch size: 17, lr: 7.29e-03 2024-08-06 16:56:05,096 INFO [trainer.py:765] (0/8) Epoch 12, batch 300, train_loss[loss=2.905, NarTop10Accuracy=0.7485, over 7275.00 frames. ], tot_loss[loss=3.245, NarTop10Accuracy=0.6776, over 4660.40 frames. ], batch size: 22, lr: 7.27e-03 2024-08-06 16:56:36,426 INFO [trainer.py:765] (0/8) Epoch 12, batch 400, train_loss[loss=3.057, NarTop10Accuracy=0.7139, over 5763.00 frames. ], tot_loss[loss=3.255, NarTop10Accuracy=0.675, over 5116.91 frames. ], batch size: 8, lr: 7.26e-03 2024-08-06 16:57:10,503 INFO [trainer.py:765] (0/8) Epoch 12, batch 500, train_loss[loss=3.602, NarTop10Accuracy=0.5976, over 6147.00 frames. ], tot_loss[loss=3.273, NarTop10Accuracy=0.6715, over 5379.63 frames. ], batch size: 11, lr: 7.25e-03 2024-08-06 16:57:45,484 INFO [trainer.py:765] (0/8) Epoch 12, batch 600, train_loss[loss=2.987, NarTop10Accuracy=0.7316, over 5889.00 frames. ], tot_loss[loss=3.268, NarTop10Accuracy=0.6727, over 5641.25 frames. ], batch size: 9, lr: 7.24e-03 2024-08-06 16:58:17,005 INFO [trainer.py:765] (0/8) Epoch 12, batch 700, train_loss[loss=3.745, NarTop10Accuracy=0.5687, over 5064.00 frames. ], tot_loss[loss=3.286, NarTop10Accuracy=0.6689, over 5712.70 frames. ], batch size: 6, lr: 7.22e-03 2024-08-06 16:58:53,469 INFO [trainer.py:765] (0/8) Epoch 12, batch 800, train_loss[loss=3.22, NarTop10Accuracy=0.6718, over 5142.00 frames. ], tot_loss[loss=3.291, NarTop10Accuracy=0.6676, over 5783.57 frames. ], batch size: 6, lr: 7.21e-03 2024-08-06 16:59:27,206 INFO [trainer.py:765] (0/8) Epoch 12, batch 900, train_loss[loss=3.124, NarTop10Accuracy=0.7064, over 6657.00 frames. ], tot_loss[loss=3.271, NarTop10Accuracy=0.6718, over 5800.88 frames. ], batch size: 14, lr: 7.20e-03 2024-08-06 17:00:01,574 INFO [trainer.py:765] (0/8) Epoch 12, batch 1000, train_loss[loss=2.989, NarTop10Accuracy=0.7335, over 6780.00 frames. ], tot_loss[loss=3.279, NarTop10Accuracy=0.6696, over 5888.50 frames. ], batch size: 14, lr: 7.19e-03 2024-08-06 17:00:39,189 INFO [trainer.py:765] (0/8) Epoch 12, batch 1100, train_loss[loss=3.656, NarTop10Accuracy=0.5978, over 6717.00 frames. ], tot_loss[loss=3.3, NarTop10Accuracy=0.6654, over 5936.38 frames. ], batch size: 17, lr: 7.18e-03 2024-08-06 17:01:13,964 INFO [trainer.py:765] (0/8) Epoch 12, batch 1200, train_loss[loss=3.025, NarTop10Accuracy=0.7252, over 7227.00 frames. ], tot_loss[loss=3.265, NarTop10Accuracy=0.6726, over 5940.55 frames. ], batch size: 31, lr: 7.17e-03 2024-08-06 17:01:48,108 INFO [trainer.py:765] (0/8) Epoch 12, batch 1300, train_loss[loss=3.316, NarTop10Accuracy=0.6527, over 5007.00 frames. ], tot_loss[loss=3.28, NarTop10Accuracy=0.6694, over 6008.88 frames. ], batch size: 6, lr: 7.15e-03 2024-08-06 17:02:22,324 INFO [trainer.py:765] (0/8) Epoch 12, batch 1400, train_loss[loss=3.541, NarTop10Accuracy=0.6185, over 6045.00 frames. ], tot_loss[loss=3.29, NarTop10Accuracy=0.6679, over 6035.59 frames. ], batch size: 11, lr: 7.14e-03 2024-08-06 17:02:52,877 INFO [trainer.py:765] (0/8) Epoch 12, batch 1500, train_loss[loss=3.344, NarTop10Accuracy=0.6541, over 6192.00 frames. ], tot_loss[loss=3.268, NarTop10Accuracy=0.6723, over 5972.44 frames. ], batch size: 52, lr: 7.13e-03 2024-08-06 17:03:20,691 INFO [trainer.py:765] (0/8) Epoch 12, batch 1600, train_loss[loss=3.304, NarTop10Accuracy=0.6642, over 7119.00 frames. ], tot_loss[loss=3.28, NarTop10Accuracy=0.6698, over 5939.78 frames. ], batch size: 22, lr: 7.12e-03 2024-08-06 17:03:38,297 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 17:03:46,473 INFO [trainer.py:811] (0/8) Epoch 12, validation: loss=3.054, NarTop10Accuracy=0.7153, over 1905321.00 frames. 2024-08-06 17:03:46,474 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28949MB 2024-08-06 17:03:46,988 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.899e+02 2.078e+02 2.276e+02 5.455e+02, threshold=4.157e+02, percent-clipped=0.1 2024-08-06 17:03:55,603 INFO [trainer.py:765] (0/8) Epoch 12, batch 1700, train_loss[loss=3.363, NarTop10Accuracy=0.6606, over 6192.00 frames. ], tot_loss[loss=3.283, NarTop10Accuracy=0.6697, over 5917.76 frames. ], batch size: 13, lr: 7.11e-03 2024-08-06 17:04:22,121 INFO [trainer.py:765] (0/8) Epoch 12, batch 1800, train_loss[loss=3.509, NarTop10Accuracy=0.6235, over 7122.00 frames. ], tot_loss[loss=3.285, NarTop10Accuracy=0.6686, over 5974.16 frames. ], batch size: 22, lr: 7.10e-03 2024-08-06 17:04:48,591 INFO [trainer.py:765] (0/8) Epoch 12, batch 1900, train_loss[loss=3.305, NarTop10Accuracy=0.6615, over 6597.00 frames. ], tot_loss[loss=3.279, NarTop10Accuracy=0.6698, over 6021.40 frames. ], batch size: 52, lr: 7.08e-03 2024-08-06 17:05:14,198 INFO [trainer.py:765] (0/8) Epoch 12, batch 2000, train_loss[loss=3.493, NarTop10Accuracy=0.6261, over 5973.00 frames. ], tot_loss[loss=3.27, NarTop10Accuracy=0.6719, over 5997.92 frames. ], batch size: 50, lr: 7.07e-03 2024-08-06 17:05:39,468 INFO [trainer.py:765] (0/8) Epoch 12, batch 2100, train_loss[loss=3.326, NarTop10Accuracy=0.6645, over 4794.00 frames. ], tot_loss[loss=3.276, NarTop10Accuracy=0.6704, over 5978.98 frames. ], batch size: 5, lr: 7.06e-03 2024-08-06 17:06:04,692 INFO [trainer.py:765] (0/8) Epoch 12, batch 2200, train_loss[loss=3.351, NarTop10Accuracy=0.6554, over 7260.00 frames. ], tot_loss[loss=3.286, NarTop10Accuracy=0.6686, over 5992.70 frames. ], batch size: 31, lr: 7.05e-03 2024-08-06 17:06:29,847 INFO [trainer.py:765] (0/8) Epoch 12, batch 2300, train_loss[loss=3.358, NarTop10Accuracy=0.6432, over 5727.00 frames. ], tot_loss[loss=3.288, NarTop10Accuracy=0.6684, over 6016.18 frames. ], batch size: 9, lr: 7.04e-03 2024-08-06 17:06:54,200 INFO [trainer.py:765] (0/8) Epoch 12, batch 2400, train_loss[loss=3.173, NarTop10Accuracy=0.7012, over 5196.00 frames. ], tot_loss[loss=3.272, NarTop10Accuracy=0.6714, over 5781.10 frames. ], batch size: 7, lr: 7.03e-03 2024-08-06 17:07:17,646 INFO [trainer.py:765] (0/8) Epoch 12, batch 2500, train_loss[loss=3.141, NarTop10Accuracy=0.706, over 5184.00 frames. ], tot_loss[loss=3.255, NarTop10Accuracy=0.6748, over 5487.45 frames. ], batch size: 7, lr: 7.02e-03 2024-08-06 17:07:37,728 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 17:07:37,732 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-12.pt 2024-08-06 17:08:40,079 INFO [trainer.py:765] (0/8) Epoch 13, batch 100, train_loss[loss=3.039, NarTop10Accuracy=0.732, over 7569.00 frames. ], tot_loss[loss=3.285, NarTop10Accuracy=0.6699, over 2358.13 frames. ], batch size: 31, lr: 6.73e-03 2024-08-06 17:09:14,120 INFO [trainer.py:765] (0/8) Epoch 13, batch 200, train_loss[loss=2.976, NarTop10Accuracy=0.7335, over 6813.00 frames. ], tot_loss[loss=3.286, NarTop10Accuracy=0.6688, over 3861.03 frames. ], batch size: 17, lr: 6.72e-03 2024-08-06 17:09:46,278 INFO [trainer.py:765] (0/8) Epoch 13, batch 300, train_loss[loss=3.642, NarTop10Accuracy=0.5985, over 7386.00 frames. ], tot_loss[loss=3.261, NarTop10Accuracy=0.6738, over 4668.77 frames. ], batch size: 23, lr: 6.71e-03 2024-08-06 17:10:19,165 INFO [trainer.py:765] (0/8) Epoch 13, batch 400, train_loss[loss=2.929, NarTop10Accuracy=0.7463, over 5055.00 frames. ], tot_loss[loss=3.242, NarTop10Accuracy=0.6777, over 5102.25 frames. ], batch size: 7, lr: 6.70e-03 2024-08-06 17:10:49,335 INFO [trainer.py:765] (0/8) Epoch 13, batch 500, train_loss[loss=3.14, NarTop10Accuracy=0.7012, over 6033.00 frames. ], tot_loss[loss=3.237, NarTop10Accuracy=0.6787, over 5379.48 frames. ], batch size: 11, lr: 6.69e-03 2024-08-06 17:11:26,246 INFO [trainer.py:765] (0/8) Epoch 13, batch 600, train_loss[loss=3.057, NarTop10Accuracy=0.7165, over 5820.00 frames. ], tot_loss[loss=3.235, NarTop10Accuracy=0.679, over 5669.48 frames. ], batch size: 9, lr: 6.68e-03 2024-08-06 17:11:57,382 INFO [trainer.py:765] (0/8) Epoch 13, batch 700, train_loss[loss=3.083, NarTop10Accuracy=0.703, over 4242.00 frames. ], tot_loss[loss=3.241, NarTop10Accuracy=0.6781, over 5727.25 frames. ], batch size: 5, lr: 6.67e-03 2024-08-06 17:12:33,442 INFO [trainer.py:765] (0/8) Epoch 13, batch 800, train_loss[loss=2.877, NarTop10Accuracy=0.7537, over 4353.00 frames. ], tot_loss[loss=3.245, NarTop10Accuracy=0.6772, over 5768.78 frames. ], batch size: 5, lr: 6.66e-03 2024-08-06 17:13:10,032 INFO [trainer.py:765] (0/8) Epoch 13, batch 900, train_loss[loss=3.237, NarTop10Accuracy=0.6854, over 6711.00 frames. ], tot_loss[loss=3.24, NarTop10Accuracy=0.6783, over 5800.80 frames. ], batch size: 14, lr: 6.65e-03 2024-08-06 17:13:41,442 INFO [trainer.py:765] (0/8) Epoch 13, batch 1000, train_loss[loss=3.487, NarTop10Accuracy=0.6253, over 6738.00 frames. ], tot_loss[loss=3.234, NarTop10Accuracy=0.6792, over 5915.04 frames. ], batch size: 14, lr: 6.64e-03 2024-08-06 17:14:15,537 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 17:14:23,644 INFO [trainer.py:811] (0/8) Epoch 13, validation: loss=3.099, NarTop10Accuracy=0.7062, over 1905321.00 frames. 2024-08-06 17:14:23,645 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28949MB 2024-08-06 17:14:24,471 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.948e+02 2.091e+02 2.295e+02 3.353e+02, threshold=4.181e+02, percent-clipped=0.0 2024-08-06 17:14:26,698 INFO [trainer.py:765] (0/8) Epoch 13, batch 1100, train_loss[loss=3.511, NarTop10Accuracy=0.62, over 6777.00 frames. ], tot_loss[loss=3.241, NarTop10Accuracy=0.6776, over 5925.35 frames. ], batch size: 17, lr: 6.63e-03 2024-08-06 17:15:03,475 INFO [trainer.py:765] (0/8) Epoch 13, batch 1200, train_loss[loss=3.475, NarTop10Accuracy=0.6265, over 7149.00 frames. ], tot_loss[loss=3.252, NarTop10Accuracy=0.6752, over 5929.20 frames. ], batch size: 31, lr: 6.62e-03 2024-08-06 17:15:35,514 INFO [trainer.py:765] (0/8) Epoch 13, batch 1300, train_loss[loss=2.82, NarTop10Accuracy=0.7671, over 4281.00 frames. ], tot_loss[loss=3.257, NarTop10Accuracy=0.6741, over 5997.50 frames. ], batch size: 5, lr: 6.61e-03 2024-08-06 17:16:11,783 INFO [trainer.py:765] (0/8) Epoch 13, batch 1400, train_loss[loss=2.965, NarTop10Accuracy=0.7273, over 6072.00 frames. ], tot_loss[loss=3.259, NarTop10Accuracy=0.674, over 5998.95 frames. ], batch size: 11, lr: 6.60e-03 2024-08-06 17:16:39,788 INFO [trainer.py:765] (0/8) Epoch 13, batch 1500, train_loss[loss=3.556, NarTop10Accuracy=0.61, over 6171.00 frames. ], tot_loss[loss=3.258, NarTop10Accuracy=0.6744, over 5962.24 frames. ], batch size: 52, lr: 6.59e-03 2024-08-06 17:17:07,603 INFO [trainer.py:765] (0/8) Epoch 13, batch 1600, train_loss[loss=3.087, NarTop10Accuracy=0.714, over 6921.00 frames. ], tot_loss[loss=3.265, NarTop10Accuracy=0.6731, over 5946.39 frames. ], batch size: 22, lr: 6.58e-03 2024-08-06 17:17:34,261 INFO [trainer.py:765] (0/8) Epoch 13, batch 1700, train_loss[loss=3.339, NarTop10Accuracy=0.6568, over 6204.00 frames. ], tot_loss[loss=3.265, NarTop10Accuracy=0.6727, over 5939.43 frames. ], batch size: 13, lr: 6.57e-03 2024-08-06 17:18:00,762 INFO [trainer.py:765] (0/8) Epoch 13, batch 1800, train_loss[loss=3.147, NarTop10Accuracy=0.7004, over 7143.00 frames. ], tot_loss[loss=3.256, NarTop10Accuracy=0.6747, over 6005.28 frames. ], batch size: 22, lr: 6.56e-03 2024-08-06 17:18:27,244 INFO [trainer.py:765] (0/8) Epoch 13, batch 1900, train_loss[loss=3.611, NarTop10Accuracy=0.6019, over 5484.00 frames. ], tot_loss[loss=3.256, NarTop10Accuracy=0.6751, over 6039.69 frames. ], batch size: 50, lr: 6.55e-03 2024-08-06 17:18:52,779 INFO [trainer.py:765] (0/8) Epoch 13, batch 2000, train_loss[loss=3.529, NarTop10Accuracy=0.6254, over 5853.00 frames. ], tot_loss[loss=3.239, NarTop10Accuracy=0.6788, over 5991.69 frames. ], batch size: 50, lr: 6.54e-03 2024-08-06 17:19:18,148 INFO [trainer.py:765] (0/8) Epoch 13, batch 2100, train_loss[loss=2.864, NarTop10Accuracy=0.7589, over 4788.00 frames. ], tot_loss[loss=3.235, NarTop10Accuracy=0.6793, over 5972.48 frames. ], batch size: 5, lr: 6.53e-03 2024-08-06 17:19:43,412 INFO [trainer.py:765] (0/8) Epoch 13, batch 2200, train_loss[loss=3.385, NarTop10Accuracy=0.6444, over 7482.00 frames. ], tot_loss[loss=3.248, NarTop10Accuracy=0.6766, over 5996.03 frames. ], batch size: 31, lr: 6.52e-03 2024-08-06 17:20:08,543 INFO [trainer.py:765] (0/8) Epoch 13, batch 2300, train_loss[loss=3.626, NarTop10Accuracy=0.6013, over 5571.00 frames. ], tot_loss[loss=3.263, NarTop10Accuracy=0.6734, over 6031.19 frames. ], batch size: 9, lr: 6.51e-03 2024-08-06 17:20:32,940 INFO [trainer.py:765] (0/8) Epoch 13, batch 2400, train_loss[loss=3.568, NarTop10Accuracy=0.6098, over 5253.00 frames. ], tot_loss[loss=3.237, NarTop10Accuracy=0.6785, over 5789.09 frames. ], batch size: 7, lr: 6.50e-03 2024-08-06 17:20:56,409 INFO [trainer.py:765] (0/8) Epoch 13, batch 2500, train_loss[loss=3.547, NarTop10Accuracy=0.6164, over 5034.00 frames. ], tot_loss[loss=3.22, NarTop10Accuracy=0.6815, over 5483.74 frames. ], batch size: 7, lr: 6.49e-03 2024-08-06 17:21:16,347 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 17:21:16,350 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-13.pt 2024-08-06 17:22:19,316 INFO [trainer.py:765] (0/8) Epoch 14, batch 100, train_loss[loss=3.075, NarTop10Accuracy=0.7211, over 7191.00 frames. ], tot_loss[loss=3.219, NarTop10Accuracy=0.6835, over 2371.11 frames. ], batch size: 31, lr: 6.24e-03 2024-08-06 17:22:50,379 INFO [trainer.py:765] (0/8) Epoch 14, batch 200, train_loss[loss=3.243, NarTop10Accuracy=0.6733, over 6819.00 frames. ], tot_loss[loss=3.234, NarTop10Accuracy=0.6799, over 3850.18 frames. ], batch size: 17, lr: 6.23e-03 2024-08-06 17:23:23,881 INFO [trainer.py:765] (0/8) Epoch 14, batch 300, train_loss[loss=3.097, NarTop10Accuracy=0.7019, over 7062.00 frames. ], tot_loss[loss=3.207, NarTop10Accuracy=0.6853, over 4675.03 frames. ], batch size: 22, lr: 6.22e-03 2024-08-06 17:23:57,485 INFO [trainer.py:765] (0/8) Epoch 14, batch 400, train_loss[loss=2.941, NarTop10Accuracy=0.7354, over 5082.00 frames. ], tot_loss[loss=3.226, NarTop10Accuracy=0.6813, over 5136.42 frames. ], batch size: 7, lr: 6.22e-03 2024-08-06 17:24:32,115 INFO [trainer.py:765] (0/8) Epoch 14, batch 500, train_loss[loss=3.21, NarTop10Accuracy=0.6837, over 6105.00 frames. ], tot_loss[loss=3.231, NarTop10Accuracy=0.6797, over 5397.84 frames. ], batch size: 11, lr: 6.21e-03 2024-08-06 17:24:36,214 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 17:24:44,275 INFO [trainer.py:811] (0/8) Epoch 14, validation: loss=3.004, NarTop10Accuracy=0.726, over 1905321.00 frames. 2024-08-06 17:24:44,275 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 28949MB 2024-08-06 17:24:44,823 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.969e+02 2.114e+02 2.287e+02 4.406e+02, threshold=4.227e+02, percent-clipped=0.1 2024-08-06 17:25:12,915 INFO [trainer.py:765] (0/8) Epoch 14, batch 600, train_loss[loss=3.002, NarTop10Accuracy=0.7339, over 5832.00 frames. ], tot_loss[loss=3.238, NarTop10Accuracy=0.6785, over 5648.08 frames. ], batch size: 9, lr: 6.20e-03 2024-08-06 17:25:48,548 INFO [trainer.py:765] (0/8) Epoch 14, batch 700, train_loss[loss=3.524, NarTop10Accuracy=0.6203, over 4995.00 frames. ], tot_loss[loss=3.224, NarTop10Accuracy=0.6814, over 5718.15 frames. ], batch size: 6, lr: 6.19e-03 2024-08-06 17:26:25,279 INFO [trainer.py:765] (0/8) Epoch 14, batch 800, train_loss[loss=2.8, NarTop10Accuracy=0.7669, over 5073.00 frames. ], tot_loss[loss=3.209, NarTop10Accuracy=0.6843, over 5776.56 frames. ], batch size: 6, lr: 6.18e-03 2024-08-06 17:26:57,663 INFO [trainer.py:765] (0/8) Epoch 14, batch 900, train_loss[loss=3.226, NarTop10Accuracy=0.6717, over 6231.00 frames. ], tot_loss[loss=3.21, NarTop10Accuracy=0.6838, over 5803.29 frames. ], batch size: 13, lr: 6.17e-03 2024-08-06 17:27:31,717 INFO [trainer.py:765] (0/8) Epoch 14, batch 1000, train_loss[loss=3.389, NarTop10Accuracy=0.6431, over 6228.00 frames. ], tot_loss[loss=3.226, NarTop10Accuracy=0.6803, over 5903.49 frames. ], batch size: 13, lr: 6.16e-03 2024-08-06 17:28:11,597 INFO [trainer.py:765] (0/8) Epoch 14, batch 1100, train_loss[loss=2.999, NarTop10Accuracy=0.7347, over 6900.00 frames. ], tot_loss[loss=3.222, NarTop10Accuracy=0.6814, over 5946.06 frames. ], batch size: 17, lr: 6.15e-03 2024-08-06 17:28:40,734 INFO [trainer.py:765] (0/8) Epoch 14, batch 1200, train_loss[loss=3.47, NarTop10Accuracy=0.6276, over 7509.00 frames. ], tot_loss[loss=3.221, NarTop10Accuracy=0.6815, over 5950.46 frames. ], batch size: 32, lr: 6.15e-03 2024-08-06 17:29:16,214 INFO [trainer.py:765] (0/8) Epoch 14, batch 1300, train_loss[loss=3.515, NarTop10Accuracy=0.6237, over 5064.00 frames. ], tot_loss[loss=3.22, NarTop10Accuracy=0.6818, over 5985.62 frames. ], batch size: 6, lr: 6.14e-03 2024-08-06 17:29:54,603 INFO [trainer.py:765] (0/8) Epoch 14, batch 1400, train_loss[loss=3.356, NarTop10Accuracy=0.6543, over 6006.00 frames. ], tot_loss[loss=3.234, NarTop10Accuracy=0.6791, over 6009.42 frames. ], batch size: 11, lr: 6.13e-03 2024-08-06 17:30:25,315 INFO [trainer.py:765] (0/8) Epoch 14, batch 1500, train_loss[loss=3.739, NarTop10Accuracy=0.576, over 6477.00 frames. ], tot_loss[loss=3.242, NarTop10Accuracy=0.6774, over 5964.93 frames. ], batch size: 50, lr: 6.12e-03 2024-08-06 17:30:53,043 INFO [trainer.py:765] (0/8) Epoch 14, batch 1600, train_loss[loss=3.009, NarTop10Accuracy=0.7205, over 7251.00 frames. ], tot_loss[loss=3.227, NarTop10Accuracy=0.6804, over 5937.09 frames. ], batch size: 23, lr: 6.11e-03 2024-08-06 17:31:19,728 INFO [trainer.py:765] (0/8) Epoch 14, batch 1700, train_loss[loss=3.235, NarTop10Accuracy=0.6789, over 6180.00 frames. ], tot_loss[loss=3.208, NarTop10Accuracy=0.6843, over 5919.55 frames. ], batch size: 13, lr: 6.10e-03 2024-08-06 17:31:46,289 INFO [trainer.py:765] (0/8) Epoch 14, batch 1800, train_loss[loss=3.033, NarTop10Accuracy=0.7301, over 7149.00 frames. ], tot_loss[loss=3.188, NarTop10Accuracy=0.6883, over 5971.12 frames. ], batch size: 22, lr: 6.09e-03 2024-08-06 17:32:12,727 INFO [trainer.py:765] (0/8) Epoch 14, batch 1900, train_loss[loss=3.694, NarTop10Accuracy=0.585, over 5901.00 frames. ], tot_loss[loss=3.2, NarTop10Accuracy=0.6858, over 6011.86 frames. ], batch size: 50, lr: 6.09e-03 2024-08-06 17:32:38,283 INFO [trainer.py:765] (0/8) Epoch 14, batch 2000, train_loss[loss=3.303, NarTop10Accuracy=0.662, over 6213.00 frames. ], tot_loss[loss=3.212, NarTop10Accuracy=0.6834, over 5985.57 frames. ], batch size: 51, lr: 6.08e-03 2024-08-06 17:33:03,646 INFO [trainer.py:765] (0/8) Epoch 14, batch 2100, train_loss[loss=3.046, NarTop10Accuracy=0.7126, over 3948.00 frames. ], tot_loss[loss=3.22, NarTop10Accuracy=0.6817, over 5973.14 frames. ], batch size: 4, lr: 6.07e-03 2024-08-06 17:33:28,999 INFO [trainer.py:765] (0/8) Epoch 14, batch 2200, train_loss[loss=3.24, NarTop10Accuracy=0.6791, over 7212.00 frames. ], tot_loss[loss=3.217, NarTop10Accuracy=0.6824, over 6011.55 frames. ], batch size: 31, lr: 6.06e-03 2024-08-06 17:33:54,087 INFO [trainer.py:765] (0/8) Epoch 14, batch 2300, train_loss[loss=2.853, NarTop10Accuracy=0.7613, over 5661.00 frames. ], tot_loss[loss=3.235, NarTop10Accuracy=0.6792, over 6035.11 frames. ], batch size: 9, lr: 6.05e-03 2024-08-06 17:34:18,534 INFO [trainer.py:765] (0/8) Epoch 14, batch 2400, train_loss[loss=3.087, NarTop10Accuracy=0.7075, over 5718.00 frames. ], tot_loss[loss=3.237, NarTop10Accuracy=0.6783, over 5783.32 frames. ], batch size: 8, lr: 6.04e-03 2024-08-06 17:34:42,116 INFO [trainer.py:765] (0/8) Epoch 14, batch 2500, train_loss[loss=2.719, NarTop10Accuracy=0.7818, over 5022.00 frames. ], tot_loss[loss=3.206, NarTop10Accuracy=0.685, over 5468.76 frames. ], batch size: 7, lr: 6.04e-03 2024-08-06 17:34:45,395 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 17:34:53,209 INFO [trainer.py:811] (0/8) Epoch 14, validation: loss=3.062, NarTop10Accuracy=0.7136, over 1905321.00 frames. 2024-08-06 17:34:53,209 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 17:34:53,680 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.975e+02 2.132e+02 2.304e+02 3.875e+02, threshold=4.265e+02, percent-clipped=0.0 2024-08-06 17:35:09,685 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 17:35:09,688 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-14.pt 2024-08-06 17:36:11,739 INFO [trainer.py:765] (0/8) Epoch 15, batch 100, train_loss[loss=3.001, NarTop10Accuracy=0.7215, over 7086.00 frames. ], tot_loss[loss=3.22, NarTop10Accuracy=0.6817, over 2365.35 frames. ], batch size: 31, lr: 5.82e-03 2024-08-06 17:36:44,335 INFO [trainer.py:765] (0/8) Epoch 15, batch 200, train_loss[loss=3.455, NarTop10Accuracy=0.6308, over 6765.00 frames. ], tot_loss[loss=3.196, NarTop10Accuracy=0.6868, over 3850.82 frames. ], batch size: 17, lr: 5.81e-03 2024-08-06 17:37:17,715 INFO [trainer.py:765] (0/8) Epoch 15, batch 300, train_loss[loss=3.418, NarTop10Accuracy=0.6403, over 7209.00 frames. ], tot_loss[loss=3.198, NarTop10Accuracy=0.6861, over 4661.39 frames. ], batch size: 22, lr: 5.80e-03 2024-08-06 17:37:48,904 INFO [trainer.py:765] (0/8) Epoch 15, batch 400, train_loss[loss=3.091, NarTop10Accuracy=0.7085, over 5091.00 frames. ], tot_loss[loss=3.187, NarTop10Accuracy=0.6887, over 5099.78 frames. ], batch size: 7, lr: 5.80e-03 2024-08-06 17:38:22,354 INFO [trainer.py:765] (0/8) Epoch 15, batch 500, train_loss[loss=2.905, NarTop10Accuracy=0.7441, over 5946.00 frames. ], tot_loss[loss=3.192, NarTop10Accuracy=0.6874, over 5379.46 frames. ], batch size: 11, lr: 5.79e-03 2024-08-06 17:38:53,094 INFO [trainer.py:765] (0/8) Epoch 15, batch 600, train_loss[loss=3, NarTop10Accuracy=0.7298, over 5712.00 frames. ], tot_loss[loss=3.206, NarTop10Accuracy=0.6848, over 5653.20 frames. ], batch size: 9, lr: 5.78e-03 2024-08-06 17:39:27,922 INFO [trainer.py:765] (0/8) Epoch 15, batch 700, train_loss[loss=2.812, NarTop10Accuracy=0.7682, over 5070.00 frames. ], tot_loss[loss=3.209, NarTop10Accuracy=0.684, over 5715.95 frames. ], batch size: 6, lr: 5.77e-03 2024-08-06 17:40:05,565 INFO [trainer.py:765] (0/8) Epoch 15, batch 800, train_loss[loss=3.469, NarTop10Accuracy=0.6243, over 4227.00 frames. ], tot_loss[loss=3.228, NarTop10Accuracy=0.6802, over 5762.62 frames. ], batch size: 5, lr: 5.76e-03 2024-08-06 17:40:35,791 INFO [trainer.py:765] (0/8) Epoch 15, batch 900, train_loss[loss=3.392, NarTop10Accuracy=0.6436, over 6594.00 frames. ], tot_loss[loss=3.211, NarTop10Accuracy=0.6835, over 5800.27 frames. ], batch size: 14, lr: 5.76e-03 2024-08-06 17:41:11,251 INFO [trainer.py:765] (0/8) Epoch 15, batch 1000, train_loss[loss=3.18, NarTop10Accuracy=0.6969, over 6138.00 frames. ], tot_loss[loss=3.2, NarTop10Accuracy=0.6862, over 5910.69 frames. ], batch size: 13, lr: 5.75e-03 2024-08-06 17:41:46,452 INFO [trainer.py:765] (0/8) Epoch 15, batch 1100, train_loss[loss=3.111, NarTop10Accuracy=0.7018, over 6825.00 frames. ], tot_loss[loss=3.201, NarTop10Accuracy=0.6861, over 5961.40 frames. ], batch size: 17, lr: 5.74e-03 2024-08-06 17:42:19,456 INFO [trainer.py:765] (0/8) Epoch 15, batch 1200, train_loss[loss=3.394, NarTop10Accuracy=0.6501, over 7278.00 frames. ], tot_loss[loss=3.227, NarTop10Accuracy=0.6805, over 5941.87 frames. ], batch size: 31, lr: 5.73e-03 2024-08-06 17:42:54,428 INFO [trainer.py:765] (0/8) Epoch 15, batch 1300, train_loss[loss=3.062, NarTop10Accuracy=0.7173, over 5103.00 frames. ], tot_loss[loss=3.21, NarTop10Accuracy=0.6842, over 5995.19 frames. ], batch size: 6, lr: 5.73e-03 2024-08-06 17:43:26,608 INFO [trainer.py:765] (0/8) Epoch 15, batch 1400, train_loss[loss=3.447, NarTop10Accuracy=0.6329, over 6180.00 frames. ], tot_loss[loss=3.224, NarTop10Accuracy=0.6812, over 6018.80 frames. ], batch size: 11, lr: 5.72e-03 2024-08-06 17:43:56,558 INFO [trainer.py:765] (0/8) Epoch 15, batch 1500, train_loss[loss=3.126, NarTop10Accuracy=0.6993, over 6267.00 frames. ], tot_loss[loss=3.224, NarTop10Accuracy=0.6809, over 5934.90 frames. ], batch size: 51, lr: 5.71e-03 2024-08-06 17:44:24,242 INFO [trainer.py:765] (0/8) Epoch 15, batch 1600, train_loss[loss=3.585, NarTop10Accuracy=0.6123, over 6951.00 frames. ], tot_loss[loss=3.2, NarTop10Accuracy=0.6857, over 5910.68 frames. ], batch size: 22, lr: 5.70e-03 2024-08-06 17:44:50,856 INFO [trainer.py:765] (0/8) Epoch 15, batch 1700, train_loss[loss=2.926, NarTop10Accuracy=0.7407, over 6141.00 frames. ], tot_loss[loss=3.191, NarTop10Accuracy=0.6874, over 5893.57 frames. ], batch size: 13, lr: 5.70e-03 2024-08-06 17:45:17,294 INFO [trainer.py:765] (0/8) Epoch 15, batch 1800, train_loss[loss=3.281, NarTop10Accuracy=0.6724, over 7119.00 frames. ], tot_loss[loss=3.19, NarTop10Accuracy=0.6879, over 5985.83 frames. ], batch size: 22, lr: 5.69e-03 2024-08-06 17:45:43,679 INFO [trainer.py:765] (0/8) Epoch 15, batch 1900, train_loss[loss=3.112, NarTop10Accuracy=0.7147, over 6072.00 frames. ], tot_loss[loss=3.214, NarTop10Accuracy=0.6829, over 6032.65 frames. ], batch size: 50, lr: 5.68e-03 2024-08-06 17:45:53,542 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 17:46:01,742 INFO [trainer.py:811] (0/8) Epoch 15, validation: loss=3.006, NarTop10Accuracy=0.725, over 1905321.00 frames. 2024-08-06 17:46:01,743 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 17:46:02,217 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.004e+02 2.149e+02 2.324e+02 3.721e+02, threshold=4.298e+02, percent-clipped=0.0 2024-08-06 17:46:17,372 INFO [trainer.py:765] (0/8) Epoch 15, batch 2000, train_loss[loss=3.275, NarTop10Accuracy=0.676, over 6285.00 frames. ], tot_loss[loss=3.205, NarTop10Accuracy=0.6848, over 6001.58 frames. ], batch size: 50, lr: 5.67e-03 2024-08-06 17:46:42,773 INFO [trainer.py:765] (0/8) Epoch 15, batch 2100, train_loss[loss=3.199, NarTop10Accuracy=0.6903, over 3942.00 frames. ], tot_loss[loss=3.199, NarTop10Accuracy=0.6862, over 5973.99 frames. ], batch size: 4, lr: 5.67e-03 2024-08-06 17:47:08,033 INFO [trainer.py:765] (0/8) Epoch 15, batch 2200, train_loss[loss=3.069, NarTop10Accuracy=0.7179, over 7128.00 frames. ], tot_loss[loss=3.204, NarTop10Accuracy=0.6851, over 6017.02 frames. ], batch size: 31, lr: 5.66e-03 2024-08-06 17:47:33,291 INFO [trainer.py:765] (0/8) Epoch 15, batch 2300, train_loss[loss=3.414, NarTop10Accuracy=0.6349, over 5559.00 frames. ], tot_loss[loss=3.206, NarTop10Accuracy=0.6847, over 6016.67 frames. ], batch size: 9, lr: 5.65e-03 2024-08-06 17:47:57,640 INFO [trainer.py:765] (0/8) Epoch 15, batch 2400, train_loss[loss=3.148, NarTop10Accuracy=0.6963, over 5124.00 frames. ], tot_loss[loss=3.186, NarTop10Accuracy=0.6891, over 5778.05 frames. ], batch size: 7, lr: 5.65e-03 2024-08-06 17:48:21,162 INFO [trainer.py:765] (0/8) Epoch 15, batch 2500, train_loss[loss=2.822, NarTop10Accuracy=0.7628, over 5163.00 frames. ], tot_loss[loss=3.162, NarTop10Accuracy=0.6938, over 5488.94 frames. ], batch size: 7, lr: 5.64e-03 2024-08-06 17:48:41,347 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 17:48:41,350 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-15.pt 2024-08-06 17:49:41,222 INFO [trainer.py:765] (0/8) Epoch 16, batch 100, train_loss[loss=3.483, NarTop10Accuracy=0.636, over 7254.00 frames. ], tot_loss[loss=3.154, NarTop10Accuracy=0.6959, over 2364.04 frames. ], batch size: 32, lr: 5.45e-03 2024-08-06 17:50:12,158 INFO [trainer.py:765] (0/8) Epoch 16, batch 200, train_loss[loss=2.956, NarTop10Accuracy=0.742, over 6765.00 frames. ], tot_loss[loss=3.198, NarTop10Accuracy=0.6865, over 3859.22 frames. ], batch size: 17, lr: 5.44e-03 2024-08-06 17:50:45,160 INFO [trainer.py:765] (0/8) Epoch 16, batch 300, train_loss[loss=3.146, NarTop10Accuracy=0.697, over 7098.00 frames. ], tot_loss[loss=3.18, NarTop10Accuracy=0.69, over 4659.62 frames. ], batch size: 22, lr: 5.43e-03 2024-08-06 17:51:15,977 INFO [trainer.py:765] (0/8) Epoch 16, batch 400, train_loss[loss=3.31, NarTop10Accuracy=0.6672, over 5163.00 frames. ], tot_loss[loss=3.192, NarTop10Accuracy=0.6877, over 5123.84 frames. ], batch size: 7, lr: 5.43e-03 2024-08-06 17:51:50,324 INFO [trainer.py:765] (0/8) Epoch 16, batch 500, train_loss[loss=2.946, NarTop10Accuracy=0.7386, over 6189.00 frames. ], tot_loss[loss=3.173, NarTop10Accuracy=0.6907, over 5395.53 frames. ], batch size: 11, lr: 5.42e-03 2024-08-06 17:52:24,252 INFO [trainer.py:765] (0/8) Epoch 16, batch 600, train_loss[loss=2.906, NarTop10Accuracy=0.7454, over 5748.00 frames. ], tot_loss[loss=3.189, NarTop10Accuracy=0.6877, over 5652.25 frames. ], batch size: 9, lr: 5.41e-03 2024-08-06 17:52:55,388 INFO [trainer.py:765] (0/8) Epoch 16, batch 700, train_loss[loss=2.916, NarTop10Accuracy=0.7491, over 4317.00 frames. ], tot_loss[loss=3.186, NarTop10Accuracy=0.6885, over 5705.25 frames. ], batch size: 5, lr: 5.41e-03 2024-08-06 17:53:33,816 INFO [trainer.py:765] (0/8) Epoch 16, batch 800, train_loss[loss=3.505, NarTop10Accuracy=0.621, over 4359.00 frames. ], tot_loss[loss=3.184, NarTop10Accuracy=0.6895, over 5782.06 frames. ], batch size: 5, lr: 5.40e-03 2024-08-06 17:54:03,924 INFO [trainer.py:765] (0/8) Epoch 16, batch 900, train_loss[loss=3.433, NarTop10Accuracy=0.6375, over 6750.00 frames. ], tot_loss[loss=3.171, NarTop10Accuracy=0.6921, over 5805.50 frames. ], batch size: 14, lr: 5.39e-03 2024-08-06 17:54:37,608 INFO [trainer.py:765] (0/8) Epoch 16, batch 1000, train_loss[loss=3.001, NarTop10Accuracy=0.7229, over 6210.00 frames. ], tot_loss[loss=3.157, NarTop10Accuracy=0.6944, over 5910.27 frames. ], batch size: 13, lr: 5.39e-03 2024-08-06 17:55:17,197 INFO [trainer.py:765] (0/8) Epoch 16, batch 1100, train_loss[loss=3.136, NarTop10Accuracy=0.6959, over 6990.00 frames. ], tot_loss[loss=3.194, NarTop10Accuracy=0.6871, over 5936.22 frames. ], batch size: 17, lr: 5.38e-03 2024-08-06 17:55:46,210 INFO [trainer.py:765] (0/8) Epoch 16, batch 1200, train_loss[loss=3.473, NarTop10Accuracy=0.6271, over 7092.00 frames. ], tot_loss[loss=3.197, NarTop10Accuracy=0.6865, over 5928.04 frames. ], batch size: 31, lr: 5.37e-03 2024-08-06 17:56:22,776 INFO [trainer.py:765] (0/8) Epoch 16, batch 1300, train_loss[loss=3.351, NarTop10Accuracy=0.6526, over 4332.00 frames. ], tot_loss[loss=3.19, NarTop10Accuracy=0.6878, over 5994.89 frames. ], batch size: 5, lr: 5.37e-03 2024-08-06 17:56:44,649 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 17:56:53,428 INFO [trainer.py:811] (0/8) Epoch 16, validation: loss=3.112, NarTop10Accuracy=0.703, over 1905321.00 frames. 2024-08-06 17:56:53,429 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 17:56:54,007 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.974e+02 2.136e+02 2.310e+02 5.351e+02, threshold=4.271e+02, percent-clipped=0.2 2024-08-06 17:57:06,172 INFO [trainer.py:765] (0/8) Epoch 16, batch 1400, train_loss[loss=3.13, NarTop10Accuracy=0.7066, over 6018.00 frames. ], tot_loss[loss=3.187, NarTop10Accuracy=0.6885, over 6018.44 frames. ], batch size: 11, lr: 5.36e-03 2024-08-06 17:57:34,034 INFO [trainer.py:765] (0/8) Epoch 16, batch 1500, train_loss[loss=3.337, NarTop10Accuracy=0.6617, over 6204.00 frames. ], tot_loss[loss=3.188, NarTop10Accuracy=0.6884, over 5957.50 frames. ], batch size: 50, lr: 5.35e-03 2024-08-06 17:58:01,775 INFO [trainer.py:765] (0/8) Epoch 16, batch 1600, train_loss[loss=3.097, NarTop10Accuracy=0.7113, over 6999.00 frames. ], tot_loss[loss=3.182, NarTop10Accuracy=0.6897, over 5927.64 frames. ], batch size: 22, lr: 5.35e-03 2024-08-06 17:58:28,475 INFO [trainer.py:765] (0/8) Epoch 16, batch 1700, train_loss[loss=2.948, NarTop10Accuracy=0.7383, over 6663.00 frames. ], tot_loss[loss=3.195, NarTop10Accuracy=0.6867, over 5921.45 frames. ], batch size: 14, lr: 5.34e-03 2024-08-06 17:58:54,976 INFO [trainer.py:765] (0/8) Epoch 16, batch 1800, train_loss[loss=3.077, NarTop10Accuracy=0.7017, over 7164.00 frames. ], tot_loss[loss=3.182, NarTop10Accuracy=0.6896, over 5984.18 frames. ], batch size: 22, lr: 5.33e-03 2024-08-06 17:59:21,360 INFO [trainer.py:765] (0/8) Epoch 16, batch 1900, train_loss[loss=3.43, NarTop10Accuracy=0.6393, over 6300.00 frames. ], tot_loss[loss=3.211, NarTop10Accuracy=0.6839, over 6025.45 frames. ], batch size: 50, lr: 5.33e-03 2024-08-06 17:59:46,857 INFO [trainer.py:765] (0/8) Epoch 16, batch 2000, train_loss[loss=3.112, NarTop10Accuracy=0.7041, over 6258.00 frames. ], tot_loss[loss=3.176, NarTop10Accuracy=0.6909, over 5998.99 frames. ], batch size: 50, lr: 5.32e-03 2024-08-06 18:00:12,117 INFO [trainer.py:765] (0/8) Epoch 16, batch 2100, train_loss[loss=3.589, NarTop10Accuracy=0.6167, over 4869.00 frames. ], tot_loss[loss=3.202, NarTop10Accuracy=0.6852, over 5966.16 frames. ], batch size: 5, lr: 5.32e-03 2024-08-06 18:00:37,333 INFO [trainer.py:765] (0/8) Epoch 16, batch 2200, train_loss[loss=3.335, NarTop10Accuracy=0.6617, over 7338.00 frames. ], tot_loss[loss=3.212, NarTop10Accuracy=0.6832, over 5994.09 frames. ], batch size: 31, lr: 5.31e-03 2024-08-06 18:01:02,502 INFO [trainer.py:765] (0/8) Epoch 16, batch 2300, train_loss[loss=2.908, NarTop10Accuracy=0.7404, over 5757.00 frames. ], tot_loss[loss=3.212, NarTop10Accuracy=0.6833, over 6017.76 frames. ], batch size: 9, lr: 5.30e-03 2024-08-06 18:01:26,883 INFO [trainer.py:765] (0/8) Epoch 16, batch 2400, train_loss[loss=2.972, NarTop10Accuracy=0.7393, over 5112.00 frames. ], tot_loss[loss=3.192, NarTop10Accuracy=0.6876, over 5777.34 frames. ], batch size: 7, lr: 5.30e-03 2024-08-06 18:01:50,406 INFO [trainer.py:765] (0/8) Epoch 16, batch 2500, train_loss[loss=3.062, NarTop10Accuracy=0.7069, over 5130.00 frames. ], tot_loss[loss=3.163, NarTop10Accuracy=0.6928, over 5459.22 frames. ], batch size: 7, lr: 5.29e-03 2024-08-06 18:02:11,228 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 18:02:11,233 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-16.pt 2024-08-06 18:03:08,531 INFO [trainer.py:765] (0/8) Epoch 17, batch 100, train_loss[loss=3.107, NarTop10Accuracy=0.7048, over 7176.00 frames. ], tot_loss[loss=3.139, NarTop10Accuracy=0.6986, over 2371.95 frames. ], batch size: 31, lr: 5.12e-03 2024-08-06 18:03:45,146 INFO [trainer.py:765] (0/8) Epoch 17, batch 200, train_loss[loss=3.44, NarTop10Accuracy=0.6419, over 6897.00 frames. ], tot_loss[loss=3.146, NarTop10Accuracy=0.6971, over 3859.46 frames. ], batch size: 17, lr: 5.12e-03 2024-08-06 18:04:19,591 INFO [trainer.py:765] (0/8) Epoch 17, batch 300, train_loss[loss=3.26, NarTop10Accuracy=0.6687, over 7083.00 frames. ], tot_loss[loss=3.166, NarTop10Accuracy=0.6927, over 4651.45 frames. ], batch size: 22, lr: 5.11e-03 2024-08-06 18:04:48,402 INFO [trainer.py:765] (0/8) Epoch 17, batch 400, train_loss[loss=3.361, NarTop10Accuracy=0.6498, over 5007.00 frames. ], tot_loss[loss=3.165, NarTop10Accuracy=0.6929, over 5081.18 frames. ], batch size: 7, lr: 5.10e-03 2024-08-06 18:05:24,681 INFO [trainer.py:765] (0/8) Epoch 17, batch 500, train_loss[loss=2.869, NarTop10Accuracy=0.7504, over 6006.00 frames. ], tot_loss[loss=3.152, NarTop10Accuracy=0.6956, over 5371.02 frames. ], batch size: 11, lr: 5.10e-03 2024-08-06 18:05:58,740 INFO [trainer.py:765] (0/8) Epoch 17, batch 600, train_loss[loss=3.145, NarTop10Accuracy=0.7011, over 5781.00 frames. ], tot_loss[loss=3.169, NarTop10Accuracy=0.6923, over 5647.89 frames. ], batch size: 9, lr: 5.09e-03 2024-08-06 18:06:32,476 INFO [trainer.py:765] (0/8) Epoch 17, batch 700, train_loss[loss=3.03, NarTop10Accuracy=0.72, over 4950.00 frames. ], tot_loss[loss=3.168, NarTop10Accuracy=0.6926, over 5716.50 frames. ], batch size: 6, lr: 5.08e-03 2024-08-06 18:07:02,726 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 18:07:10,763 INFO [trainer.py:811] (0/8) Epoch 17, validation: loss=3.018, NarTop10Accuracy=0.7223, over 1905321.00 frames. 2024-08-06 18:07:10,764 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 18:07:11,312 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.005e+02 2.161e+02 2.341e+02 3.806e+02, threshold=4.323e+02, percent-clipped=0.0 2024-08-06 18:07:14,354 INFO [trainer.py:765] (0/8) Epoch 17, batch 800, train_loss[loss=3.117, NarTop10Accuracy=0.6991, over 5052.00 frames. ], tot_loss[loss=3.181, NarTop10Accuracy=0.6901, over 5775.79 frames. ], batch size: 6, lr: 5.08e-03 2024-08-06 18:07:49,722 INFO [trainer.py:765] (0/8) Epoch 17, batch 900, train_loss[loss=3.445, NarTop10Accuracy=0.6338, over 6261.00 frames. ], tot_loss[loss=3.158, NarTop10Accuracy=0.6946, over 5787.70 frames. ], batch size: 13, lr: 5.07e-03 2024-08-06 18:08:21,598 INFO [trainer.py:765] (0/8) Epoch 17, batch 1000, train_loss[loss=3.29, NarTop10Accuracy=0.6699, over 6660.00 frames. ], tot_loss[loss=3.168, NarTop10Accuracy=0.6927, over 5893.70 frames. ], batch size: 14, lr: 5.07e-03 2024-08-06 18:09:03,107 INFO [trainer.py:765] (0/8) Epoch 17, batch 1100, train_loss[loss=2.899, NarTop10Accuracy=0.747, over 7047.00 frames. ], tot_loss[loss=3.175, NarTop10Accuracy=0.6909, over 5930.68 frames. ], batch size: 18, lr: 5.06e-03 2024-08-06 18:09:36,746 INFO [trainer.py:765] (0/8) Epoch 17, batch 1200, train_loss[loss=3.099, NarTop10Accuracy=0.709, over 7302.00 frames. ], tot_loss[loss=3.171, NarTop10Accuracy=0.6916, over 5925.34 frames. ], batch size: 31, lr: 5.06e-03 2024-08-06 18:10:10,689 INFO [trainer.py:765] (0/8) Epoch 17, batch 1300, train_loss[loss=3.084, NarTop10Accuracy=0.7013, over 5244.00 frames. ], tot_loss[loss=3.169, NarTop10Accuracy=0.6916, over 5994.39 frames. ], batch size: 6, lr: 5.05e-03 2024-08-06 18:10:48,027 INFO [trainer.py:765] (0/8) Epoch 17, batch 1400, train_loss[loss=3.331, NarTop10Accuracy=0.6575, over 6144.00 frames. ], tot_loss[loss=3.181, NarTop10Accuracy=0.6893, over 6008.24 frames. ], batch size: 11, lr: 5.04e-03 2024-08-06 18:11:19,106 INFO [trainer.py:765] (0/8) Epoch 17, batch 1500, train_loss[loss=3.521, NarTop10Accuracy=0.6243, over 6057.00 frames. ], tot_loss[loss=3.171, NarTop10Accuracy=0.6913, over 5971.59 frames. ], batch size: 50, lr: 5.04e-03 2024-08-06 18:11:46,855 INFO [trainer.py:765] (0/8) Epoch 17, batch 1600, train_loss[loss=3.096, NarTop10Accuracy=0.7115, over 7062.00 frames. ], tot_loss[loss=3.157, NarTop10Accuracy=0.6944, over 5934.37 frames. ], batch size: 22, lr: 5.03e-03 2024-08-06 18:12:13,509 INFO [trainer.py:765] (0/8) Epoch 17, batch 1700, train_loss[loss=3.595, NarTop10Accuracy=0.6023, over 6174.00 frames. ], tot_loss[loss=3.174, NarTop10Accuracy=0.6909, over 5912.50 frames. ], batch size: 13, lr: 5.03e-03 2024-08-06 18:12:40,002 INFO [trainer.py:765] (0/8) Epoch 17, batch 1800, train_loss[loss=2.894, NarTop10Accuracy=0.7465, over 7209.00 frames. ], tot_loss[loss=3.18, NarTop10Accuracy=0.6896, over 5974.11 frames. ], batch size: 23, lr: 5.02e-03 2024-08-06 18:13:06,380 INFO [trainer.py:765] (0/8) Epoch 17, batch 1900, train_loss[loss=3.074, NarTop10Accuracy=0.7157, over 6939.00 frames. ], tot_loss[loss=3.196, NarTop10Accuracy=0.6865, over 6031.39 frames. ], batch size: 51, lr: 5.01e-03 2024-08-06 18:13:31,923 INFO [trainer.py:765] (0/8) Epoch 17, batch 2000, train_loss[loss=3.543, NarTop10Accuracy=0.6241, over 6264.00 frames. ], tot_loss[loss=3.173, NarTop10Accuracy=0.6913, over 5998.19 frames. ], batch size: 50, lr: 5.01e-03 2024-08-06 18:13:57,229 INFO [trainer.py:765] (0/8) Epoch 17, batch 2100, train_loss[loss=2.858, NarTop10Accuracy=0.7498, over 3933.00 frames. ], tot_loss[loss=3.176, NarTop10Accuracy=0.6906, over 5974.67 frames. ], batch size: 4, lr: 5.00e-03 2024-08-06 18:14:22,435 INFO [trainer.py:765] (0/8) Epoch 17, batch 2200, train_loss[loss=2.983, NarTop10Accuracy=0.7339, over 7374.00 frames. ], tot_loss[loss=3.193, NarTop10Accuracy=0.6869, over 6007.88 frames. ], batch size: 31, lr: 5.00e-03 2024-08-06 18:14:47,592 INFO [trainer.py:765] (0/8) Epoch 17, batch 2300, train_loss[loss=2.966, NarTop10Accuracy=0.7362, over 5673.00 frames. ], tot_loss[loss=3.185, NarTop10Accuracy=0.6887, over 6026.88 frames. ], batch size: 9, lr: 4.99e-03 2024-08-06 18:15:12,061 INFO [trainer.py:765] (0/8) Epoch 17, batch 2400, train_loss[loss=2.834, NarTop10Accuracy=0.7567, over 5094.00 frames. ], tot_loss[loss=3.18, NarTop10Accuracy=0.6891, over 5781.03 frames. ], batch size: 7, lr: 4.99e-03 2024-08-06 18:15:35,515 INFO [trainer.py:765] (0/8) Epoch 17, batch 2500, train_loss[loss=2.984, NarTop10Accuracy=0.7275, over 5166.00 frames. ], tot_loss[loss=3.166, NarTop10Accuracy=0.6915, over 5475.78 frames. ], batch size: 7, lr: 4.98e-03 2024-08-06 18:15:55,790 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 18:15:55,795 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-17.pt 2024-08-06 18:16:49,908 INFO [trainer.py:765] (0/8) Epoch 18, batch 100, train_loss[loss=2.982, NarTop10Accuracy=0.7334, over 7287.00 frames. ], tot_loss[loss=3.172, NarTop10Accuracy=0.6917, over 2363.43 frames. ], batch size: 31, lr: 4.83e-03 2024-08-06 18:17:24,749 INFO [trainer.py:765] (0/8) Epoch 18, batch 200, train_loss[loss=2.958, NarTop10Accuracy=0.742, over 6777.00 frames. ], tot_loss[loss=3.165, NarTop10Accuracy=0.6931, over 3857.58 frames. ], batch size: 17, lr: 4.83e-03 2024-08-06 18:17:27,715 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 18:17:35,927 INFO [trainer.py:811] (0/8) Epoch 18, validation: loss=3.062, NarTop10Accuracy=0.7137, over 1905321.00 frames. 2024-08-06 18:17:35,928 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 18:17:36,528 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.024e+02 2.164e+02 2.334e+02 7.024e+02, threshold=4.329e+02, percent-clipped=0.1 2024-08-06 18:18:06,912 INFO [trainer.py:765] (0/8) Epoch 18, batch 300, train_loss[loss=3.497, NarTop10Accuracy=0.6309, over 7188.00 frames. ], tot_loss[loss=3.164, NarTop10Accuracy=0.6937, over 4661.37 frames. ], batch size: 22, lr: 4.82e-03 2024-08-06 18:18:38,183 INFO [trainer.py:765] (0/8) Epoch 18, batch 400, train_loss[loss=3.36, NarTop10Accuracy=0.6571, over 5130.00 frames. ], tot_loss[loss=3.153, NarTop10Accuracy=0.6956, over 5100.37 frames. ], batch size: 7, lr: 4.81e-03 2024-08-06 18:19:13,599 INFO [trainer.py:765] (0/8) Epoch 18, batch 500, train_loss[loss=3.259, NarTop10Accuracy=0.6812, over 6081.00 frames. ], tot_loss[loss=3.151, NarTop10Accuracy=0.6955, over 5398.59 frames. ], batch size: 11, lr: 4.81e-03 2024-08-06 18:19:48,151 INFO [trainer.py:765] (0/8) Epoch 18, batch 600, train_loss[loss=3.272, NarTop10Accuracy=0.6705, over 5670.00 frames. ], tot_loss[loss=3.154, NarTop10Accuracy=0.6951, over 5662.72 frames. ], batch size: 9, lr: 4.80e-03 2024-08-06 18:20:23,869 INFO [trainer.py:765] (0/8) Epoch 18, batch 700, train_loss[loss=3.371, NarTop10Accuracy=0.6585, over 5010.00 frames. ], tot_loss[loss=3.159, NarTop10Accuracy=0.6941, over 5737.39 frames. ], batch size: 6, lr: 4.80e-03 2024-08-06 18:21:01,026 INFO [trainer.py:765] (0/8) Epoch 18, batch 800, train_loss[loss=2.648, NarTop10Accuracy=0.7882, over 4251.00 frames. ], tot_loss[loss=3.163, NarTop10Accuracy=0.6934, over 5790.69 frames. ], batch size: 5, lr: 4.79e-03 2024-08-06 18:21:32,408 INFO [trainer.py:765] (0/8) Epoch 18, batch 900, train_loss[loss=3.035, NarTop10Accuracy=0.7194, over 6681.00 frames. ], tot_loss[loss=3.147, NarTop10Accuracy=0.6966, over 5806.67 frames. ], batch size: 14, lr: 4.79e-03 2024-08-06 18:22:11,192 INFO [trainer.py:765] (0/8) Epoch 18, batch 1000, train_loss[loss=2.987, NarTop10Accuracy=0.7268, over 6237.00 frames. ], tot_loss[loss=3.164, NarTop10Accuracy=0.6928, over 5909.81 frames. ], batch size: 13, lr: 4.78e-03 2024-08-06 18:22:46,969 INFO [trainer.py:765] (0/8) Epoch 18, batch 1100, train_loss[loss=3.373, NarTop10Accuracy=0.6557, over 6801.00 frames. ], tot_loss[loss=3.159, NarTop10Accuracy=0.6942, over 5932.48 frames. ], batch size: 17, lr: 4.78e-03 2024-08-06 18:23:18,604 INFO [trainer.py:765] (0/8) Epoch 18, batch 1200, train_loss[loss=3.535, NarTop10Accuracy=0.6123, over 7419.00 frames. ], tot_loss[loss=3.178, NarTop10Accuracy=0.6905, over 5934.77 frames. ], batch size: 31, lr: 4.77e-03 2024-08-06 18:24:00,099 INFO [trainer.py:765] (0/8) Epoch 18, batch 1300, train_loss[loss=3.231, NarTop10Accuracy=0.6853, over 5112.00 frames. ], tot_loss[loss=3.159, NarTop10Accuracy=0.6941, over 6018.75 frames. ], batch size: 6, lr: 4.77e-03 2024-08-06 18:24:29,574 INFO [trainer.py:765] (0/8) Epoch 18, batch 1400, train_loss[loss=3.052, NarTop10Accuracy=0.7142, over 6051.00 frames. ], tot_loss[loss=3.157, NarTop10Accuracy=0.6946, over 6013.57 frames. ], batch size: 11, lr: 4.76e-03 2024-08-06 18:25:00,307 INFO [trainer.py:765] (0/8) Epoch 18, batch 1500, train_loss[loss=3.143, NarTop10Accuracy=0.7065, over 6234.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6966, over 5944.61 frames. ], batch size: 51, lr: 4.76e-03 2024-08-06 18:25:28,085 INFO [trainer.py:765] (0/8) Epoch 18, batch 1600, train_loss[loss=3.033, NarTop10Accuracy=0.7189, over 7125.00 frames. ], tot_loss[loss=3.159, NarTop10Accuracy=0.6944, over 5922.43 frames. ], batch size: 22, lr: 4.75e-03 2024-08-06 18:25:54,687 INFO [trainer.py:765] (0/8) Epoch 18, batch 1700, train_loss[loss=3.13, NarTop10Accuracy=0.6891, over 6288.00 frames. ], tot_loss[loss=3.164, NarTop10Accuracy=0.6936, over 5908.33 frames. ], batch size: 13, lr: 4.75e-03 2024-08-06 18:26:21,196 INFO [trainer.py:765] (0/8) Epoch 18, batch 1800, train_loss[loss=3.554, NarTop10Accuracy=0.6131, over 6999.00 frames. ], tot_loss[loss=3.158, NarTop10Accuracy=0.6947, over 5980.28 frames. ], batch size: 22, lr: 4.74e-03 2024-08-06 18:26:47,566 INFO [trainer.py:765] (0/8) Epoch 18, batch 1900, train_loss[loss=3.023, NarTop10Accuracy=0.725, over 5964.00 frames. ], tot_loss[loss=3.173, NarTop10Accuracy=0.6915, over 6015.40 frames. ], batch size: 50, lr: 4.74e-03 2024-08-06 18:27:13,176 INFO [trainer.py:765] (0/8) Epoch 18, batch 2000, train_loss[loss=3.136, NarTop10Accuracy=0.7086, over 5664.00 frames. ], tot_loss[loss=3.16, NarTop10Accuracy=0.694, over 6006.64 frames. ], batch size: 51, lr: 4.73e-03 2024-08-06 18:27:38,528 INFO [trainer.py:765] (0/8) Epoch 18, batch 2100, train_loss[loss=3.302, NarTop10Accuracy=0.6638, over 3990.00 frames. ], tot_loss[loss=3.159, NarTop10Accuracy=0.6942, over 5979.73 frames. ], batch size: 4, lr: 4.73e-03 2024-08-06 18:28:03,811 INFO [trainer.py:765] (0/8) Epoch 18, batch 2200, train_loss[loss=3.095, NarTop10Accuracy=0.7078, over 7290.00 frames. ], tot_loss[loss=3.16, NarTop10Accuracy=0.694, over 6009.81 frames. ], batch size: 31, lr: 4.72e-03 2024-08-06 18:28:06,570 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 18:28:14,649 INFO [trainer.py:811] (0/8) Epoch 18, validation: loss=3.028, NarTop10Accuracy=0.7201, over 1905321.00 frames. 2024-08-06 18:28:14,650 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 18:28:15,147 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.054e+02 2.220e+02 2.384e+02 3.992e+02, threshold=4.441e+02, percent-clipped=0.0 2024-08-06 18:28:37,097 INFO [trainer.py:765] (0/8) Epoch 18, batch 2300, train_loss[loss=2.902, NarTop10Accuracy=0.7491, over 5712.00 frames. ], tot_loss[loss=3.175, NarTop10Accuracy=0.6908, over 6013.15 frames. ], batch size: 9, lr: 4.72e-03 2024-08-06 18:29:01,593 INFO [trainer.py:765] (0/8) Epoch 18, batch 2400, train_loss[loss=2.812, NarTop10Accuracy=0.7603, over 5217.00 frames. ], tot_loss[loss=3.152, NarTop10Accuracy=0.6956, over 5776.49 frames. ], batch size: 7, lr: 4.71e-03 2024-08-06 18:29:25,028 INFO [trainer.py:765] (0/8) Epoch 18, batch 2500, train_loss[loss=2.93, NarTop10Accuracy=0.7465, over 5658.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.7001, over 5483.81 frames. ], batch size: 8, lr: 4.71e-03 2024-08-06 18:29:45,449 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 18:29:45,453 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-18.pt 2024-08-06 18:30:41,231 INFO [trainer.py:765] (0/8) Epoch 19, batch 100, train_loss[loss=2.951, NarTop10Accuracy=0.7314, over 7236.00 frames. ], tot_loss[loss=3.165, NarTop10Accuracy=0.6932, over 2364.78 frames. ], batch size: 31, lr: 4.57e-03 2024-08-06 18:31:15,602 INFO [trainer.py:765] (0/8) Epoch 19, batch 200, train_loss[loss=3.001, NarTop10Accuracy=0.726, over 6852.00 frames. ], tot_loss[loss=3.156, NarTop10Accuracy=0.6956, over 3863.82 frames. ], batch size: 17, lr: 4.57e-03 2024-08-06 18:31:47,468 INFO [trainer.py:765] (0/8) Epoch 19, batch 300, train_loss[loss=3.46, NarTop10Accuracy=0.6228, over 7287.00 frames. ], tot_loss[loss=3.137, NarTop10Accuracy=0.6984, over 4669.20 frames. ], batch size: 22, lr: 4.56e-03 2024-08-06 18:32:20,355 INFO [trainer.py:765] (0/8) Epoch 19, batch 400, train_loss[loss=3.167, NarTop10Accuracy=0.6891, over 5181.00 frames. ], tot_loss[loss=3.14, NarTop10Accuracy=0.6984, over 5115.75 frames. ], batch size: 7, lr: 4.56e-03 2024-08-06 18:32:50,335 INFO [trainer.py:765] (0/8) Epoch 19, batch 500, train_loss[loss=2.915, NarTop10Accuracy=0.7288, over 6084.00 frames. ], tot_loss[loss=3.139, NarTop10Accuracy=0.698, over 5381.05 frames. ], batch size: 11, lr: 4.55e-03 2024-08-06 18:33:29,610 INFO [trainer.py:765] (0/8) Epoch 19, batch 600, train_loss[loss=2.934, NarTop10Accuracy=0.739, over 5718.00 frames. ], tot_loss[loss=3.145, NarTop10Accuracy=0.6967, over 5657.15 frames. ], batch size: 9, lr: 4.55e-03 2024-08-06 18:34:03,591 INFO [trainer.py:765] (0/8) Epoch 19, batch 700, train_loss[loss=3.029, NarTop10Accuracy=0.7277, over 5022.00 frames. ], tot_loss[loss=3.15, NarTop10Accuracy=0.6956, over 5712.90 frames. ], batch size: 6, lr: 4.54e-03 2024-08-06 18:34:35,179 INFO [trainer.py:765] (0/8) Epoch 19, batch 800, train_loss[loss=3.31, NarTop10Accuracy=0.6664, over 5106.00 frames. ], tot_loss[loss=3.156, NarTop10Accuracy=0.6949, over 5771.82 frames. ], batch size: 6, lr: 4.54e-03 2024-08-06 18:35:10,263 INFO [trainer.py:765] (0/8) Epoch 19, batch 900, train_loss[loss=2.845, NarTop10Accuracy=0.752, over 6180.00 frames. ], tot_loss[loss=3.143, NarTop10Accuracy=0.6975, over 5800.61 frames. ], batch size: 13, lr: 4.53e-03 2024-08-06 18:35:48,637 INFO [trainer.py:765] (0/8) Epoch 19, batch 1000, train_loss[loss=3.449, NarTop10Accuracy=0.6276, over 6219.00 frames. ], tot_loss[loss=3.143, NarTop10Accuracy=0.6977, over 5903.86 frames. ], batch size: 13, lr: 4.53e-03 2024-08-06 18:36:20,938 INFO [trainer.py:765] (0/8) Epoch 19, batch 1100, train_loss[loss=2.986, NarTop10Accuracy=0.734, over 6816.00 frames. ], tot_loss[loss=3.153, NarTop10Accuracy=0.6952, over 5949.82 frames. ], batch size: 17, lr: 4.52e-03 2024-08-06 18:36:57,130 INFO [trainer.py:765] (0/8) Epoch 19, batch 1200, train_loss[loss=3.055, NarTop10Accuracy=0.7205, over 7224.00 frames. ], tot_loss[loss=3.168, NarTop10Accuracy=0.692, over 5938.06 frames. ], batch size: 31, lr: 4.52e-03 2024-08-06 18:37:35,315 INFO [trainer.py:765] (0/8) Epoch 19, batch 1300, train_loss[loss=2.974, NarTop10Accuracy=0.7273, over 4980.00 frames. ], tot_loss[loss=3.166, NarTop10Accuracy=0.6923, over 5993.88 frames. ], batch size: 6, lr: 4.51e-03 2024-08-06 18:38:04,679 INFO [trainer.py:765] (0/8) Epoch 19, batch 1400, train_loss[loss=3.054, NarTop10Accuracy=0.7176, over 6147.00 frames. ], tot_loss[loss=3.169, NarTop10Accuracy=0.6919, over 6025.52 frames. ], batch size: 11, lr: 4.51e-03 2024-08-06 18:38:34,550 INFO [trainer.py:765] (0/8) Epoch 19, batch 1500, train_loss[loss=3.432, NarTop10Accuracy=0.641, over 6135.00 frames. ], tot_loss[loss=3.15, NarTop10Accuracy=0.6956, over 5967.31 frames. ], batch size: 53, lr: 4.50e-03 2024-08-06 18:39:02,311 INFO [trainer.py:765] (0/8) Epoch 19, batch 1600, train_loss[loss=3.485, NarTop10Accuracy=0.6285, over 7185.00 frames. ], tot_loss[loss=3.144, NarTop10Accuracy=0.6969, over 5942.89 frames. ], batch size: 22, lr: 4.50e-03 2024-08-06 18:39:11,589 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 18:39:19,795 INFO [trainer.py:811] (0/8) Epoch 19, validation: loss=2.958, NarTop10Accuracy=0.7345, over 1905321.00 frames. 2024-08-06 18:39:19,795 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 18:39:20,378 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.040e+02 2.194e+02 2.364e+02 6.410e+02, threshold=4.387e+02, percent-clipped=0.2 2024-08-06 18:39:37,192 INFO [trainer.py:765] (0/8) Epoch 19, batch 1700, train_loss[loss=3.485, NarTop10Accuracy=0.6302, over 6240.00 frames. ], tot_loss[loss=3.145, NarTop10Accuracy=0.6964, over 5922.00 frames. ], batch size: 13, lr: 4.49e-03 2024-08-06 18:40:03,789 INFO [trainer.py:765] (0/8) Epoch 19, batch 1800, train_loss[loss=3.538, NarTop10Accuracy=0.617, over 7275.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6961, over 5984.17 frames. ], batch size: 23, lr: 4.49e-03 2024-08-06 18:40:30,217 INFO [trainer.py:765] (0/8) Epoch 19, batch 1900, train_loss[loss=3.096, NarTop10Accuracy=0.7076, over 5937.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6965, over 6011.84 frames. ], batch size: 50, lr: 4.49e-03 2024-08-06 18:40:55,793 INFO [trainer.py:765] (0/8) Epoch 19, batch 2000, train_loss[loss=3.303, NarTop10Accuracy=0.6625, over 6414.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6964, over 5986.55 frames. ], batch size: 50, lr: 4.48e-03 2024-08-06 18:41:21,183 INFO [trainer.py:765] (0/8) Epoch 19, batch 2100, train_loss[loss=3.075, NarTop10Accuracy=0.7175, over 3891.00 frames. ], tot_loss[loss=3.138, NarTop10Accuracy=0.6984, over 5966.23 frames. ], batch size: 4, lr: 4.48e-03 2024-08-06 18:41:46,455 INFO [trainer.py:765] (0/8) Epoch 19, batch 2200, train_loss[loss=3.117, NarTop10Accuracy=0.6987, over 7449.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6961, over 5995.22 frames. ], batch size: 33, lr: 4.47e-03 2024-08-06 18:42:11,559 INFO [trainer.py:765] (0/8) Epoch 19, batch 2300, train_loss[loss=3.109, NarTop10Accuracy=0.6974, over 5631.00 frames. ], tot_loss[loss=3.161, NarTop10Accuracy=0.6939, over 6011.12 frames. ], batch size: 9, lr: 4.47e-03 2024-08-06 18:42:35,987 INFO [trainer.py:765] (0/8) Epoch 19, batch 2400, train_loss[loss=2.936, NarTop10Accuracy=0.7396, over 5184.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6967, over 5787.86 frames. ], batch size: 7, lr: 4.46e-03 2024-08-06 18:42:59,690 INFO [trainer.py:765] (0/8) Epoch 19, batch 2500, train_loss[loss=2.78, NarTop10Accuracy=0.7644, over 5244.00 frames. ], tot_loss[loss=3.131, NarTop10Accuracy=0.6994, over 5470.51 frames. ], batch size: 7, lr: 4.46e-03 2024-08-06 18:43:19,776 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 18:43:19,779 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-19.pt 2024-08-06 18:44:22,974 INFO [trainer.py:765] (0/8) Epoch 20, batch 100, train_loss[loss=3.325, NarTop10Accuracy=0.6555, over 7494.00 frames. ], tot_loss[loss=3.158, NarTop10Accuracy=0.6945, over 2374.45 frames. ], batch size: 31, lr: 4.34e-03 2024-08-06 18:44:58,379 INFO [trainer.py:765] (0/8) Epoch 20, batch 200, train_loss[loss=3.37, NarTop10Accuracy=0.6608, over 6822.00 frames. ], tot_loss[loss=3.132, NarTop10Accuracy=0.6996, over 3850.26 frames. ], batch size: 17, lr: 4.33e-03 2024-08-06 18:45:32,279 INFO [trainer.py:765] (0/8) Epoch 20, batch 300, train_loss[loss=3.455, NarTop10Accuracy=0.6299, over 7338.00 frames. ], tot_loss[loss=3.12, NarTop10Accuracy=0.7018, over 4667.75 frames. ], batch size: 23, lr: 4.33e-03 2024-08-06 18:46:05,128 INFO [trainer.py:765] (0/8) Epoch 20, batch 400, train_loss[loss=2.753, NarTop10Accuracy=0.7702, over 5148.00 frames. ], tot_loss[loss=3.117, NarTop10Accuracy=0.7026, over 5114.08 frames. ], batch size: 7, lr: 4.32e-03 2024-08-06 18:46:35,770 INFO [trainer.py:765] (0/8) Epoch 20, batch 500, train_loss[loss=2.818, NarTop10Accuracy=0.7584, over 6183.00 frames. ], tot_loss[loss=3.125, NarTop10Accuracy=0.7009, over 5379.38 frames. ], batch size: 11, lr: 4.32e-03 2024-08-06 18:47:13,255 INFO [trainer.py:765] (0/8) Epoch 20, batch 600, train_loss[loss=3.077, NarTop10Accuracy=0.702, over 5787.00 frames. ], tot_loss[loss=3.123, NarTop10Accuracy=0.7014, over 5638.99 frames. ], batch size: 9, lr: 4.31e-03 2024-08-06 18:47:44,482 INFO [trainer.py:765] (0/8) Epoch 20, batch 700, train_loss[loss=2.742, NarTop10Accuracy=0.7766, over 4998.00 frames. ], tot_loss[loss=3.111, NarTop10Accuracy=0.704, over 5705.02 frames. ], batch size: 6, lr: 4.31e-03 2024-08-06 18:48:21,016 INFO [trainer.py:765] (0/8) Epoch 20, batch 800, train_loss[loss=2.753, NarTop10Accuracy=0.7758, over 4305.00 frames. ], tot_loss[loss=3.129, NarTop10Accuracy=0.7002, over 5766.57 frames. ], batch size: 5, lr: 4.31e-03 2024-08-06 18:48:56,535 INFO [trainer.py:765] (0/8) Epoch 20, batch 900, train_loss[loss=2.914, NarTop10Accuracy=0.7482, over 6681.00 frames. ], tot_loss[loss=3.126, NarTop10Accuracy=0.7007, over 5788.27 frames. ], batch size: 14, lr: 4.30e-03 2024-08-06 18:49:29,805 INFO [trainer.py:765] (0/8) Epoch 20, batch 1000, train_loss[loss=3.133, NarTop10Accuracy=0.6954, over 6756.00 frames. ], tot_loss[loss=3.151, NarTop10Accuracy=0.6952, over 5885.12 frames. ], batch size: 14, lr: 4.30e-03 2024-08-06 18:49:52,237 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 18:50:00,326 INFO [trainer.py:811] (0/8) Epoch 20, validation: loss=2.962, NarTop10Accuracy=0.7336, over 1905321.00 frames. 2024-08-06 18:50:00,327 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 18:50:00,875 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.061e+02 2.223e+02 2.401e+02 3.871e+02, threshold=4.447e+02, percent-clipped=0.0 2024-08-06 18:50:15,428 INFO [trainer.py:765] (0/8) Epoch 20, batch 1100, train_loss[loss=3.149, NarTop10Accuracy=0.6895, over 7005.00 frames. ], tot_loss[loss=3.141, NarTop10Accuracy=0.6972, over 5922.15 frames. ], batch size: 17, lr: 4.29e-03 2024-08-06 18:50:53,776 INFO [trainer.py:765] (0/8) Epoch 20, batch 1200, train_loss[loss=3.085, NarTop10Accuracy=0.7123, over 7068.00 frames. ], tot_loss[loss=3.145, NarTop10Accuracy=0.6963, over 5924.01 frames. ], batch size: 31, lr: 4.29e-03 2024-08-06 18:51:25,130 INFO [trainer.py:765] (0/8) Epoch 20, batch 1300, train_loss[loss=3.232, NarTop10Accuracy=0.6779, over 5085.00 frames. ], tot_loss[loss=3.144, NarTop10Accuracy=0.6965, over 6003.78 frames. ], batch size: 6, lr: 4.29e-03 2024-08-06 18:51:59,315 INFO [trainer.py:765] (0/8) Epoch 20, batch 1400, train_loss[loss=2.93, NarTop10Accuracy=0.7325, over 5955.00 frames. ], tot_loss[loss=3.131, NarTop10Accuracy=0.6993, over 6020.72 frames. ], batch size: 11, lr: 4.28e-03 2024-08-06 18:52:32,806 INFO [trainer.py:765] (0/8) Epoch 20, batch 1500, train_loss[loss=3.341, NarTop10Accuracy=0.6577, over 6567.00 frames. ], tot_loss[loss=3.14, NarTop10Accuracy=0.6971, over 5965.48 frames. ], batch size: 50, lr: 4.28e-03 2024-08-06 18:53:00,635 INFO [trainer.py:765] (0/8) Epoch 20, batch 1600, train_loss[loss=2.921, NarTop10Accuracy=0.7344, over 7455.00 frames. ], tot_loss[loss=3.148, NarTop10Accuracy=0.6957, over 5947.51 frames. ], batch size: 23, lr: 4.27e-03 2024-08-06 18:53:27,328 INFO [trainer.py:765] (0/8) Epoch 20, batch 1700, train_loss[loss=3.523, NarTop10Accuracy=0.615, over 6288.00 frames. ], tot_loss[loss=3.147, NarTop10Accuracy=0.6961, over 5937.10 frames. ], batch size: 13, lr: 4.27e-03 2024-08-06 18:53:53,851 INFO [trainer.py:765] (0/8) Epoch 20, batch 1800, train_loss[loss=3.113, NarTop10Accuracy=0.6993, over 7227.00 frames. ], tot_loss[loss=3.132, NarTop10Accuracy=0.699, over 5999.40 frames. ], batch size: 22, lr: 4.26e-03 2024-08-06 18:54:20,316 INFO [trainer.py:765] (0/8) Epoch 20, batch 1900, train_loss[loss=3.105, NarTop10Accuracy=0.7113, over 5943.00 frames. ], tot_loss[loss=3.162, NarTop10Accuracy=0.6935, over 6035.14 frames. ], batch size: 50, lr: 4.26e-03 2024-08-06 18:54:45,890 INFO [trainer.py:765] (0/8) Epoch 20, batch 2000, train_loss[loss=3.606, NarTop10Accuracy=0.5992, over 6099.00 frames. ], tot_loss[loss=3.162, NarTop10Accuracy=0.6932, over 6003.14 frames. ], batch size: 50, lr: 4.26e-03 2024-08-06 18:55:11,183 INFO [trainer.py:765] (0/8) Epoch 20, batch 2100, train_loss[loss=3.428, NarTop10Accuracy=0.6317, over 4725.00 frames. ], tot_loss[loss=3.159, NarTop10Accuracy=0.6936, over 5967.21 frames. ], batch size: 5, lr: 4.25e-03 2024-08-06 18:55:36,415 INFO [trainer.py:765] (0/8) Epoch 20, batch 2200, train_loss[loss=2.939, NarTop10Accuracy=0.7404, over 7365.00 frames. ], tot_loss[loss=3.158, NarTop10Accuracy=0.6939, over 6015.14 frames. ], batch size: 31, lr: 4.25e-03 2024-08-06 18:56:01,636 INFO [trainer.py:765] (0/8) Epoch 20, batch 2300, train_loss[loss=3.199, NarTop10Accuracy=0.6775, over 5595.00 frames. ], tot_loss[loss=3.165, NarTop10Accuracy=0.6925, over 6038.35 frames. ], batch size: 9, lr: 4.24e-03 2024-08-06 18:56:26,050 INFO [trainer.py:765] (0/8) Epoch 20, batch 2400, train_loss[loss=2.885, NarTop10Accuracy=0.7565, over 5094.00 frames. ], tot_loss[loss=3.151, NarTop10Accuracy=0.6951, over 5772.83 frames. ], batch size: 7, lr: 4.24e-03 2024-08-06 18:56:49,566 INFO [trainer.py:765] (0/8) Epoch 20, batch 2500, train_loss[loss=2.936, NarTop10Accuracy=0.7448, over 5118.00 frames. ], tot_loss[loss=3.116, NarTop10Accuracy=0.7019, over 5469.27 frames. ], batch size: 7, lr: 4.24e-03 2024-08-06 18:57:09,280 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 18:57:09,284 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-20.pt 2024-08-06 18:58:09,585 INFO [trainer.py:765] (0/8) Epoch 21, batch 100, train_loss[loss=3.083, NarTop10Accuracy=0.7084, over 7230.00 frames. ], tot_loss[loss=3.114, NarTop10Accuracy=0.7039, over 2380.81 frames. ], batch size: 31, lr: 4.13e-03 2024-08-06 18:58:40,417 INFO [trainer.py:765] (0/8) Epoch 21, batch 200, train_loss[loss=2.868, NarTop10Accuracy=0.7501, over 6966.00 frames. ], tot_loss[loss=3.131, NarTop10Accuracy=0.7, over 3874.31 frames. ], batch size: 17, lr: 4.12e-03 2024-08-06 18:59:13,333 INFO [trainer.py:765] (0/8) Epoch 21, batch 300, train_loss[loss=2.873, NarTop10Accuracy=0.7475, over 7107.00 frames. ], tot_loss[loss=3.134, NarTop10Accuracy=0.6991, over 4680.79 frames. ], batch size: 22, lr: 4.12e-03 2024-08-06 18:59:48,151 INFO [trainer.py:765] (0/8) Epoch 21, batch 400, train_loss[loss=2.859, NarTop10Accuracy=0.7488, over 5289.00 frames. ], tot_loss[loss=3.116, NarTop10Accuracy=0.7027, over 5100.97 frames. ], batch size: 7, lr: 4.11e-03 2024-08-06 19:00:16,840 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 19:00:25,075 INFO [trainer.py:811] (0/8) Epoch 21, validation: loss=2.992, NarTop10Accuracy=0.7268, over 1905321.00 frames. 2024-08-06 19:00:25,076 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 19:00:25,622 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.071e+02 2.224e+02 2.387e+02 3.839e+02, threshold=4.447e+02, percent-clipped=0.0 2024-08-06 19:00:29,890 INFO [trainer.py:765] (0/8) Epoch 21, batch 500, train_loss[loss=2.895, NarTop10Accuracy=0.747, over 5961.00 frames. ], tot_loss[loss=3.113, NarTop10Accuracy=0.703, over 5369.01 frames. ], batch size: 11, lr: 4.11e-03 2024-08-06 19:01:03,329 INFO [trainer.py:765] (0/8) Epoch 21, batch 600, train_loss[loss=3.431, NarTop10Accuracy=0.6379, over 5709.00 frames. ], tot_loss[loss=3.104, NarTop10Accuracy=0.7053, over 5646.07 frames. ], batch size: 9, lr: 4.11e-03 2024-08-06 19:01:39,388 INFO [trainer.py:765] (0/8) Epoch 21, batch 700, train_loss[loss=2.807, NarTop10Accuracy=0.7708, over 5163.00 frames. ], tot_loss[loss=3.116, NarTop10Accuracy=0.7026, over 5708.51 frames. ], batch size: 6, lr: 4.10e-03 2024-08-06 19:02:18,047 INFO [trainer.py:765] (0/8) Epoch 21, batch 800, train_loss[loss=2.956, NarTop10Accuracy=0.7246, over 5166.00 frames. ], tot_loss[loss=3.127, NarTop10Accuracy=0.7004, over 5771.30 frames. ], batch size: 6, lr: 4.10e-03 2024-08-06 19:02:48,663 INFO [trainer.py:765] (0/8) Epoch 21, batch 900, train_loss[loss=2.977, NarTop10Accuracy=0.7316, over 6678.00 frames. ], tot_loss[loss=3.123, NarTop10Accuracy=0.7008, over 5799.40 frames. ], batch size: 14, lr: 4.09e-03 2024-08-06 19:03:25,801 INFO [trainer.py:765] (0/8) Epoch 21, batch 1000, train_loss[loss=3.005, NarTop10Accuracy=0.7235, over 6312.00 frames. ], tot_loss[loss=3.13, NarTop10Accuracy=0.6995, over 5903.12 frames. ], batch size: 13, lr: 4.09e-03 2024-08-06 19:04:07,206 INFO [trainer.py:765] (0/8) Epoch 21, batch 1100, train_loss[loss=3.434, NarTop10Accuracy=0.6399, over 6723.00 frames. ], tot_loss[loss=3.145, NarTop10Accuracy=0.6962, over 5932.78 frames. ], batch size: 17, lr: 4.09e-03 2024-08-06 19:04:38,462 INFO [trainer.py:765] (0/8) Epoch 21, batch 1200, train_loss[loss=3.292, NarTop10Accuracy=0.6671, over 7023.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.6998, over 5937.17 frames. ], batch size: 31, lr: 4.08e-03 2024-08-06 19:05:15,316 INFO [trainer.py:765] (0/8) Epoch 21, batch 1300, train_loss[loss=2.962, NarTop10Accuracy=0.7338, over 5046.00 frames. ], tot_loss[loss=3.112, NarTop10Accuracy=0.7034, over 5998.40 frames. ], batch size: 6, lr: 4.08e-03 2024-08-06 19:05:55,559 INFO [trainer.py:765] (0/8) Epoch 21, batch 1400, train_loss[loss=3.536, NarTop10Accuracy=0.6178, over 6081.00 frames. ], tot_loss[loss=3.114, NarTop10Accuracy=0.7028, over 6025.23 frames. ], batch size: 11, lr: 4.07e-03 2024-08-06 19:06:23,599 INFO [trainer.py:765] (0/8) Epoch 21, batch 1500, train_loss[loss=3.36, NarTop10Accuracy=0.6572, over 5397.00 frames. ], tot_loss[loss=3.132, NarTop10Accuracy=0.6991, over 5953.59 frames. ], batch size: 50, lr: 4.07e-03 2024-08-06 19:06:51,461 INFO [trainer.py:765] (0/8) Epoch 21, batch 1600, train_loss[loss=2.929, NarTop10Accuracy=0.7384, over 7104.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.7001, over 5939.40 frames. ], batch size: 22, lr: 4.07e-03 2024-08-06 19:07:18,211 INFO [trainer.py:765] (0/8) Epoch 21, batch 1700, train_loss[loss=3.258, NarTop10Accuracy=0.6826, over 6306.00 frames. ], tot_loss[loss=3.136, NarTop10Accuracy=0.6984, over 5903.55 frames. ], batch size: 13, lr: 4.06e-03 2024-08-06 19:07:44,809 INFO [trainer.py:765] (0/8) Epoch 21, batch 1800, train_loss[loss=2.873, NarTop10Accuracy=0.7524, over 6984.00 frames. ], tot_loss[loss=3.137, NarTop10Accuracy=0.6984, over 5986.30 frames. ], batch size: 22, lr: 4.06e-03 2024-08-06 19:08:11,369 INFO [trainer.py:765] (0/8) Epoch 21, batch 1900, train_loss[loss=3.637, NarTop10Accuracy=0.5894, over 6534.00 frames. ], tot_loss[loss=3.149, NarTop10Accuracy=0.696, over 6035.72 frames. ], batch size: 50, lr: 4.06e-03 2024-08-06 19:08:37,105 INFO [trainer.py:765] (0/8) Epoch 21, batch 2000, train_loss[loss=3.503, NarTop10Accuracy=0.6335, over 6084.00 frames. ], tot_loss[loss=3.143, NarTop10Accuracy=0.6972, over 6000.16 frames. ], batch size: 50, lr: 4.05e-03 2024-08-06 19:09:02,507 INFO [trainer.py:765] (0/8) Epoch 21, batch 2100, train_loss[loss=2.897, NarTop10Accuracy=0.7413, over 4929.00 frames. ], tot_loss[loss=3.145, NarTop10Accuracy=0.6968, over 5971.51 frames. ], batch size: 5, lr: 4.05e-03 2024-08-06 19:09:27,891 INFO [trainer.py:765] (0/8) Epoch 21, batch 2200, train_loss[loss=3.032, NarTop10Accuracy=0.7198, over 7413.00 frames. ], tot_loss[loss=3.146, NarTop10Accuracy=0.6965, over 6007.56 frames. ], batch size: 31, lr: 4.04e-03 2024-08-06 19:09:53,222 INFO [trainer.py:765] (0/8) Epoch 21, batch 2300, train_loss[loss=3.212, NarTop10Accuracy=0.6867, over 5661.00 frames. ], tot_loss[loss=3.168, NarTop10Accuracy=0.692, over 6018.09 frames. ], batch size: 9, lr: 4.04e-03 2024-08-06 19:10:17,596 INFO [trainer.py:765] (0/8) Epoch 21, batch 2400, train_loss[loss=3.356, NarTop10Accuracy=0.6497, over 5169.00 frames. ], tot_loss[loss=3.141, NarTop10Accuracy=0.6976, over 5774.84 frames. ], batch size: 7, lr: 4.04e-03 2024-08-06 19:10:37,230 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 19:10:45,275 INFO [trainer.py:811] (0/8) Epoch 21, validation: loss=2.971, NarTop10Accuracy=0.7316, over 1905321.00 frames. 2024-08-06 19:10:45,276 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 19:10:45,741 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.703e+02 2.100e+02 2.242e+02 2.407e+02 6.546e+02, threshold=4.484e+02, percent-clipped=0.1 2024-08-06 19:10:49,272 INFO [trainer.py:765] (0/8) Epoch 21, batch 2500, train_loss[loss=3.321, NarTop10Accuracy=0.6576, over 5130.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7061, over 5480.44 frames. ], batch size: 7, lr: 4.03e-03 2024-08-06 19:11:09,071 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 19:11:09,073 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-21.pt 2024-08-06 19:12:09,054 INFO [trainer.py:765] (0/8) Epoch 22, batch 100, train_loss[loss=2.914, NarTop10Accuracy=0.7461, over 7416.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7088, over 2360.88 frames. ], batch size: 31, lr: 3.93e-03 2024-08-06 19:12:44,462 INFO [trainer.py:765] (0/8) Epoch 22, batch 200, train_loss[loss=3.218, NarTop10Accuracy=0.6775, over 6882.00 frames. ], tot_loss[loss=3.104, NarTop10Accuracy=0.705, over 3852.07 frames. ], batch size: 17, lr: 3.93e-03 2024-08-06 19:13:14,533 INFO [trainer.py:765] (0/8) Epoch 22, batch 300, train_loss[loss=2.869, NarTop10Accuracy=0.7556, over 6999.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7063, over 4668.34 frames. ], batch size: 22, lr: 3.93e-03 2024-08-06 19:13:49,229 INFO [trainer.py:765] (0/8) Epoch 22, batch 400, train_loss[loss=2.816, NarTop10Accuracy=0.7637, over 5064.00 frames. ], tot_loss[loss=3.088, NarTop10Accuracy=0.7081, over 5124.43 frames. ], batch size: 7, lr: 3.92e-03 2024-08-06 19:14:24,850 INFO [trainer.py:765] (0/8) Epoch 22, batch 500, train_loss[loss=3.19, NarTop10Accuracy=0.69, over 6030.00 frames. ], tot_loss[loss=3.091, NarTop10Accuracy=0.708, over 5395.45 frames. ], batch size: 11, lr: 3.92e-03 2024-08-06 19:14:55,702 INFO [trainer.py:765] (0/8) Epoch 22, batch 600, train_loss[loss=3.215, NarTop10Accuracy=0.6948, over 5730.00 frames. ], tot_loss[loss=3.121, NarTop10Accuracy=0.7013, over 5631.75 frames. ], batch size: 9, lr: 3.92e-03 2024-08-06 19:15:30,867 INFO [trainer.py:765] (0/8) Epoch 22, batch 700, train_loss[loss=3.479, NarTop10Accuracy=0.6266, over 4242.00 frames. ], tot_loss[loss=3.127, NarTop10Accuracy=0.7003, over 5706.08 frames. ], batch size: 5, lr: 3.91e-03 2024-08-06 19:16:10,664 INFO [trainer.py:765] (0/8) Epoch 22, batch 800, train_loss[loss=3.04, NarTop10Accuracy=0.7249, over 5076.00 frames. ], tot_loss[loss=3.122, NarTop10Accuracy=0.7016, over 5783.61 frames. ], batch size: 6, lr: 3.91e-03 2024-08-06 19:16:40,952 INFO [trainer.py:765] (0/8) Epoch 22, batch 900, train_loss[loss=2.925, NarTop10Accuracy=0.7333, over 6261.00 frames. ], tot_loss[loss=3.12, NarTop10Accuracy=0.7018, over 5806.92 frames. ], batch size: 13, lr: 3.90e-03 2024-08-06 19:17:16,434 INFO [trainer.py:765] (0/8) Epoch 22, batch 1000, train_loss[loss=3.049, NarTop10Accuracy=0.7137, over 6126.00 frames. ], tot_loss[loss=3.111, NarTop10Accuracy=0.7034, over 5909.67 frames. ], batch size: 13, lr: 3.90e-03 2024-08-06 19:17:52,085 INFO [trainer.py:765] (0/8) Epoch 22, batch 1100, train_loss[loss=3.045, NarTop10Accuracy=0.7118, over 7164.00 frames. ], tot_loss[loss=3.122, NarTop10Accuracy=0.7014, over 5929.03 frames. ], batch size: 18, lr: 3.90e-03 2024-08-06 19:18:25,926 INFO [trainer.py:765] (0/8) Epoch 22, batch 1200, train_loss[loss=2.991, NarTop10Accuracy=0.7268, over 7362.00 frames. ], tot_loss[loss=3.103, NarTop10Accuracy=0.7055, over 5946.35 frames. ], batch size: 31, lr: 3.89e-03 2024-08-06 19:19:01,253 INFO [trainer.py:765] (0/8) Epoch 22, batch 1300, train_loss[loss=2.806, NarTop10Accuracy=0.7583, over 5127.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7064, over 6018.84 frames. ], batch size: 6, lr: 3.89e-03 2024-08-06 19:19:33,317 INFO [trainer.py:765] (0/8) Epoch 22, batch 1400, train_loss[loss=2.761, NarTop10Accuracy=0.7796, over 5955.00 frames. ], tot_loss[loss=3.113, NarTop10Accuracy=0.7033, over 6032.79 frames. ], batch size: 11, lr: 3.89e-03 2024-08-06 19:20:03,830 INFO [trainer.py:765] (0/8) Epoch 22, batch 1500, train_loss[loss=3.495, NarTop10Accuracy=0.6285, over 6045.00 frames. ], tot_loss[loss=3.112, NarTop10Accuracy=0.7035, over 5962.59 frames. ], batch size: 53, lr: 3.88e-03 2024-08-06 19:20:31,647 INFO [trainer.py:765] (0/8) Epoch 22, batch 1600, train_loss[loss=3.15, NarTop10Accuracy=0.6979, over 6978.00 frames. ], tot_loss[loss=3.132, NarTop10Accuracy=0.6993, over 5939.92 frames. ], batch size: 22, lr: 3.88e-03 2024-08-06 19:20:58,418 INFO [trainer.py:765] (0/8) Epoch 22, batch 1700, train_loss[loss=3.131, NarTop10Accuracy=0.6916, over 6261.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.6996, over 5926.41 frames. ], batch size: 13, lr: 3.88e-03 2024-08-06 19:21:25,010 INFO [trainer.py:765] (0/8) Epoch 22, batch 1800, train_loss[loss=3.056, NarTop10Accuracy=0.7131, over 7137.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.6998, over 5993.25 frames. ], batch size: 22, lr: 3.87e-03 2024-08-06 19:21:51,372 INFO [trainer.py:765] (0/8) Epoch 22, batch 1900, train_loss[loss=3.057, NarTop10Accuracy=0.7168, over 6276.00 frames. ], tot_loss[loss=3.15, NarTop10Accuracy=0.6951, over 6038.33 frames. ], batch size: 52, lr: 3.87e-03 2024-08-06 19:21:53,110 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 19:22:01,088 INFO [trainer.py:811] (0/8) Epoch 22, validation: loss=3.009, NarTop10Accuracy=0.7241, over 1905321.00 frames. 2024-08-06 19:22:01,089 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 19:22:01,575 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 2.114e+02 2.276e+02 2.445e+02 4.438e+02, threshold=4.551e+02, percent-clipped=0.0 2024-08-06 19:22:24,818 INFO [trainer.py:765] (0/8) Epoch 22, batch 2000, train_loss[loss=3.566, NarTop10Accuracy=0.6135, over 6039.00 frames. ], tot_loss[loss=3.126, NarTop10Accuracy=0.7002, over 6006.04 frames. ], batch size: 50, lr: 3.87e-03 2024-08-06 19:22:50,040 INFO [trainer.py:765] (0/8) Epoch 22, batch 2100, train_loss[loss=3.426, NarTop10Accuracy=0.6339, over 4908.00 frames. ], tot_loss[loss=3.118, NarTop10Accuracy=0.7021, over 5985.07 frames. ], batch size: 5, lr: 3.86e-03 2024-08-06 19:23:15,229 INFO [trainer.py:765] (0/8) Epoch 22, batch 2200, train_loss[loss=3.091, NarTop10Accuracy=0.7141, over 7050.00 frames. ], tot_loss[loss=3.114, NarTop10Accuracy=0.7031, over 6018.52 frames. ], batch size: 31, lr: 3.86e-03 2024-08-06 19:23:40,314 INFO [trainer.py:765] (0/8) Epoch 22, batch 2300, train_loss[loss=3.222, NarTop10Accuracy=0.6792, over 5655.00 frames. ], tot_loss[loss=3.134, NarTop10Accuracy=0.6993, over 6046.07 frames. ], batch size: 9, lr: 3.86e-03 2024-08-06 19:24:04,601 INFO [trainer.py:765] (0/8) Epoch 22, batch 2400, train_loss[loss=3.011, NarTop10Accuracy=0.7226, over 5346.00 frames. ], tot_loss[loss=3.118, NarTop10Accuracy=0.7024, over 5790.91 frames. ], batch size: 7, lr: 3.85e-03 2024-08-06 19:24:28,024 INFO [trainer.py:765] (0/8) Epoch 22, batch 2500, train_loss[loss=3.162, NarTop10Accuracy=0.6975, over 5244.00 frames. ], tot_loss[loss=3.102, NarTop10Accuracy=0.705, over 5482.98 frames. ], batch size: 7, lr: 3.85e-03 2024-08-06 19:24:47,604 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 19:24:47,607 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-22.pt 2024-08-06 19:25:45,385 INFO [trainer.py:765] (0/8) Epoch 23, batch 100, train_loss[loss=3.048, NarTop10Accuracy=0.7266, over 6879.00 frames. ], tot_loss[loss=3.118, NarTop10Accuracy=0.7015, over 2367.69 frames. ], batch size: 31, lr: 3.76e-03 2024-08-06 19:26:21,309 INFO [trainer.py:765] (0/8) Epoch 23, batch 200, train_loss[loss=3.457, NarTop10Accuracy=0.6366, over 6573.00 frames. ], tot_loss[loss=3.129, NarTop10Accuracy=0.6997, over 3855.86 frames. ], batch size: 17, lr: 3.76e-03 2024-08-06 19:26:57,603 INFO [trainer.py:765] (0/8) Epoch 23, batch 300, train_loss[loss=2.893, NarTop10Accuracy=0.7498, over 7248.00 frames. ], tot_loss[loss=3.104, NarTop10Accuracy=0.7051, over 4650.89 frames. ], batch size: 22, lr: 3.75e-03 2024-08-06 19:27:26,540 INFO [trainer.py:765] (0/8) Epoch 23, batch 400, train_loss[loss=3.173, NarTop10Accuracy=0.6811, over 5037.00 frames. ], tot_loss[loss=3.113, NarTop10Accuracy=0.703, over 5091.99 frames. ], batch size: 7, lr: 3.75e-03 2024-08-06 19:27:59,713 INFO [trainer.py:765] (0/8) Epoch 23, batch 500, train_loss[loss=3.371, NarTop10Accuracy=0.6512, over 6243.00 frames. ], tot_loss[loss=3.117, NarTop10Accuracy=0.7017, over 5365.62 frames. ], batch size: 11, lr: 3.75e-03 2024-08-06 19:28:35,883 INFO [trainer.py:765] (0/8) Epoch 23, batch 600, train_loss[loss=3.262, NarTop10Accuracy=0.6714, over 5706.00 frames. ], tot_loss[loss=3.109, NarTop10Accuracy=0.7037, over 5640.50 frames. ], batch size: 9, lr: 3.74e-03 2024-08-06 19:29:11,367 INFO [trainer.py:765] (0/8) Epoch 23, batch 700, train_loss[loss=3.101, NarTop10Accuracy=0.7063, over 5130.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7063, over 5725.55 frames. ], batch size: 6, lr: 3.74e-03 2024-08-06 19:29:43,613 INFO [trainer.py:765] (0/8) Epoch 23, batch 800, train_loss[loss=2.964, NarTop10Accuracy=0.7444, over 4314.00 frames. ], tot_loss[loss=3.104, NarTop10Accuracy=0.7049, over 5777.89 frames. ], batch size: 5, lr: 3.74e-03 2024-08-06 19:30:19,390 INFO [trainer.py:765] (0/8) Epoch 23, batch 900, train_loss[loss=3.261, NarTop10Accuracy=0.6655, over 6564.00 frames. ], tot_loss[loss=3.093, NarTop10Accuracy=0.7071, over 5803.39 frames. ], batch size: 14, lr: 3.73e-03 2024-08-06 19:30:58,195 INFO [trainer.py:765] (0/8) Epoch 23, batch 1000, train_loss[loss=2.97, NarTop10Accuracy=0.7348, over 6207.00 frames. ], tot_loss[loss=3.093, NarTop10Accuracy=0.7075, over 5893.95 frames. ], batch size: 13, lr: 3.73e-03 2024-08-06 19:31:31,521 INFO [trainer.py:765] (0/8) Epoch 23, batch 1100, train_loss[loss=3.053, NarTop10Accuracy=0.7173, over 6588.00 frames. ], tot_loss[loss=3.096, NarTop10Accuracy=0.7069, over 5923.62 frames. ], batch size: 17, lr: 3.73e-03 2024-08-06 19:32:08,518 INFO [trainer.py:765] (0/8) Epoch 23, batch 1200, train_loss[loss=3.019, NarTop10Accuracy=0.724, over 7374.00 frames. ], tot_loss[loss=3.103, NarTop10Accuracy=0.705, over 5919.75 frames. ], batch size: 31, lr: 3.72e-03 2024-08-06 19:32:46,937 INFO [trainer.py:765] (0/8) Epoch 23, batch 1300, train_loss[loss=3.162, NarTop10Accuracy=0.6922, over 5073.00 frames. ], tot_loss[loss=3.103, NarTop10Accuracy=0.7045, over 5992.69 frames. ], batch size: 6, lr: 3.72e-03 2024-08-06 19:32:56,402 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 19:33:04,722 INFO [trainer.py:811] (0/8) Epoch 23, validation: loss=2.893, NarTop10Accuracy=0.7468, over 1905321.00 frames. 2024-08-06 19:33:04,723 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 19:33:05,263 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.108e+02 2.273e+02 2.457e+02 3.966e+02, threshold=4.546e+02, percent-clipped=0.0 2024-08-06 19:33:27,410 INFO [trainer.py:765] (0/8) Epoch 23, batch 1400, train_loss[loss=2.872, NarTop10Accuracy=0.7613, over 6153.00 frames. ], tot_loss[loss=3.109, NarTop10Accuracy=0.7039, over 6022.29 frames. ], batch size: 11, lr: 3.72e-03 2024-08-06 19:33:58,216 INFO [trainer.py:765] (0/8) Epoch 23, batch 1500, train_loss[loss=3.338, NarTop10Accuracy=0.657, over 6066.00 frames. ], tot_loss[loss=3.099, NarTop10Accuracy=0.7058, over 5942.79 frames. ], batch size: 51, lr: 3.71e-03 2024-08-06 19:34:26,015 INFO [trainer.py:765] (0/8) Epoch 23, batch 1600, train_loss[loss=2.802, NarTop10Accuracy=0.7616, over 6864.00 frames. ], tot_loss[loss=3.101, NarTop10Accuracy=0.7052, over 5938.90 frames. ], batch size: 22, lr: 3.71e-03 2024-08-06 19:34:52,783 INFO [trainer.py:765] (0/8) Epoch 23, batch 1700, train_loss[loss=3.398, NarTop10Accuracy=0.6497, over 6114.00 frames. ], tot_loss[loss=3.125, NarTop10Accuracy=0.7003, over 5905.78 frames. ], batch size: 13, lr: 3.71e-03 2024-08-06 19:35:19,262 INFO [trainer.py:765] (0/8) Epoch 23, batch 1800, train_loss[loss=3.007, NarTop10Accuracy=0.7272, over 7020.00 frames. ], tot_loss[loss=3.114, NarTop10Accuracy=0.7026, over 5965.70 frames. ], batch size: 22, lr: 3.70e-03 2024-08-06 19:35:45,596 INFO [trainer.py:765] (0/8) Epoch 23, batch 1900, train_loss[loss=3.386, NarTop10Accuracy=0.6504, over 6378.00 frames. ], tot_loss[loss=3.129, NarTop10Accuracy=0.6996, over 6017.65 frames. ], batch size: 50, lr: 3.70e-03 2024-08-06 19:36:11,171 INFO [trainer.py:765] (0/8) Epoch 23, batch 2000, train_loss[loss=3.577, NarTop10Accuracy=0.6125, over 5976.00 frames. ], tot_loss[loss=3.114, NarTop10Accuracy=0.703, over 5996.30 frames. ], batch size: 51, lr: 3.70e-03 2024-08-06 19:36:36,518 INFO [trainer.py:765] (0/8) Epoch 23, batch 2100, train_loss[loss=3.465, NarTop10Accuracy=0.622, over 4773.00 frames. ], tot_loss[loss=3.119, NarTop10Accuracy=0.702, over 5968.14 frames. ], batch size: 5, lr: 3.69e-03 2024-08-06 19:37:01,909 INFO [trainer.py:765] (0/8) Epoch 23, batch 2200, train_loss[loss=3.109, NarTop10Accuracy=0.701, over 6960.00 frames. ], tot_loss[loss=3.127, NarTop10Accuracy=0.7002, over 5990.37 frames. ], batch size: 31, lr: 3.69e-03 2024-08-06 19:37:27,061 INFO [trainer.py:765] (0/8) Epoch 23, batch 2300, train_loss[loss=2.809, NarTop10Accuracy=0.7649, over 5742.00 frames. ], tot_loss[loss=3.125, NarTop10Accuracy=0.7009, over 6012.09 frames. ], batch size: 9, lr: 3.69e-03 2024-08-06 19:37:51,424 INFO [trainer.py:765] (0/8) Epoch 23, batch 2400, train_loss[loss=3.113, NarTop10Accuracy=0.7019, over 5127.00 frames. ], tot_loss[loss=3.118, NarTop10Accuracy=0.7017, over 5768.10 frames. ], batch size: 7, lr: 3.69e-03 2024-08-06 19:38:15,053 INFO [trainer.py:765] (0/8) Epoch 23, batch 2500, train_loss[loss=3.322, NarTop10Accuracy=0.6559, over 5184.00 frames. ], tot_loss[loss=3.093, NarTop10Accuracy=0.7065, over 5482.83 frames. ], batch size: 7, lr: 3.68e-03 2024-08-06 19:38:35,103 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 19:38:35,106 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-23.pt 2024-08-06 19:39:37,632 INFO [trainer.py:765] (0/8) Epoch 24, batch 100, train_loss[loss=3.448, NarTop10Accuracy=0.6339, over 7455.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.7007, over 2360.94 frames. ], batch size: 31, lr: 3.60e-03 2024-08-06 19:40:10,190 INFO [trainer.py:765] (0/8) Epoch 24, batch 200, train_loss[loss=2.769, NarTop10Accuracy=0.7747, over 6624.00 frames. ], tot_loss[loss=3.097, NarTop10Accuracy=0.7066, over 3857.77 frames. ], batch size: 17, lr: 3.60e-03 2024-08-06 19:40:40,556 INFO [trainer.py:765] (0/8) Epoch 24, batch 300, train_loss[loss=2.79, NarTop10Accuracy=0.765, over 6885.00 frames. ], tot_loss[loss=3.092, NarTop10Accuracy=0.7075, over 4665.05 frames. ], batch size: 22, lr: 3.59e-03 2024-08-06 19:41:18,234 INFO [trainer.py:765] (0/8) Epoch 24, batch 400, train_loss[loss=2.934, NarTop10Accuracy=0.734, over 5631.00 frames. ], tot_loss[loss=3.094, NarTop10Accuracy=0.7065, over 5124.13 frames. ], batch size: 8, lr: 3.59e-03 2024-08-06 19:41:50,322 INFO [trainer.py:765] (0/8) Epoch 24, batch 500, train_loss[loss=2.99, NarTop10Accuracy=0.7414, over 5985.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7087, over 5382.61 frames. ], batch size: 11, lr: 3.59e-03 2024-08-06 19:42:21,452 INFO [trainer.py:765] (0/8) Epoch 24, batch 600, train_loss[loss=2.799, NarTop10Accuracy=0.7634, over 5763.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7076, over 5647.36 frames. ], batch size: 9, lr: 3.58e-03 2024-08-06 19:42:52,843 INFO [trainer.py:765] (0/8) Epoch 24, batch 700, train_loss[loss=2.8, NarTop10Accuracy=0.7627, over 5151.00 frames. ], tot_loss[loss=3.092, NarTop10Accuracy=0.7074, over 5718.92 frames. ], batch size: 6, lr: 3.58e-03 2024-08-06 19:43:17,381 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 19:43:25,410 INFO [trainer.py:811] (0/8) Epoch 24, validation: loss=3.021, NarTop10Accuracy=0.7204, over 1905321.00 frames. 2024-08-06 19:43:25,411 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 19:43:28,561 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 2.113e+02 2.282e+02 2.472e+02 2.357e+03, threshold=4.564e+02, percent-clipped=0.2 2024-08-06 19:43:40,815 INFO [trainer.py:765] (0/8) Epoch 24, batch 800, train_loss[loss=2.789, NarTop10Accuracy=0.7694, over 5118.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7083, over 5780.60 frames. ], batch size: 6, lr: 3.58e-03 2024-08-06 19:44:11,410 INFO [trainer.py:765] (0/8) Epoch 24, batch 900, train_loss[loss=2.803, NarTop10Accuracy=0.7739, over 6261.00 frames. ], tot_loss[loss=3.086, NarTop10Accuracy=0.7088, over 5781.55 frames. ], batch size: 13, lr: 3.57e-03 2024-08-06 19:44:47,490 INFO [trainer.py:765] (0/8) Epoch 24, batch 1000, train_loss[loss=3.141, NarTop10Accuracy=0.6996, over 6162.00 frames. ], tot_loss[loss=3.099, NarTop10Accuracy=0.7062, over 5892.96 frames. ], batch size: 13, lr: 3.57e-03 2024-08-06 19:45:27,108 INFO [trainer.py:765] (0/8) Epoch 24, batch 1100, train_loss[loss=3.3, NarTop10Accuracy=0.6566, over 6900.00 frames. ], tot_loss[loss=3.113, NarTop10Accuracy=0.7031, over 5933.46 frames. ], batch size: 17, lr: 3.57e-03 2024-08-06 19:45:58,438 INFO [trainer.py:765] (0/8) Epoch 24, batch 1200, train_loss[loss=3.095, NarTop10Accuracy=0.7116, over 7158.00 frames. ], tot_loss[loss=3.11, NarTop10Accuracy=0.7038, over 5923.83 frames. ], batch size: 31, lr: 3.57e-03 2024-08-06 19:46:30,295 INFO [trainer.py:765] (0/8) Epoch 24, batch 1300, train_loss[loss=3.184, NarTop10Accuracy=0.6852, over 5133.00 frames. ], tot_loss[loss=3.102, NarTop10Accuracy=0.7052, over 5991.82 frames. ], batch size: 6, lr: 3.56e-03 2024-08-06 19:47:07,860 INFO [trainer.py:765] (0/8) Epoch 24, batch 1400, train_loss[loss=3.15, NarTop10Accuracy=0.6961, over 5997.00 frames. ], tot_loss[loss=3.113, NarTop10Accuracy=0.7029, over 6009.56 frames. ], batch size: 11, lr: 3.56e-03 2024-08-06 19:47:40,958 INFO [trainer.py:765] (0/8) Epoch 24, batch 1500, train_loss[loss=3.417, NarTop10Accuracy=0.6465, over 6282.00 frames. ], tot_loss[loss=3.124, NarTop10Accuracy=0.7004, over 5961.78 frames. ], batch size: 50, lr: 3.56e-03 2024-08-06 19:48:08,676 INFO [trainer.py:765] (0/8) Epoch 24, batch 1600, train_loss[loss=3.439, NarTop10Accuracy=0.6335, over 7260.00 frames. ], tot_loss[loss=3.128, NarTop10Accuracy=0.6996, over 5938.73 frames. ], batch size: 22, lr: 3.55e-03 2024-08-06 19:48:35,267 INFO [trainer.py:765] (0/8) Epoch 24, batch 1700, train_loss[loss=2.786, NarTop10Accuracy=0.7659, over 6204.00 frames. ], tot_loss[loss=3.123, NarTop10Accuracy=0.7004, over 5922.16 frames. ], batch size: 13, lr: 3.55e-03 2024-08-06 19:49:01,638 INFO [trainer.py:765] (0/8) Epoch 24, batch 1800, train_loss[loss=2.863, NarTop10Accuracy=0.7494, over 7203.00 frames. ], tot_loss[loss=3.129, NarTop10Accuracy=0.6995, over 5965.89 frames. ], batch size: 22, lr: 3.55e-03 2024-08-06 19:49:28,042 INFO [trainer.py:765] (0/8) Epoch 24, batch 1900, train_loss[loss=3.463, NarTop10Accuracy=0.6291, over 6162.00 frames. ], tot_loss[loss=3.139, NarTop10Accuracy=0.6975, over 6012.45 frames. ], batch size: 50, lr: 3.55e-03 2024-08-06 19:49:53,534 INFO [trainer.py:765] (0/8) Epoch 24, batch 2000, train_loss[loss=3.595, NarTop10Accuracy=0.6035, over 5871.00 frames. ], tot_loss[loss=3.112, NarTop10Accuracy=0.7027, over 5992.16 frames. ], batch size: 50, lr: 3.54e-03 2024-08-06 19:50:18,821 INFO [trainer.py:765] (0/8) Epoch 24, batch 2100, train_loss[loss=2.704, NarTop10Accuracy=0.7788, over 4038.00 frames. ], tot_loss[loss=3.115, NarTop10Accuracy=0.7027, over 5965.35 frames. ], batch size: 4, lr: 3.54e-03 2024-08-06 19:50:43,942 INFO [trainer.py:765] (0/8) Epoch 24, batch 2200, train_loss[loss=3.457, NarTop10Accuracy=0.6208, over 7101.00 frames. ], tot_loss[loss=3.111, NarTop10Accuracy=0.703, over 6004.67 frames. ], batch size: 31, lr: 3.54e-03 2024-08-06 19:51:09,024 INFO [trainer.py:765] (0/8) Epoch 24, batch 2300, train_loss[loss=2.843, NarTop10Accuracy=0.7489, over 5772.00 frames. ], tot_loss[loss=3.107, NarTop10Accuracy=0.7039, over 6024.40 frames. ], batch size: 9, lr: 3.53e-03 2024-08-06 19:51:33,349 INFO [trainer.py:765] (0/8) Epoch 24, batch 2400, train_loss[loss=3.022, NarTop10Accuracy=0.7153, over 5232.00 frames. ], tot_loss[loss=3.096, NarTop10Accuracy=0.7064, over 5784.55 frames. ], batch size: 7, lr: 3.53e-03 2024-08-06 19:51:56,783 INFO [trainer.py:765] (0/8) Epoch 24, batch 2500, train_loss[loss=2.93, NarTop10Accuracy=0.7388, over 5136.00 frames. ], tot_loss[loss=3.076, NarTop10Accuracy=0.7104, over 5476.33 frames. ], batch size: 7, lr: 3.53e-03 2024-08-06 19:52:16,751 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 19:52:16,754 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-24.pt 2024-08-06 19:53:22,198 INFO [trainer.py:765] (0/8) Epoch 25, batch 100, train_loss[loss=3.356, NarTop10Accuracy=0.6517, over 7527.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.71, over 2358.94 frames. ], batch size: 31, lr: 3.45e-03 2024-08-06 19:53:47,263 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 19:53:55,329 INFO [trainer.py:811] (0/8) Epoch 25, validation: loss=2.96, NarTop10Accuracy=0.7332, over 1905321.00 frames. 2024-08-06 19:53:55,330 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 19:53:55,916 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.155e+02 2.306e+02 2.475e+02 6.485e+02, threshold=4.611e+02, percent-clipped=0.1 2024-08-06 19:54:01,177 INFO [trainer.py:765] (0/8) Epoch 25, batch 200, train_loss[loss=2.885, NarTop10Accuracy=0.7498, over 6792.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7081, over 3844.08 frames. ], batch size: 17, lr: 3.45e-03 2024-08-06 19:54:35,648 INFO [trainer.py:765] (0/8) Epoch 25, batch 300, train_loss[loss=3.178, NarTop10Accuracy=0.6943, over 7080.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.7093, over 4663.88 frames. ], batch size: 22, lr: 3.45e-03 2024-08-06 19:55:12,958 INFO [trainer.py:765] (0/8) Epoch 25, batch 400, train_loss[loss=3.068, NarTop10Accuracy=0.7084, over 5220.00 frames. ], tot_loss[loss=3.087, NarTop10Accuracy=0.7083, over 5103.21 frames. ], batch size: 7, lr: 3.44e-03 2024-08-06 19:55:43,739 INFO [trainer.py:765] (0/8) Epoch 25, batch 500, train_loss[loss=2.868, NarTop10Accuracy=0.7602, over 6114.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7094, over 5363.83 frames. ], batch size: 11, lr: 3.44e-03 2024-08-06 19:56:14,815 INFO [trainer.py:765] (0/8) Epoch 25, batch 600, train_loss[loss=2.734, NarTop10Accuracy=0.7768, over 5670.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.709, over 5652.15 frames. ], batch size: 9, lr: 3.44e-03 2024-08-06 19:56:55,497 INFO [trainer.py:765] (0/8) Epoch 25, batch 700, train_loss[loss=2.625, NarTop10Accuracy=0.7956, over 4926.00 frames. ], tot_loss[loss=3.078, NarTop10Accuracy=0.71, over 5707.99 frames. ], batch size: 6, lr: 3.43e-03 2024-08-06 19:57:30,137 INFO [trainer.py:765] (0/8) Epoch 25, batch 800, train_loss[loss=2.9, NarTop10Accuracy=0.7408, over 4293.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7091, over 5762.51 frames. ], batch size: 5, lr: 3.43e-03 2024-08-06 19:58:00,679 INFO [trainer.py:765] (0/8) Epoch 25, batch 900, train_loss[loss=3.145, NarTop10Accuracy=0.6957, over 6321.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7095, over 5787.48 frames. ], batch size: 13, lr: 3.43e-03 2024-08-06 19:58:37,640 INFO [trainer.py:765] (0/8) Epoch 25, batch 1000, train_loss[loss=2.849, NarTop10Accuracy=0.7637, over 6396.00 frames. ], tot_loss[loss=3.092, NarTop10Accuracy=0.7069, over 5884.36 frames. ], batch size: 13, lr: 3.43e-03 2024-08-06 19:59:14,856 INFO [trainer.py:765] (0/8) Epoch 25, batch 1100, train_loss[loss=3.329, NarTop10Accuracy=0.6469, over 6840.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7061, over 5929.45 frames. ], batch size: 17, lr: 3.42e-03 2024-08-06 19:59:49,040 INFO [trainer.py:765] (0/8) Epoch 25, batch 1200, train_loss[loss=3.389, NarTop10Accuracy=0.6459, over 7326.00 frames. ], tot_loss[loss=3.095, NarTop10Accuracy=0.7064, over 5936.07 frames. ], batch size: 31, lr: 3.42e-03 2024-08-06 20:00:25,599 INFO [trainer.py:765] (0/8) Epoch 25, batch 1300, train_loss[loss=3.032, NarTop10Accuracy=0.724, over 5049.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7076, over 5993.29 frames. ], batch size: 6, lr: 3.42e-03 2024-08-06 20:01:02,016 INFO [trainer.py:765] (0/8) Epoch 25, batch 1400, train_loss[loss=2.936, NarTop10Accuracy=0.744, over 5979.00 frames. ], tot_loss[loss=3.087, NarTop10Accuracy=0.7084, over 6018.26 frames. ], batch size: 11, lr: 3.42e-03 2024-08-06 20:01:32,823 INFO [trainer.py:765] (0/8) Epoch 25, batch 1500, train_loss[loss=3.187, NarTop10Accuracy=0.6888, over 6459.00 frames. ], tot_loss[loss=3.093, NarTop10Accuracy=0.7074, over 5954.16 frames. ], batch size: 50, lr: 3.41e-03 2024-08-06 20:02:00,625 INFO [trainer.py:765] (0/8) Epoch 25, batch 1600, train_loss[loss=2.931, NarTop10Accuracy=0.7378, over 7110.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7091, over 5944.66 frames. ], batch size: 22, lr: 3.41e-03 2024-08-06 20:02:27,360 INFO [trainer.py:765] (0/8) Epoch 25, batch 1700, train_loss[loss=2.92, NarTop10Accuracy=0.7405, over 6240.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7094, over 5941.12 frames. ], batch size: 13, lr: 3.41e-03 2024-08-06 20:02:53,854 INFO [trainer.py:765] (0/8) Epoch 25, batch 1800, train_loss[loss=3.333, NarTop10Accuracy=0.6622, over 7188.00 frames. ], tot_loss[loss=3.092, NarTop10Accuracy=0.7069, over 5988.11 frames. ], batch size: 22, lr: 3.40e-03 2024-08-06 20:03:20,341 INFO [trainer.py:765] (0/8) Epoch 25, batch 1900, train_loss[loss=3.264, NarTop10Accuracy=0.6804, over 6246.00 frames. ], tot_loss[loss=3.105, NarTop10Accuracy=0.7049, over 6034.88 frames. ], batch size: 51, lr: 3.40e-03 2024-08-06 20:03:45,934 INFO [trainer.py:765] (0/8) Epoch 25, batch 2000, train_loss[loss=3.515, NarTop10Accuracy=0.6243, over 6087.00 frames. ], tot_loss[loss=3.12, NarTop10Accuracy=0.7016, over 5992.65 frames. ], batch size: 50, lr: 3.40e-03 2024-08-06 20:04:11,246 INFO [trainer.py:765] (0/8) Epoch 25, batch 2100, train_loss[loss=2.727, NarTop10Accuracy=0.771, over 4005.00 frames. ], tot_loss[loss=3.105, NarTop10Accuracy=0.7046, over 5976.45 frames. ], batch size: 4, lr: 3.40e-03 2024-08-06 20:04:31,410 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 20:04:39,343 INFO [trainer.py:811] (0/8) Epoch 25, validation: loss=2.999, NarTop10Accuracy=0.7251, over 1905321.00 frames. 2024-08-06 20:04:39,344 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 20:04:39,840 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.755e+02 2.185e+02 2.339e+02 2.507e+02 3.640e+02, threshold=4.678e+02, percent-clipped=0.0 2024-08-06 20:04:44,513 INFO [trainer.py:765] (0/8) Epoch 25, batch 2200, train_loss[loss=3.188, NarTop10Accuracy=0.687, over 7182.00 frames. ], tot_loss[loss=3.113, NarTop10Accuracy=0.7028, over 6020.18 frames. ], batch size: 31, lr: 3.39e-03 2024-08-06 20:05:09,645 INFO [trainer.py:765] (0/8) Epoch 25, batch 2300, train_loss[loss=2.921, NarTop10Accuracy=0.7395, over 5835.00 frames. ], tot_loss[loss=3.117, NarTop10Accuracy=0.702, over 6030.89 frames. ], batch size: 9, lr: 3.39e-03 2024-08-06 20:05:34,141 INFO [trainer.py:765] (0/8) Epoch 25, batch 2400, train_loss[loss=2.808, NarTop10Accuracy=0.7686, over 5226.00 frames. ], tot_loss[loss=3.097, NarTop10Accuracy=0.7062, over 5796.32 frames. ], batch size: 7, lr: 3.39e-03 2024-08-06 20:05:57,846 INFO [trainer.py:765] (0/8) Epoch 25, batch 2500, train_loss[loss=2.764, NarTop10Accuracy=0.7656, over 5013.00 frames. ], tot_loss[loss=3.063, NarTop10Accuracy=0.7125, over 5507.11 frames. ], batch size: 7, lr: 3.39e-03 2024-08-06 20:06:17,995 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 20:06:17,997 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-25.pt 2024-08-06 20:07:19,305 INFO [trainer.py:765] (0/8) Epoch 26, batch 100, train_loss[loss=3.051, NarTop10Accuracy=0.712, over 7386.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.7092, over 2346.51 frames. ], batch size: 32, lr: 3.32e-03 2024-08-06 20:07:52,383 INFO [trainer.py:765] (0/8) Epoch 26, batch 200, train_loss[loss=2.878, NarTop10Accuracy=0.7567, over 6597.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7092, over 3849.10 frames. ], batch size: 17, lr: 3.31e-03 2024-08-06 20:08:24,734 INFO [trainer.py:765] (0/8) Epoch 26, batch 300, train_loss[loss=2.927, NarTop10Accuracy=0.7405, over 7068.00 frames. ], tot_loss[loss=3.085, NarTop10Accuracy=0.7088, over 4644.56 frames. ], batch size: 22, lr: 3.31e-03 2024-08-06 20:08:58,185 INFO [trainer.py:765] (0/8) Epoch 26, batch 400, train_loss[loss=2.895, NarTop10Accuracy=0.7408, over 5145.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.7091, over 5090.50 frames. ], batch size: 7, lr: 3.31e-03 2024-08-06 20:09:33,148 INFO [trainer.py:765] (0/8) Epoch 26, batch 500, train_loss[loss=2.908, NarTop10Accuracy=0.7452, over 6084.00 frames. ], tot_loss[loss=3.089, NarTop10Accuracy=0.7079, over 5382.31 frames. ], batch size: 11, lr: 3.30e-03 2024-08-06 20:10:03,891 INFO [trainer.py:765] (0/8) Epoch 26, batch 600, train_loss[loss=2.67, NarTop10Accuracy=0.7919, over 5622.00 frames. ], tot_loss[loss=3.068, NarTop10Accuracy=0.7125, over 5666.54 frames. ], batch size: 9, lr: 3.30e-03 2024-08-06 20:10:39,873 INFO [trainer.py:765] (0/8) Epoch 26, batch 700, train_loss[loss=3.254, NarTop10Accuracy=0.6656, over 4956.00 frames. ], tot_loss[loss=3.089, NarTop10Accuracy=0.7081, over 5719.12 frames. ], batch size: 6, lr: 3.30e-03 2024-08-06 20:11:19,061 INFO [trainer.py:765] (0/8) Epoch 26, batch 800, train_loss[loss=2.916, NarTop10Accuracy=0.7403, over 5151.00 frames. ], tot_loss[loss=3.086, NarTop10Accuracy=0.7087, over 5775.64 frames. ], batch size: 6, lr: 3.30e-03 2024-08-06 20:11:49,316 INFO [trainer.py:765] (0/8) Epoch 26, batch 900, train_loss[loss=2.858, NarTop10Accuracy=0.7559, over 6219.00 frames. ], tot_loss[loss=3.086, NarTop10Accuracy=0.7088, over 5796.02 frames. ], batch size: 13, lr: 3.29e-03 2024-08-06 20:12:25,974 INFO [trainer.py:765] (0/8) Epoch 26, batch 1000, train_loss[loss=2.843, NarTop10Accuracy=0.7492, over 6114.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7076, over 5887.13 frames. ], batch size: 13, lr: 3.29e-03 2024-08-06 20:13:06,377 INFO [trainer.py:765] (0/8) Epoch 26, batch 1100, train_loss[loss=3.213, NarTop10Accuracy=0.6776, over 6795.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7059, over 5932.37 frames. ], batch size: 17, lr: 3.29e-03 2024-08-06 20:13:37,536 INFO [trainer.py:765] (0/8) Epoch 26, batch 1200, train_loss[loss=3.389, NarTop10Accuracy=0.6397, over 6942.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7087, over 5913.20 frames. ], batch size: 31, lr: 3.29e-03 2024-08-06 20:14:13,696 INFO [trainer.py:765] (0/8) Epoch 26, batch 1300, train_loss[loss=2.818, NarTop10Accuracy=0.7712, over 5070.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7093, over 5992.31 frames. ], batch size: 6, lr: 3.28e-03 2024-08-06 20:14:50,538 INFO [trainer.py:765] (0/8) Epoch 26, batch 1400, train_loss[loss=2.849, NarTop10Accuracy=0.7567, over 6018.00 frames. ], tot_loss[loss=3.085, NarTop10Accuracy=0.7088, over 5999.88 frames. ], batch size: 11, lr: 3.28e-03 2024-08-06 20:15:21,156 INFO [trainer.py:765] (0/8) Epoch 26, batch 1500, train_loss[loss=3.165, NarTop10Accuracy=0.6924, over 6462.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.7087, over 5957.13 frames. ], batch size: 50, lr: 3.28e-03 2024-08-06 20:15:48,980 INFO [trainer.py:765] (0/8) Epoch 26, batch 1600, train_loss[loss=2.88, NarTop10Accuracy=0.7425, over 7185.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7093, over 5947.54 frames. ], batch size: 22, lr: 3.28e-03 2024-08-06 20:15:50,002 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 20:15:58,239 INFO [trainer.py:811] (0/8) Epoch 26, validation: loss=2.899, NarTop10Accuracy=0.7457, over 1905321.00 frames. 2024-08-06 20:15:58,239 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 20:15:58,779 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.166e+02 2.322e+02 2.511e+02 3.952e+02, threshold=4.644e+02, percent-clipped=0.0 2024-08-06 20:16:23,953 INFO [trainer.py:765] (0/8) Epoch 26, batch 1700, train_loss[loss=3.269, NarTop10Accuracy=0.6728, over 6201.00 frames. ], tot_loss[loss=3.073, NarTop10Accuracy=0.7112, over 5905.74 frames. ], batch size: 13, lr: 3.28e-03 2024-08-06 20:16:50,427 INFO [trainer.py:765] (0/8) Epoch 26, batch 1800, train_loss[loss=2.815, NarTop10Accuracy=0.7671, over 7095.00 frames. ], tot_loss[loss=3.072, NarTop10Accuracy=0.7111, over 5982.70 frames. ], batch size: 22, lr: 3.27e-03 2024-08-06 20:17:16,841 INFO [trainer.py:765] (0/8) Epoch 26, batch 1900, train_loss[loss=3.083, NarTop10Accuracy=0.7169, over 5655.00 frames. ], tot_loss[loss=3.089, NarTop10Accuracy=0.7077, over 6007.31 frames. ], batch size: 50, lr: 3.27e-03 2024-08-06 20:17:42,380 INFO [trainer.py:765] (0/8) Epoch 26, batch 2000, train_loss[loss=3.593, NarTop10Accuracy=0.6028, over 6330.00 frames. ], tot_loss[loss=3.088, NarTop10Accuracy=0.7078, over 5983.26 frames. ], batch size: 50, lr: 3.27e-03 2024-08-06 20:18:07,564 INFO [trainer.py:765] (0/8) Epoch 26, batch 2100, train_loss[loss=2.971, NarTop10Accuracy=0.7258, over 4929.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7059, over 5958.33 frames. ], batch size: 5, lr: 3.27e-03 2024-08-06 20:18:32,778 INFO [trainer.py:765] (0/8) Epoch 26, batch 2200, train_loss[loss=2.917, NarTop10Accuracy=0.7514, over 7293.00 frames. ], tot_loss[loss=3.091, NarTop10Accuracy=0.7071, over 6015.88 frames. ], batch size: 31, lr: 3.26e-03 2024-08-06 20:18:57,898 INFO [trainer.py:765] (0/8) Epoch 26, batch 2300, train_loss[loss=3.095, NarTop10Accuracy=0.707, over 5724.00 frames. ], tot_loss[loss=3.097, NarTop10Accuracy=0.7061, over 6037.89 frames. ], batch size: 9, lr: 3.26e-03 2024-08-06 20:19:22,206 INFO [trainer.py:765] (0/8) Epoch 26, batch 2400, train_loss[loss=2.854, NarTop10Accuracy=0.7535, over 5076.00 frames. ], tot_loss[loss=3.075, NarTop10Accuracy=0.7102, over 5795.61 frames. ], batch size: 7, lr: 3.26e-03 2024-08-06 20:19:45,652 INFO [trainer.py:765] (0/8) Epoch 26, batch 2500, train_loss[loss=2.817, NarTop10Accuracy=0.7665, over 5238.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7147, over 5483.06 frames. ], batch size: 7, lr: 3.26e-03 2024-08-06 20:20:05,616 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 20:20:05,618 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-26.pt 2024-08-06 20:21:04,874 INFO [trainer.py:765] (0/8) Epoch 27, batch 100, train_loss[loss=3.253, NarTop10Accuracy=0.6781, over 7272.00 frames. ], tot_loss[loss=3.078, NarTop10Accuracy=0.7101, over 2363.40 frames. ], batch size: 31, lr: 3.19e-03 2024-08-06 20:21:39,783 INFO [trainer.py:765] (0/8) Epoch 27, batch 200, train_loss[loss=2.862, NarTop10Accuracy=0.7523, over 6744.00 frames. ], tot_loss[loss=3.077, NarTop10Accuracy=0.7101, over 3848.36 frames. ], batch size: 17, lr: 3.19e-03 2024-08-06 20:22:13,049 INFO [trainer.py:765] (0/8) Epoch 27, batch 300, train_loss[loss=2.928, NarTop10Accuracy=0.7475, over 7116.00 frames. ], tot_loss[loss=3.079, NarTop10Accuracy=0.7095, over 4648.48 frames. ], batch size: 22, lr: 3.18e-03 2024-08-06 20:22:43,557 INFO [trainer.py:765] (0/8) Epoch 27, batch 400, train_loss[loss=2.987, NarTop10Accuracy=0.7405, over 5172.00 frames. ], tot_loss[loss=3.066, NarTop10Accuracy=0.7126, over 5107.51 frames. ], batch size: 7, lr: 3.18e-03 2024-08-06 20:23:18,084 INFO [trainer.py:765] (0/8) Epoch 27, batch 500, train_loss[loss=2.701, NarTop10Accuracy=0.8009, over 6087.00 frames. ], tot_loss[loss=3.056, NarTop10Accuracy=0.7146, over 5376.10 frames. ], batch size: 11, lr: 3.18e-03 2024-08-06 20:23:51,436 INFO [trainer.py:765] (0/8) Epoch 27, batch 600, train_loss[loss=3.115, NarTop10Accuracy=0.6975, over 5676.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7151, over 5628.18 frames. ], batch size: 9, lr: 3.18e-03 2024-08-06 20:24:24,976 INFO [trainer.py:765] (0/8) Epoch 27, batch 700, train_loss[loss=2.879, NarTop10Accuracy=0.758, over 4278.00 frames. ], tot_loss[loss=3.05, NarTop10Accuracy=0.7162, over 5698.84 frames. ], batch size: 5, lr: 3.18e-03 2024-08-06 20:25:03,408 INFO [trainer.py:765] (0/8) Epoch 27, batch 800, train_loss[loss=3.122, NarTop10Accuracy=0.6966, over 4992.00 frames. ], tot_loss[loss=3.072, NarTop10Accuracy=0.7119, over 5773.65 frames. ], batch size: 6, lr: 3.17e-03 2024-08-06 20:25:34,176 INFO [trainer.py:765] (0/8) Epoch 27, batch 900, train_loss[loss=3.28, NarTop10Accuracy=0.6675, over 6219.00 frames. ], tot_loss[loss=3.072, NarTop10Accuracy=0.7111, over 5789.23 frames. ], batch size: 13, lr: 3.17e-03 2024-08-06 20:26:10,097 INFO [trainer.py:765] (0/8) Epoch 27, batch 1000, train_loss[loss=2.91, NarTop10Accuracy=0.7549, over 6147.00 frames. ], tot_loss[loss=3.078, NarTop10Accuracy=0.7101, over 5892.94 frames. ], batch size: 13, lr: 3.17e-03 2024-08-06 20:26:18,316 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 20:26:26,346 INFO [trainer.py:811] (0/8) Epoch 27, validation: loss=2.95, NarTop10Accuracy=0.735, over 1905321.00 frames. 2024-08-06 20:26:26,347 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 20:26:26,877 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.166e+02 2.331e+02 2.512e+02 4.284e+02, threshold=4.663e+02, percent-clipped=0.0 2024-08-06 20:26:50,899 INFO [trainer.py:765] (0/8) Epoch 27, batch 1100, train_loss[loss=3.003, NarTop10Accuracy=0.7228, over 6789.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7087, over 5933.36 frames. ], batch size: 17, lr: 3.17e-03 2024-08-06 20:27:24,545 INFO [trainer.py:765] (0/8) Epoch 27, batch 1200, train_loss[loss=2.878, NarTop10Accuracy=0.7453, over 7182.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7096, over 5926.45 frames. ], batch size: 31, lr: 3.16e-03 2024-08-06 20:27:58,568 INFO [trainer.py:765] (0/8) Epoch 27, batch 1300, train_loss[loss=2.747, NarTop10Accuracy=0.7731, over 5025.00 frames. ], tot_loss[loss=3.073, NarTop10Accuracy=0.711, over 5984.52 frames. ], batch size: 6, lr: 3.16e-03 2024-08-06 20:28:36,745 INFO [trainer.py:765] (0/8) Epoch 27, batch 1400, train_loss[loss=3.538, NarTop10Accuracy=0.6219, over 6114.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7075, over 6003.97 frames. ], batch size: 11, lr: 3.16e-03 2024-08-06 20:29:04,632 INFO [trainer.py:765] (0/8) Epoch 27, batch 1500, train_loss[loss=3.078, NarTop10Accuracy=0.7123, over 6093.00 frames. ], tot_loss[loss=3.087, NarTop10Accuracy=0.7085, over 5937.07 frames. ], batch size: 50, lr: 3.16e-03 2024-08-06 20:29:32,362 INFO [trainer.py:765] (0/8) Epoch 27, batch 1600, train_loss[loss=2.904, NarTop10Accuracy=0.7492, over 7149.00 frames. ], tot_loss[loss=3.095, NarTop10Accuracy=0.7068, over 5934.06 frames. ], batch size: 22, lr: 3.15e-03 2024-08-06 20:29:58,978 INFO [trainer.py:765] (0/8) Epoch 27, batch 1700, train_loss[loss=3.153, NarTop10Accuracy=0.6926, over 6237.00 frames. ], tot_loss[loss=3.079, NarTop10Accuracy=0.7099, over 5915.57 frames. ], batch size: 13, lr: 3.15e-03 2024-08-06 20:30:25,463 INFO [trainer.py:765] (0/8) Epoch 27, batch 1800, train_loss[loss=3.422, NarTop10Accuracy=0.6414, over 7146.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7089, over 5983.19 frames. ], batch size: 22, lr: 3.15e-03 2024-08-06 20:30:51,845 INFO [trainer.py:765] (0/8) Epoch 27, batch 1900, train_loss[loss=3.123, NarTop10Accuracy=0.7077, over 6348.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7096, over 6029.45 frames. ], batch size: 50, lr: 3.15e-03 2024-08-06 20:31:17,390 INFO [trainer.py:765] (0/8) Epoch 27, batch 2000, train_loss[loss=3.133, NarTop10Accuracy=0.7067, over 5910.00 frames. ], tot_loss[loss=3.069, NarTop10Accuracy=0.7119, over 6017.61 frames. ], batch size: 50, lr: 3.15e-03 2024-08-06 20:31:42,660 INFO [trainer.py:765] (0/8) Epoch 27, batch 2100, train_loss[loss=2.892, NarTop10Accuracy=0.7511, over 3957.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7109, over 5977.21 frames. ], batch size: 4, lr: 3.14e-03 2024-08-06 20:32:07,804 INFO [trainer.py:765] (0/8) Epoch 27, batch 2200, train_loss[loss=3.478, NarTop10Accuracy=0.628, over 7176.00 frames. ], tot_loss[loss=3.087, NarTop10Accuracy=0.7084, over 6025.53 frames. ], batch size: 31, lr: 3.14e-03 2024-08-06 20:32:32,942 INFO [trainer.py:765] (0/8) Epoch 27, batch 2300, train_loss[loss=2.769, NarTop10Accuracy=0.77, over 5769.00 frames. ], tot_loss[loss=3.088, NarTop10Accuracy=0.7082, over 6044.97 frames. ], batch size: 9, lr: 3.14e-03 2024-08-06 20:32:57,246 INFO [trainer.py:765] (0/8) Epoch 27, batch 2400, train_loss[loss=2.7, NarTop10Accuracy=0.789, over 5130.00 frames. ], tot_loss[loss=3.087, NarTop10Accuracy=0.7082, over 5785.10 frames. ], batch size: 7, lr: 3.14e-03 2024-08-06 20:33:20,615 INFO [trainer.py:765] (0/8) Epoch 27, batch 2500, train_loss[loss=3.357, NarTop10Accuracy=0.6494, over 5211.00 frames. ], tot_loss[loss=3.06, NarTop10Accuracy=0.7136, over 5489.39 frames. ], batch size: 7, lr: 3.13e-03 2024-08-06 20:33:40,481 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 20:33:40,486 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-27.pt 2024-08-06 20:34:35,829 INFO [trainer.py:765] (0/8) Epoch 28, batch 100, train_loss[loss=2.87, NarTop10Accuracy=0.7505, over 7086.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7073, over 2388.53 frames. ], batch size: 32, lr: 3.07e-03 2024-08-06 20:35:07,393 INFO [trainer.py:765] (0/8) Epoch 28, batch 200, train_loss[loss=2.671, NarTop10Accuracy=0.7829, over 6801.00 frames. ], tot_loss[loss=3.098, NarTop10Accuracy=0.7057, over 3859.63 frames. ], batch size: 17, lr: 3.07e-03 2024-08-06 20:35:45,422 INFO [trainer.py:765] (0/8) Epoch 28, batch 300, train_loss[loss=3.066, NarTop10Accuracy=0.724, over 7200.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.7091, over 4649.48 frames. ], batch size: 22, lr: 3.07e-03 2024-08-06 20:36:15,865 INFO [trainer.py:765] (0/8) Epoch 28, batch 400, train_loss[loss=3.31, NarTop10Accuracy=0.6536, over 5085.00 frames. ], tot_loss[loss=3.086, NarTop10Accuracy=0.7083, over 5107.47 frames. ], batch size: 7, lr: 3.07e-03 2024-08-06 20:36:32,407 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 20:36:40,530 INFO [trainer.py:811] (0/8) Epoch 28, validation: loss=2.963, NarTop10Accuracy=0.7327, over 1905321.00 frames. 2024-08-06 20:36:40,531 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 20:36:41,102 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.179e+02 2.348e+02 2.536e+02 3.573e+02, threshold=4.696e+02, percent-clipped=0.0 2024-08-06 20:36:56,662 INFO [trainer.py:765] (0/8) Epoch 28, batch 500, train_loss[loss=3.357, NarTop10Accuracy=0.6565, over 6168.00 frames. ], tot_loss[loss=3.072, NarTop10Accuracy=0.7109, over 5377.54 frames. ], batch size: 11, lr: 3.06e-03 2024-08-06 20:37:29,461 INFO [trainer.py:765] (0/8) Epoch 28, batch 600, train_loss[loss=3, NarTop10Accuracy=0.7267, over 5655.00 frames. ], tot_loss[loss=3.076, NarTop10Accuracy=0.7102, over 5641.28 frames. ], batch size: 9, lr: 3.06e-03 2024-08-06 20:38:08,890 INFO [trainer.py:765] (0/8) Epoch 28, batch 700, train_loss[loss=3.183, NarTop10Accuracy=0.6784, over 5007.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7084, over 5714.36 frames. ], batch size: 6, lr: 3.06e-03 2024-08-06 20:38:42,488 INFO [trainer.py:765] (0/8) Epoch 28, batch 800, train_loss[loss=2.771, NarTop10Accuracy=0.7751, over 4989.00 frames. ], tot_loss[loss=3.058, NarTop10Accuracy=0.7138, over 5781.38 frames. ], batch size: 6, lr: 3.06e-03 2024-08-06 20:39:15,505 INFO [trainer.py:765] (0/8) Epoch 28, batch 900, train_loss[loss=3.329, NarTop10Accuracy=0.6571, over 6657.00 frames. ], tot_loss[loss=3.057, NarTop10Accuracy=0.7141, over 5802.27 frames. ], batch size: 14, lr: 3.06e-03 2024-08-06 20:39:53,239 INFO [trainer.py:765] (0/8) Epoch 28, batch 1000, train_loss[loss=3.284, NarTop10Accuracy=0.6726, over 6660.00 frames. ], tot_loss[loss=3.058, NarTop10Accuracy=0.7138, over 5900.27 frames. ], batch size: 14, lr: 3.05e-03 2024-08-06 20:40:25,866 INFO [trainer.py:765] (0/8) Epoch 28, batch 1100, train_loss[loss=2.788, NarTop10Accuracy=0.7701, over 6744.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7107, over 5942.01 frames. ], batch size: 17, lr: 3.05e-03 2024-08-06 20:40:59,417 INFO [trainer.py:765] (0/8) Epoch 28, batch 1200, train_loss[loss=3.293, NarTop10Accuracy=0.6604, over 7401.00 frames. ], tot_loss[loss=3.078, NarTop10Accuracy=0.7098, over 5928.91 frames. ], batch size: 31, lr: 3.05e-03 2024-08-06 20:41:38,680 INFO [trainer.py:765] (0/8) Epoch 28, batch 1300, train_loss[loss=3.304, NarTop10Accuracy=0.675, over 4374.00 frames. ], tot_loss[loss=3.075, NarTop10Accuracy=0.7105, over 5980.61 frames. ], batch size: 5, lr: 3.05e-03 2024-08-06 20:42:13,046 INFO [trainer.py:765] (0/8) Epoch 28, batch 1400, train_loss[loss=2.94, NarTop10Accuracy=0.7302, over 6069.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7088, over 6014.47 frames. ], batch size: 11, lr: 3.04e-03 2024-08-06 20:42:43,170 INFO [trainer.py:765] (0/8) Epoch 28, batch 1500, train_loss[loss=3.446, NarTop10Accuracy=0.6362, over 6213.00 frames. ], tot_loss[loss=3.072, NarTop10Accuracy=0.7113, over 5971.79 frames. ], batch size: 50, lr: 3.04e-03 2024-08-06 20:43:11,079 INFO [trainer.py:765] (0/8) Epoch 28, batch 1600, train_loss[loss=2.887, NarTop10Accuracy=0.748, over 7182.00 frames. ], tot_loss[loss=3.07, NarTop10Accuracy=0.7111, over 5935.61 frames. ], batch size: 22, lr: 3.04e-03 2024-08-06 20:43:37,784 INFO [trainer.py:765] (0/8) Epoch 28, batch 1700, train_loss[loss=3.098, NarTop10Accuracy=0.705, over 6624.00 frames. ], tot_loss[loss=3.075, NarTop10Accuracy=0.7105, over 5923.70 frames. ], batch size: 14, lr: 3.04e-03 2024-08-06 20:44:04,325 INFO [trainer.py:765] (0/8) Epoch 28, batch 1800, train_loss[loss=3.081, NarTop10Accuracy=0.7053, over 6942.00 frames. ], tot_loss[loss=3.075, NarTop10Accuracy=0.7105, over 5983.06 frames. ], batch size: 22, lr: 3.04e-03 2024-08-06 20:44:30,756 INFO [trainer.py:765] (0/8) Epoch 28, batch 1900, train_loss[loss=3.076, NarTop10Accuracy=0.7136, over 5343.00 frames. ], tot_loss[loss=3.071, NarTop10Accuracy=0.7111, over 6019.24 frames. ], batch size: 50, lr: 3.03e-03 2024-08-06 20:44:56,327 INFO [trainer.py:765] (0/8) Epoch 28, batch 2000, train_loss[loss=2.978, NarTop10Accuracy=0.7281, over 6255.00 frames. ], tot_loss[loss=3.056, NarTop10Accuracy=0.7141, over 6008.22 frames. ], batch size: 51, lr: 3.03e-03 2024-08-06 20:45:21,649 INFO [trainer.py:765] (0/8) Epoch 28, batch 2100, train_loss[loss=2.852, NarTop10Accuracy=0.7513, over 4824.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7149, over 5972.98 frames. ], batch size: 5, lr: 3.03e-03 2024-08-06 20:45:47,075 INFO [trainer.py:765] (0/8) Epoch 28, batch 2200, train_loss[loss=2.973, NarTop10Accuracy=0.7384, over 7287.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7125, over 6010.74 frames. ], batch size: 31, lr: 3.03e-03 2024-08-06 20:46:12,306 INFO [trainer.py:765] (0/8) Epoch 28, batch 2300, train_loss[loss=3.345, NarTop10Accuracy=0.6458, over 5712.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7087, over 6022.40 frames. ], batch size: 9, lr: 3.03e-03 2024-08-06 20:46:36,806 INFO [trainer.py:765] (0/8) Epoch 28, batch 2400, train_loss[loss=3, NarTop10Accuracy=0.7266, over 5178.00 frames. ], tot_loss[loss=3.09, NarTop10Accuracy=0.7071, over 5777.48 frames. ], batch size: 7, lr: 3.02e-03 2024-08-06 20:46:48,594 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 20:46:56,604 INFO [trainer.py:811] (0/8) Epoch 28, validation: loss=2.931, NarTop10Accuracy=0.7396, over 1905321.00 frames. 2024-08-06 20:46:56,605 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 20:46:57,082 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.201e+02 2.381e+02 2.551e+02 4.872e+02, threshold=4.762e+02, percent-clipped=0.1 2024-08-06 20:47:08,293 INFO [trainer.py:765] (0/8) Epoch 28, batch 2500, train_loss[loss=2.923, NarTop10Accuracy=0.7435, over 5088.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7121, over 5482.16 frames. ], batch size: 7, lr: 3.02e-03 2024-08-06 20:47:27,712 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 20:47:27,715 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-28.pt 2024-08-06 20:48:21,052 INFO [trainer.py:765] (0/8) Epoch 29, batch 100, train_loss[loss=3.038, NarTop10Accuracy=0.7229, over 7365.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7115, over 2377.68 frames. ], batch size: 31, lr: 2.96e-03 2024-08-06 20:48:53,405 INFO [trainer.py:765] (0/8) Epoch 29, batch 200, train_loss[loss=3.247, NarTop10Accuracy=0.6812, over 6828.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7171, over 3867.50 frames. ], batch size: 17, lr: 2.96e-03 2024-08-06 20:49:27,477 INFO [trainer.py:765] (0/8) Epoch 29, batch 300, train_loss[loss=3.21, NarTop10Accuracy=0.6888, over 7422.00 frames. ], tot_loss[loss=3.047, NarTop10Accuracy=0.7168, over 4677.17 frames. ], batch size: 23, lr: 2.96e-03 2024-08-06 20:49:56,052 INFO [trainer.py:765] (0/8) Epoch 29, batch 400, train_loss[loss=3.393, NarTop10Accuracy=0.6299, over 5223.00 frames. ], tot_loss[loss=3.071, NarTop10Accuracy=0.7114, over 5119.56 frames. ], batch size: 7, lr: 2.96e-03 2024-08-06 20:50:29,435 INFO [trainer.py:765] (0/8) Epoch 29, batch 500, train_loss[loss=3.116, NarTop10Accuracy=0.6999, over 6114.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.7147, over 5403.51 frames. ], batch size: 11, lr: 2.96e-03 2024-08-06 20:51:00,024 INFO [trainer.py:765] (0/8) Epoch 29, batch 600, train_loss[loss=2.741, NarTop10Accuracy=0.7829, over 5580.00 frames. ], tot_loss[loss=3.051, NarTop10Accuracy=0.715, over 5655.55 frames. ], batch size: 9, lr: 2.95e-03 2024-08-06 20:51:35,677 INFO [trainer.py:765] (0/8) Epoch 29, batch 700, train_loss[loss=2.776, NarTop10Accuracy=0.7716, over 5097.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7107, over 5722.72 frames. ], batch size: 6, lr: 2.95e-03 2024-08-06 20:52:10,724 INFO [trainer.py:765] (0/8) Epoch 29, batch 800, train_loss[loss=2.757, NarTop10Accuracy=0.7775, over 5133.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7126, over 5763.90 frames. ], batch size: 6, lr: 2.95e-03 2024-08-06 20:52:40,742 INFO [trainer.py:765] (0/8) Epoch 29, batch 900, train_loss[loss=2.737, NarTop10Accuracy=0.776, over 6135.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7087, over 5790.31 frames. ], batch size: 13, lr: 2.95e-03 2024-08-06 20:53:16,861 INFO [trainer.py:765] (0/8) Epoch 29, batch 1000, train_loss[loss=3.499, NarTop10Accuracy=0.6218, over 6657.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7086, over 5897.68 frames. ], batch size: 14, lr: 2.95e-03 2024-08-06 20:53:52,902 INFO [trainer.py:765] (0/8) Epoch 29, batch 1100, train_loss[loss=3.186, NarTop10Accuracy=0.6957, over 6756.00 frames. ], tot_loss[loss=3.087, NarTop10Accuracy=0.7079, over 5922.52 frames. ], batch size: 17, lr: 2.94e-03 2024-08-06 20:54:23,690 INFO [trainer.py:765] (0/8) Epoch 29, batch 1200, train_loss[loss=3.147, NarTop10Accuracy=0.697, over 7236.00 frames. ], tot_loss[loss=3.077, NarTop10Accuracy=0.7098, over 5932.17 frames. ], batch size: 31, lr: 2.94e-03 2024-08-06 20:55:01,428 INFO [trainer.py:765] (0/8) Epoch 29, batch 1300, train_loss[loss=2.999, NarTop10Accuracy=0.7396, over 4212.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7108, over 5980.96 frames. ], batch size: 5, lr: 2.94e-03 2024-08-06 20:55:32,557 INFO [trainer.py:765] (0/8) Epoch 29, batch 1400, train_loss[loss=3.427, NarTop10Accuracy=0.6432, over 6051.00 frames. ], tot_loss[loss=3.075, NarTop10Accuracy=0.7107, over 6014.35 frames. ], batch size: 11, lr: 2.94e-03 2024-08-06 20:56:04,359 INFO [trainer.py:765] (0/8) Epoch 29, batch 1500, train_loss[loss=3.306, NarTop10Accuracy=0.6611, over 5826.00 frames. ], tot_loss[loss=3.069, NarTop10Accuracy=0.7116, over 5948.53 frames. ], batch size: 51, lr: 2.94e-03 2024-08-06 20:56:32,040 INFO [trainer.py:765] (0/8) Epoch 29, batch 1600, train_loss[loss=3.274, NarTop10Accuracy=0.6752, over 7161.00 frames. ], tot_loss[loss=3.08, NarTop10Accuracy=0.7092, over 5932.52 frames. ], batch size: 22, lr: 2.93e-03 2024-08-06 20:56:58,640 INFO [trainer.py:765] (0/8) Epoch 29, batch 1700, train_loss[loss=2.846, NarTop10Accuracy=0.7611, over 6396.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7102, over 5923.87 frames. ], batch size: 13, lr: 2.93e-03 2024-08-06 20:57:25,000 INFO [trainer.py:765] (0/8) Epoch 29, batch 1800, train_loss[loss=3.064, NarTop10Accuracy=0.7095, over 6975.00 frames. ], tot_loss[loss=3.069, NarTop10Accuracy=0.7109, over 5970.48 frames. ], batch size: 22, lr: 2.93e-03 2024-08-06 20:57:44,621 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 20:57:52,863 INFO [trainer.py:811] (0/8) Epoch 29, validation: loss=2.897, NarTop10Accuracy=0.7458, over 1905321.00 frames. 2024-08-06 20:57:52,864 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 20:57:53,424 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.206e+02 2.380e+02 2.554e+02 4.464e+02, threshold=4.759e+02, percent-clipped=0.0 2024-08-06 20:57:59,756 INFO [trainer.py:765] (0/8) Epoch 29, batch 1900, train_loss[loss=3.127, NarTop10Accuracy=0.7114, over 6084.00 frames. ], tot_loss[loss=3.088, NarTop10Accuracy=0.7076, over 6024.48 frames. ], batch size: 51, lr: 2.93e-03 2024-08-06 20:58:25,308 INFO [trainer.py:765] (0/8) Epoch 29, batch 2000, train_loss[loss=3.528, NarTop10Accuracy=0.6206, over 6486.00 frames. ], tot_loss[loss=3.084, NarTop10Accuracy=0.7082, over 5994.52 frames. ], batch size: 53, lr: 2.93e-03 2024-08-06 20:58:50,630 INFO [trainer.py:765] (0/8) Epoch 29, batch 2100, train_loss[loss=2.854, NarTop10Accuracy=0.7477, over 4803.00 frames. ], tot_loss[loss=3.083, NarTop10Accuracy=0.7084, over 5975.90 frames. ], batch size: 5, lr: 2.92e-03 2024-08-06 20:59:15,805 INFO [trainer.py:765] (0/8) Epoch 29, batch 2200, train_loss[loss=2.847, NarTop10Accuracy=0.7553, over 7242.00 frames. ], tot_loss[loss=3.073, NarTop10Accuracy=0.7108, over 6012.33 frames. ], batch size: 31, lr: 2.92e-03 2024-08-06 20:59:40,910 INFO [trainer.py:765] (0/8) Epoch 29, batch 2300, train_loss[loss=2.938, NarTop10Accuracy=0.7374, over 5631.00 frames. ], tot_loss[loss=3.088, NarTop10Accuracy=0.7077, over 6029.58 frames. ], batch size: 9, lr: 2.92e-03 2024-08-06 21:00:05,155 INFO [trainer.py:765] (0/8) Epoch 29, batch 2400, train_loss[loss=2.809, NarTop10Accuracy=0.7671, over 5091.00 frames. ], tot_loss[loss=3.073, NarTop10Accuracy=0.7105, over 5788.88 frames. ], batch size: 7, lr: 2.92e-03 2024-08-06 21:00:28,742 INFO [trainer.py:765] (0/8) Epoch 29, batch 2500, train_loss[loss=3.374, NarTop10Accuracy=0.6547, over 5100.00 frames. ], tot_loss[loss=3.046, NarTop10Accuracy=0.7155, over 5481.23 frames. ], batch size: 7, lr: 2.92e-03 2024-08-06 21:00:48,734 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 21:00:48,737 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-29.pt 2024-08-06 21:01:41,716 INFO [trainer.py:765] (0/8) Epoch 30, batch 100, train_loss[loss=2.9, NarTop10Accuracy=0.7499, over 7212.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7206, over 2360.50 frames. ], batch size: 31, lr: 2.86e-03 2024-08-06 21:02:17,013 INFO [trainer.py:765] (0/8) Epoch 30, batch 200, train_loss[loss=2.827, NarTop10Accuracy=0.7563, over 6930.00 frames. ], tot_loss[loss=3.017, NarTop10Accuracy=0.7225, over 3859.46 frames. ], batch size: 17, lr: 2.86e-03 2024-08-06 21:02:51,343 INFO [trainer.py:765] (0/8) Epoch 30, batch 300, train_loss[loss=2.864, NarTop10Accuracy=0.7525, over 7107.00 frames. ], tot_loss[loss=3.012, NarTop10Accuracy=0.7235, over 4650.21 frames. ], batch size: 22, lr: 2.86e-03 2024-08-06 21:03:21,643 INFO [trainer.py:765] (0/8) Epoch 30, batch 400, train_loss[loss=2.801, NarTop10Accuracy=0.7688, over 5073.00 frames. ], tot_loss[loss=3.029, NarTop10Accuracy=0.7206, over 5116.43 frames. ], batch size: 7, lr: 2.86e-03 2024-08-06 21:03:58,546 INFO [trainer.py:765] (0/8) Epoch 30, batch 500, train_loss[loss=3.283, NarTop10Accuracy=0.6568, over 6123.00 frames. ], tot_loss[loss=3.038, NarTop10Accuracy=0.7177, over 5418.98 frames. ], batch size: 11, lr: 2.86e-03 2024-08-06 21:04:31,656 INFO [trainer.py:765] (0/8) Epoch 30, batch 600, train_loss[loss=2.924, NarTop10Accuracy=0.748, over 5739.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.717, over 5667.28 frames. ], batch size: 9, lr: 2.85e-03 2024-08-06 21:05:03,525 INFO [trainer.py:765] (0/8) Epoch 30, batch 700, train_loss[loss=2.872, NarTop10Accuracy=0.7449, over 5070.00 frames. ], tot_loss[loss=3.033, NarTop10Accuracy=0.7194, over 5738.20 frames. ], batch size: 6, lr: 2.85e-03 2024-08-06 21:05:44,132 INFO [trainer.py:765] (0/8) Epoch 30, batch 800, train_loss[loss=2.933, NarTop10Accuracy=0.7227, over 5082.00 frames. ], tot_loss[loss=3.034, NarTop10Accuracy=0.719, over 5787.05 frames. ], batch size: 6, lr: 2.85e-03 2024-08-06 21:06:14,843 INFO [trainer.py:765] (0/8) Epoch 30, batch 900, train_loss[loss=2.915, NarTop10Accuracy=0.7466, over 6177.00 frames. ], tot_loss[loss=3.036, NarTop10Accuracy=0.7185, over 5787.67 frames. ], batch size: 13, lr: 2.85e-03 2024-08-06 21:06:48,952 INFO [trainer.py:765] (0/8) Epoch 30, batch 1000, train_loss[loss=2.898, NarTop10Accuracy=0.7417, over 6609.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7122, over 5888.62 frames. ], batch size: 14, lr: 2.85e-03 2024-08-06 21:07:25,936 INFO [trainer.py:765] (0/8) Epoch 30, batch 1100, train_loss[loss=3.368, NarTop10Accuracy=0.649, over 6813.00 frames. ], tot_loss[loss=3.082, NarTop10Accuracy=0.7088, over 5920.10 frames. ], batch size: 17, lr: 2.84e-03 2024-08-06 21:08:02,380 INFO [trainer.py:765] (0/8) Epoch 30, batch 1200, train_loss[loss=3.004, NarTop10Accuracy=0.7333, over 7245.00 frames. ], tot_loss[loss=3.076, NarTop10Accuracy=0.7104, over 5918.24 frames. ], batch size: 31, lr: 2.84e-03 2024-08-06 21:08:35,371 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 21:08:43,457 INFO [trainer.py:811] (0/8) Epoch 30, validation: loss=2.93, NarTop10Accuracy=0.7391, over 1905321.00 frames. 2024-08-06 21:08:43,458 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 21:08:44,197 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.209e+02 2.377e+02 2.553e+02 3.956e+02, threshold=4.754e+02, percent-clipped=0.0 2024-08-06 21:08:44,203 INFO [trainer.py:765] (0/8) Epoch 30, batch 1300, train_loss[loss=3.01, NarTop10Accuracy=0.7266, over 4278.00 frames. ], tot_loss[loss=3.069, NarTop10Accuracy=0.7119, over 5995.53 frames. ], batch size: 5, lr: 2.84e-03 2024-08-06 21:09:22,396 INFO [trainer.py:765] (0/8) Epoch 30, batch 1400, train_loss[loss=2.805, NarTop10Accuracy=0.7688, over 6546.00 frames. ], tot_loss[loss=3.074, NarTop10Accuracy=0.7107, over 6026.67 frames. ], batch size: 12, lr: 2.84e-03 2024-08-06 21:09:52,372 INFO [trainer.py:765] (0/8) Epoch 30, batch 1500, train_loss[loss=3.068, NarTop10Accuracy=0.7152, over 6066.00 frames. ], tot_loss[loss=3.062, NarTop10Accuracy=0.7133, over 5952.64 frames. ], batch size: 50, lr: 2.84e-03 2024-08-06 21:10:20,083 INFO [trainer.py:765] (0/8) Epoch 30, batch 1600, train_loss[loss=2.964, NarTop10Accuracy=0.7287, over 7209.00 frames. ], tot_loss[loss=3.064, NarTop10Accuracy=0.7121, over 5941.02 frames. ], batch size: 22, lr: 2.84e-03 2024-08-06 21:10:46,679 INFO [trainer.py:765] (0/8) Epoch 30, batch 1700, train_loss[loss=3.107, NarTop10Accuracy=0.6961, over 6453.00 frames. ], tot_loss[loss=3.07, NarTop10Accuracy=0.7108, over 5915.59 frames. ], batch size: 14, lr: 2.83e-03 2024-08-06 21:11:13,058 INFO [trainer.py:765] (0/8) Epoch 30, batch 1800, train_loss[loss=3.384, NarTop10Accuracy=0.6375, over 6915.00 frames. ], tot_loss[loss=3.069, NarTop10Accuracy=0.7112, over 5998.78 frames. ], batch size: 22, lr: 2.83e-03 2024-08-06 21:11:39,418 INFO [trainer.py:765] (0/8) Epoch 30, batch 1900, train_loss[loss=3.095, NarTop10Accuracy=0.7077, over 6132.00 frames. ], tot_loss[loss=3.076, NarTop10Accuracy=0.7101, over 6032.07 frames. ], batch size: 51, lr: 2.83e-03 2024-08-06 21:12:04,825 INFO [trainer.py:765] (0/8) Epoch 30, batch 2000, train_loss[loss=3.34, NarTop10Accuracy=0.663, over 6063.00 frames. ], tot_loss[loss=3.063, NarTop10Accuracy=0.7127, over 6007.14 frames. ], batch size: 51, lr: 2.83e-03 2024-08-06 21:12:30,088 INFO [trainer.py:765] (0/8) Epoch 30, batch 2100, train_loss[loss=2.837, NarTop10Accuracy=0.754, over 4731.00 frames. ], tot_loss[loss=3.07, NarTop10Accuracy=0.7111, over 5997.16 frames. ], batch size: 5, lr: 2.83e-03 2024-08-06 21:12:55,224 INFO [trainer.py:765] (0/8) Epoch 30, batch 2200, train_loss[loss=2.985, NarTop10Accuracy=0.7372, over 7170.00 frames. ], tot_loss[loss=3.07, NarTop10Accuracy=0.7114, over 6033.40 frames. ], batch size: 31, lr: 2.82e-03 2024-08-06 21:13:20,297 INFO [trainer.py:765] (0/8) Epoch 30, batch 2300, train_loss[loss=2.696, NarTop10Accuracy=0.7858, over 5802.00 frames. ], tot_loss[loss=3.091, NarTop10Accuracy=0.7073, over 6030.54 frames. ], batch size: 9, lr: 2.82e-03 2024-08-06 21:13:44,491 INFO [trainer.py:765] (0/8) Epoch 30, batch 2400, train_loss[loss=2.842, NarTop10Accuracy=0.7592, over 5217.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.7156, over 5781.48 frames. ], batch size: 7, lr: 2.82e-03 2024-08-06 21:14:07,987 INFO [trainer.py:765] (0/8) Epoch 30, batch 2500, train_loss[loss=3.021, NarTop10Accuracy=0.7288, over 5085.00 frames. ], tot_loss[loss=3.046, NarTop10Accuracy=0.7165, over 5488.98 frames. ], batch size: 7, lr: 2.82e-03 2024-08-06 21:14:28,045 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 21:14:28,047 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-30.pt 2024-08-06 21:15:23,633 INFO [trainer.py:765] (0/8) Epoch 31, batch 100, train_loss[loss=3.437, NarTop10Accuracy=0.6418, over 6909.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7127, over 2354.80 frames. ], batch size: 31, lr: 2.77e-03 2024-08-06 21:15:55,128 INFO [trainer.py:765] (0/8) Epoch 31, batch 200, train_loss[loss=2.907, NarTop10Accuracy=0.7337, over 6774.00 frames. ], tot_loss[loss=3.033, NarTop10Accuracy=0.7189, over 3854.05 frames. ], batch size: 17, lr: 2.77e-03 2024-08-06 21:16:31,216 INFO [trainer.py:765] (0/8) Epoch 31, batch 300, train_loss[loss=2.881, NarTop10Accuracy=0.7532, over 7224.00 frames. ], tot_loss[loss=3.033, NarTop10Accuracy=0.7188, over 4661.08 frames. ], batch size: 22, lr: 2.77e-03 2024-08-06 21:17:01,625 INFO [trainer.py:765] (0/8) Epoch 31, batch 400, train_loss[loss=3.279, NarTop10Accuracy=0.6737, over 4995.00 frames. ], tot_loss[loss=3.042, NarTop10Accuracy=0.7169, over 5102.88 frames. ], batch size: 7, lr: 2.76e-03 2024-08-06 21:17:35,725 INFO [trainer.py:765] (0/8) Epoch 31, batch 500, train_loss[loss=2.71, NarTop10Accuracy=0.7874, over 6144.00 frames. ], tot_loss[loss=3.037, NarTop10Accuracy=0.7179, over 5393.41 frames. ], batch size: 11, lr: 2.76e-03 2024-08-06 21:18:07,084 INFO [trainer.py:765] (0/8) Epoch 31, batch 600, train_loss[loss=2.804, NarTop10Accuracy=0.7651, over 5628.00 frames. ], tot_loss[loss=3.057, NarTop10Accuracy=0.714, over 5659.82 frames. ], batch size: 9, lr: 2.76e-03 2024-08-06 21:18:44,611 INFO [trainer.py:765] (0/8) Epoch 31, batch 700, train_loss[loss=3.46, NarTop10Accuracy=0.6403, over 5184.00 frames. ], tot_loss[loss=3.059, NarTop10Accuracy=0.7137, over 5749.92 frames. ], batch size: 6, lr: 2.76e-03 2024-08-06 21:18:51,096 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 21:18:59,276 INFO [trainer.py:811] (0/8) Epoch 31, validation: loss=2.984, NarTop10Accuracy=0.7279, over 1905321.00 frames. 2024-08-06 21:18:59,276 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 21:18:59,986 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.824e+02 2.222e+02 2.378e+02 2.557e+02 4.306e+02, threshold=4.755e+02, percent-clipped=0.0 2024-08-06 21:19:24,245 INFO [trainer.py:765] (0/8) Epoch 31, batch 800, train_loss[loss=2.86, NarTop10Accuracy=0.7536, over 4368.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.7154, over 5785.74 frames. ], batch size: 5, lr: 2.76e-03 2024-08-06 21:19:56,950 INFO [trainer.py:765] (0/8) Epoch 31, batch 900, train_loss[loss=3.313, NarTop10Accuracy=0.6632, over 6321.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7166, over 5796.31 frames. ], batch size: 13, lr: 2.76e-03 2024-08-06 21:20:33,310 INFO [trainer.py:765] (0/8) Epoch 31, batch 1000, train_loss[loss=3.338, NarTop10Accuracy=0.6507, over 6177.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.7168, over 5896.37 frames. ], batch size: 13, lr: 2.75e-03 2024-08-06 21:21:10,215 INFO [trainer.py:765] (0/8) Epoch 31, batch 1100, train_loss[loss=3.201, NarTop10Accuracy=0.678, over 6825.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7152, over 5936.56 frames. ], batch size: 17, lr: 2.75e-03 2024-08-06 21:21:41,119 INFO [trainer.py:765] (0/8) Epoch 31, batch 1200, train_loss[loss=2.915, NarTop10Accuracy=0.7349, over 7371.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7189, over 5942.26 frames. ], batch size: 31, lr: 2.75e-03 2024-08-06 21:22:19,741 INFO [trainer.py:765] (0/8) Epoch 31, batch 1300, train_loss[loss=2.824, NarTop10Accuracy=0.7607, over 4188.00 frames. ], tot_loss[loss=3.055, NarTop10Accuracy=0.714, over 5993.03 frames. ], batch size: 5, lr: 2.75e-03 2024-08-06 21:22:53,533 INFO [trainer.py:765] (0/8) Epoch 31, batch 1400, train_loss[loss=2.885, NarTop10Accuracy=0.7513, over 6066.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7124, over 6032.99 frames. ], batch size: 11, lr: 2.75e-03 2024-08-06 21:23:21,269 INFO [trainer.py:765] (0/8) Epoch 31, batch 1500, train_loss[loss=3.38, NarTop10Accuracy=0.6477, over 5973.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7156, over 5954.59 frames. ], batch size: 50, lr: 2.74e-03 2024-08-06 21:23:49,005 INFO [trainer.py:765] (0/8) Epoch 31, batch 1600, train_loss[loss=3.461, NarTop10Accuracy=0.6352, over 7293.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7158, over 5928.33 frames. ], batch size: 23, lr: 2.74e-03 2024-08-06 21:24:15,511 INFO [trainer.py:765] (0/8) Epoch 31, batch 1700, train_loss[loss=3.442, NarTop10Accuracy=0.6369, over 6585.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.7149, over 5913.04 frames. ], batch size: 14, lr: 2.74e-03 2024-08-06 21:24:41,995 INFO [trainer.py:765] (0/8) Epoch 31, batch 1800, train_loss[loss=2.759, NarTop10Accuracy=0.7716, over 7122.00 frames. ], tot_loss[loss=3.042, NarTop10Accuracy=0.7169, over 5979.74 frames. ], batch size: 22, lr: 2.74e-03 2024-08-06 21:25:08,356 INFO [trainer.py:765] (0/8) Epoch 31, batch 1900, train_loss[loss=3.144, NarTop10Accuracy=0.7, over 5934.00 frames. ], tot_loss[loss=3.057, NarTop10Accuracy=0.7141, over 6016.55 frames. ], batch size: 50, lr: 2.74e-03 2024-08-06 21:25:33,774 INFO [trainer.py:765] (0/8) Epoch 31, batch 2000, train_loss[loss=2.98, NarTop10Accuracy=0.731, over 6165.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7151, over 5983.87 frames. ], batch size: 51, lr: 2.74e-03 2024-08-06 21:25:59,106 INFO [trainer.py:765] (0/8) Epoch 31, batch 2100, train_loss[loss=2.718, NarTop10Accuracy=0.7824, over 3873.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7158, over 5974.24 frames. ], batch size: 4, lr: 2.73e-03 2024-08-06 21:26:24,237 INFO [trainer.py:765] (0/8) Epoch 31, batch 2200, train_loss[loss=3.107, NarTop10Accuracy=0.7072, over 7278.00 frames. ], tot_loss[loss=3.038, NarTop10Accuracy=0.7179, over 6001.15 frames. ], batch size: 31, lr: 2.73e-03 2024-08-06 21:26:49,322 INFO [trainer.py:765] (0/8) Epoch 31, batch 2300, train_loss[loss=2.688, NarTop10Accuracy=0.7883, over 5691.00 frames. ], tot_loss[loss=3.054, NarTop10Accuracy=0.7146, over 6019.91 frames. ], batch size: 9, lr: 2.73e-03 2024-08-06 21:27:13,607 INFO [trainer.py:765] (0/8) Epoch 31, batch 2400, train_loss[loss=2.856, NarTop10Accuracy=0.76, over 5091.00 frames. ], tot_loss[loss=3.049, NarTop10Accuracy=0.7153, over 5758.17 frames. ], batch size: 7, lr: 2.73e-03 2024-08-06 21:27:37,027 INFO [trainer.py:765] (0/8) Epoch 31, batch 2500, train_loss[loss=2.83, NarTop10Accuracy=0.7584, over 5076.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7171, over 5444.66 frames. ], batch size: 7, lr: 2.73e-03 2024-08-06 21:27:57,087 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 21:27:57,090 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-31.pt 2024-08-06 21:28:49,393 INFO [trainer.py:765] (0/8) Epoch 32, batch 100, train_loss[loss=2.971, NarTop10Accuracy=0.7359, over 7092.00 frames. ], tot_loss[loss=3.049, NarTop10Accuracy=0.7153, over 2365.03 frames. ], batch size: 31, lr: 2.68e-03 2024-08-06 21:29:08,161 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 21:29:16,392 INFO [trainer.py:811] (0/8) Epoch 32, validation: loss=2.919, NarTop10Accuracy=0.7409, over 1905321.00 frames. 2024-08-06 21:29:16,393 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 21:29:16,940 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.842e+02 2.253e+02 2.413e+02 2.600e+02 5.680e+02, threshold=4.826e+02, percent-clipped=0.1 2024-08-06 21:29:32,273 INFO [trainer.py:765] (0/8) Epoch 32, batch 200, train_loss[loss=3.325, NarTop10Accuracy=0.6535, over 6741.00 frames. ], tot_loss[loss=3.067, NarTop10Accuracy=0.712, over 3860.62 frames. ], batch size: 17, lr: 2.68e-03 2024-08-06 21:30:05,279 INFO [trainer.py:765] (0/8) Epoch 32, batch 300, train_loss[loss=2.958, NarTop10Accuracy=0.7345, over 7152.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.7154, over 4650.27 frames. ], batch size: 22, lr: 2.68e-03 2024-08-06 21:30:34,103 INFO [trainer.py:765] (0/8) Epoch 32, batch 400, train_loss[loss=2.838, NarTop10Accuracy=0.7609, over 5322.00 frames. ], tot_loss[loss=3.062, NarTop10Accuracy=0.713, over 5113.61 frames. ], batch size: 7, lr: 2.68e-03 2024-08-06 21:31:13,531 INFO [trainer.py:765] (0/8) Epoch 32, batch 500, train_loss[loss=2.891, NarTop10Accuracy=0.7438, over 6138.00 frames. ], tot_loss[loss=3.06, NarTop10Accuracy=0.7134, over 5388.26 frames. ], batch size: 11, lr: 2.67e-03 2024-08-06 21:31:42,487 INFO [trainer.py:765] (0/8) Epoch 32, batch 600, train_loss[loss=3.371, NarTop10Accuracy=0.645, over 5739.00 frames. ], tot_loss[loss=3.056, NarTop10Accuracy=0.7142, over 5653.28 frames. ], batch size: 9, lr: 2.67e-03 2024-08-06 21:32:17,029 INFO [trainer.py:765] (0/8) Epoch 32, batch 700, train_loss[loss=2.891, NarTop10Accuracy=0.7469, over 5163.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.7173, over 5726.95 frames. ], batch size: 6, lr: 2.67e-03 2024-08-06 21:33:00,647 INFO [trainer.py:765] (0/8) Epoch 32, batch 800, train_loss[loss=3.289, NarTop10Accuracy=0.6618, over 5235.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.7171, over 5787.72 frames. ], batch size: 6, lr: 2.67e-03 2024-08-06 21:33:28,992 INFO [trainer.py:765] (0/8) Epoch 32, batch 900, train_loss[loss=2.761, NarTop10Accuracy=0.7786, over 6270.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7194, over 5806.27 frames. ], batch size: 13, lr: 2.67e-03 2024-08-06 21:34:04,050 INFO [trainer.py:765] (0/8) Epoch 32, batch 1000, train_loss[loss=3.301, NarTop10Accuracy=0.6682, over 6150.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.7172, over 5902.23 frames. ], batch size: 13, lr: 2.67e-03 2024-08-06 21:34:46,675 INFO [trainer.py:765] (0/8) Epoch 32, batch 1100, train_loss[loss=3.129, NarTop10Accuracy=0.6979, over 6732.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7156, over 5935.58 frames. ], batch size: 17, lr: 2.66e-03 2024-08-06 21:35:18,172 INFO [trainer.py:765] (0/8) Epoch 32, batch 1200, train_loss[loss=3.164, NarTop10Accuracy=0.6852, over 7398.00 frames. ], tot_loss[loss=3.06, NarTop10Accuracy=0.713, over 5935.57 frames. ], batch size: 32, lr: 2.66e-03 2024-08-06 21:35:52,802 INFO [trainer.py:765] (0/8) Epoch 32, batch 1300, train_loss[loss=2.918, NarTop10Accuracy=0.7335, over 5166.00 frames. ], tot_loss[loss=3.058, NarTop10Accuracy=0.7132, over 6004.80 frames. ], batch size: 6, lr: 2.66e-03 2024-08-06 21:36:29,479 INFO [trainer.py:765] (0/8) Epoch 32, batch 1400, train_loss[loss=3.188, NarTop10Accuracy=0.6841, over 6132.00 frames. ], tot_loss[loss=3.054, NarTop10Accuracy=0.7141, over 5995.60 frames. ], batch size: 11, lr: 2.66e-03 2024-08-06 21:37:04,734 INFO [trainer.py:765] (0/8) Epoch 32, batch 1500, train_loss[loss=3.475, NarTop10Accuracy=0.6273, over 6105.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7145, over 5945.13 frames. ], batch size: 51, lr: 2.66e-03 2024-08-06 21:37:32,522 INFO [trainer.py:765] (0/8) Epoch 32, batch 1600, train_loss[loss=3.048, NarTop10Accuracy=0.7171, over 7119.00 frames. ], tot_loss[loss=3.054, NarTop10Accuracy=0.7149, over 5927.45 frames. ], batch size: 22, lr: 2.66e-03 2024-08-06 21:37:59,161 INFO [trainer.py:765] (0/8) Epoch 32, batch 1700, train_loss[loss=3.132, NarTop10Accuracy=0.6905, over 6657.00 frames. ], tot_loss[loss=3.051, NarTop10Accuracy=0.7156, over 5920.16 frames. ], batch size: 14, lr: 2.65e-03 2024-08-06 21:38:25,703 INFO [trainer.py:765] (0/8) Epoch 32, batch 1800, train_loss[loss=2.926, NarTop10Accuracy=0.7333, over 7137.00 frames. ], tot_loss[loss=3.062, NarTop10Accuracy=0.7132, over 5971.95 frames. ], batch size: 22, lr: 2.65e-03 2024-08-06 21:38:52,170 INFO [trainer.py:765] (0/8) Epoch 32, batch 1900, train_loss[loss=3.056, NarTop10Accuracy=0.7148, over 6102.00 frames. ], tot_loss[loss=3.081, NarTop10Accuracy=0.7096, over 6016.42 frames. ], batch size: 50, lr: 2.65e-03 2024-08-06 21:39:17,769 INFO [trainer.py:765] (0/8) Epoch 32, batch 2000, train_loss[loss=3.476, NarTop10Accuracy=0.6313, over 6147.00 frames. ], tot_loss[loss=3.06, NarTop10Accuracy=0.7137, over 5985.50 frames. ], batch size: 50, lr: 2.65e-03 2024-08-06 21:39:43,178 INFO [trainer.py:765] (0/8) Epoch 32, batch 2100, train_loss[loss=2.798, NarTop10Accuracy=0.7709, over 4752.00 frames. ], tot_loss[loss=3.056, NarTop10Accuracy=0.7144, over 5964.35 frames. ], batch size: 5, lr: 2.65e-03 2024-08-06 21:39:54,783 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 21:40:02,941 INFO [trainer.py:811] (0/8) Epoch 32, validation: loss=2.886, NarTop10Accuracy=0.7482, over 1905321.00 frames. 2024-08-06 21:40:02,942 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 21:40:03,423 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.278e+02 2.449e+02 2.609e+02 8.207e+02, threshold=4.898e+02, percent-clipped=0.3 2024-08-06 21:40:16,628 INFO [trainer.py:765] (0/8) Epoch 32, batch 2200, train_loss[loss=3.078, NarTop10Accuracy=0.7026, over 7434.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.7151, over 6007.45 frames. ], batch size: 31, lr: 2.65e-03 2024-08-06 21:40:41,717 INFO [trainer.py:765] (0/8) Epoch 32, batch 2300, train_loss[loss=3.229, NarTop10Accuracy=0.6701, over 5820.00 frames. ], tot_loss[loss=3.073, NarTop10Accuracy=0.7109, over 6021.42 frames. ], batch size: 9, lr: 2.65e-03 2024-08-06 21:41:06,072 INFO [trainer.py:765] (0/8) Epoch 32, batch 2400, train_loss[loss=3.117, NarTop10Accuracy=0.7051, over 5061.00 frames. ], tot_loss[loss=3.05, NarTop10Accuracy=0.7153, over 5772.31 frames. ], batch size: 7, lr: 2.64e-03 2024-08-06 21:41:29,537 INFO [trainer.py:765] (0/8) Epoch 32, batch 2500, train_loss[loss=2.84, NarTop10Accuracy=0.7588, over 5298.00 frames. ], tot_loss[loss=3.022, NarTop10Accuracy=0.7205, over 5462.96 frames. ], batch size: 7, lr: 2.64e-03 2024-08-06 21:41:50,019 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 21:41:50,022 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-32.pt 2024-08-06 21:42:47,616 INFO [trainer.py:765] (0/8) Epoch 33, batch 100, train_loss[loss=3.137, NarTop10Accuracy=0.7014, over 7383.00 frames. ], tot_loss[loss=3.009, NarTop10Accuracy=0.7244, over 2357.38 frames. ], batch size: 32, lr: 2.60e-03 2024-08-06 21:43:22,368 INFO [trainer.py:765] (0/8) Epoch 33, batch 200, train_loss[loss=2.858, NarTop10Accuracy=0.7561, over 6822.00 frames. ], tot_loss[loss=3.028, NarTop10Accuracy=0.7204, over 3864.17 frames. ], batch size: 17, lr: 2.60e-03 2024-08-06 21:43:56,513 INFO [trainer.py:765] (0/8) Epoch 33, batch 300, train_loss[loss=3.424, NarTop10Accuracy=0.6385, over 7005.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7175, over 4661.23 frames. ], batch size: 22, lr: 2.60e-03 2024-08-06 21:44:30,316 INFO [trainer.py:765] (0/8) Epoch 33, batch 400, train_loss[loss=2.76, NarTop10Accuracy=0.7671, over 5268.00 frames. ], tot_loss[loss=3.037, NarTop10Accuracy=0.7179, over 5099.87 frames. ], batch size: 7, lr: 2.59e-03 2024-08-06 21:45:02,870 INFO [trainer.py:765] (0/8) Epoch 33, batch 500, train_loss[loss=2.708, NarTop10Accuracy=0.7872, over 6069.00 frames. ], tot_loss[loss=3.022, NarTop10Accuracy=0.7212, over 5380.30 frames. ], batch size: 11, lr: 2.59e-03 2024-08-06 21:45:36,226 INFO [trainer.py:765] (0/8) Epoch 33, batch 600, train_loss[loss=3.307, NarTop10Accuracy=0.6578, over 5727.00 frames. ], tot_loss[loss=3.054, NarTop10Accuracy=0.7142, over 5655.40 frames. ], batch size: 9, lr: 2.59e-03 2024-08-06 21:46:11,317 INFO [trainer.py:765] (0/8) Epoch 33, batch 700, train_loss[loss=2.615, NarTop10Accuracy=0.8054, over 5085.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7144, over 5718.21 frames. ], batch size: 6, lr: 2.59e-03 2024-08-06 21:46:46,170 INFO [trainer.py:765] (0/8) Epoch 33, batch 800, train_loss[loss=2.747, NarTop10Accuracy=0.7703, over 5007.00 frames. ], tot_loss[loss=3.052, NarTop10Accuracy=0.715, over 5776.23 frames. ], batch size: 6, lr: 2.59e-03 2024-08-06 21:47:18,908 INFO [trainer.py:765] (0/8) Epoch 33, batch 900, train_loss[loss=3.235, NarTop10Accuracy=0.6805, over 6162.00 frames. ], tot_loss[loss=3.054, NarTop10Accuracy=0.7145, over 5791.07 frames. ], batch size: 13, lr: 2.59e-03 2024-08-06 21:47:57,316 INFO [trainer.py:765] (0/8) Epoch 33, batch 1000, train_loss[loss=3.001, NarTop10Accuracy=0.7309, over 6744.00 frames. ], tot_loss[loss=3.056, NarTop10Accuracy=0.7141, over 5886.47 frames. ], batch size: 14, lr: 2.58e-03 2024-08-06 21:48:30,908 INFO [trainer.py:765] (0/8) Epoch 33, batch 1100, train_loss[loss=2.924, NarTop10Accuracy=0.744, over 6819.00 frames. ], tot_loss[loss=3.072, NarTop10Accuracy=0.7105, over 5917.13 frames. ], batch size: 17, lr: 2.58e-03 2024-08-06 21:49:06,660 INFO [trainer.py:765] (0/8) Epoch 33, batch 1200, train_loss[loss=2.903, NarTop10Accuracy=0.7539, over 7212.00 frames. ], tot_loss[loss=3.062, NarTop10Accuracy=0.7126, over 5940.79 frames. ], batch size: 31, lr: 2.58e-03 2024-08-06 21:49:42,816 INFO [trainer.py:765] (0/8) Epoch 33, batch 1300, train_loss[loss=2.932, NarTop10Accuracy=0.7384, over 4233.00 frames. ], tot_loss[loss=3.056, NarTop10Accuracy=0.7141, over 6002.97 frames. ], batch size: 5, lr: 2.58e-03 2024-08-06 21:50:17,310 INFO [trainer.py:765] (0/8) Epoch 33, batch 1400, train_loss[loss=3.206, NarTop10Accuracy=0.6836, over 6051.00 frames. ], tot_loss[loss=3.058, NarTop10Accuracy=0.7137, over 6036.94 frames. ], batch size: 11, lr: 2.58e-03 2024-08-06 21:50:45,370 INFO [trainer.py:765] (0/8) Epoch 33, batch 1500, train_loss[loss=3.104, NarTop10Accuracy=0.7038, over 5937.00 frames. ], tot_loss[loss=3.058, NarTop10Accuracy=0.7141, over 5968.94 frames. ], batch size: 50, lr: 2.58e-03 2024-08-06 21:51:04,608 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 21:51:12,662 INFO [trainer.py:811] (0/8) Epoch 33, validation: loss=2.938, NarTop10Accuracy=0.7372, over 1905321.00 frames. 2024-08-06 21:51:12,662 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 21:51:13,180 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.834e+02 2.250e+02 2.409e+02 2.586e+02 3.975e+02, threshold=4.818e+02, percent-clipped=0.0 2024-08-06 21:51:21,261 INFO [trainer.py:765] (0/8) Epoch 33, batch 1600, train_loss[loss=3.247, NarTop10Accuracy=0.6744, over 7005.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.7172, over 5933.86 frames. ], batch size: 22, lr: 2.57e-03 2024-08-06 21:51:47,923 INFO [trainer.py:765] (0/8) Epoch 33, batch 1700, train_loss[loss=2.768, NarTop10Accuracy=0.7714, over 6801.00 frames. ], tot_loss[loss=3.054, NarTop10Accuracy=0.7149, over 5921.33 frames. ], batch size: 14, lr: 2.57e-03 2024-08-06 21:52:14,392 INFO [trainer.py:765] (0/8) Epoch 33, batch 1800, train_loss[loss=2.797, NarTop10Accuracy=0.7661, over 7323.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.7168, over 5982.28 frames. ], batch size: 23, lr: 2.57e-03 2024-08-06 21:52:40,856 INFO [trainer.py:765] (0/8) Epoch 33, batch 1900, train_loss[loss=3.413, NarTop10Accuracy=0.6431, over 6222.00 frames. ], tot_loss[loss=3.068, NarTop10Accuracy=0.712, over 6029.31 frames. ], batch size: 50, lr: 2.57e-03 2024-08-06 21:53:06,352 INFO [trainer.py:765] (0/8) Epoch 33, batch 2000, train_loss[loss=3.496, NarTop10Accuracy=0.6251, over 6339.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.717, over 5987.06 frames. ], batch size: 50, lr: 2.57e-03 2024-08-06 21:53:31,659 INFO [trainer.py:765] (0/8) Epoch 33, batch 2100, train_loss[loss=3.242, NarTop10Accuracy=0.67, over 4689.00 frames. ], tot_loss[loss=3.045, NarTop10Accuracy=0.7166, over 5969.80 frames. ], batch size: 5, lr: 2.57e-03 2024-08-06 21:53:56,890 INFO [trainer.py:765] (0/8) Epoch 33, batch 2200, train_loss[loss=3.377, NarTop10Accuracy=0.64, over 7008.00 frames. ], tot_loss[loss=3.051, NarTop10Accuracy=0.7151, over 6010.10 frames. ], batch size: 31, lr: 2.57e-03 2024-08-06 21:54:21,990 INFO [trainer.py:765] (0/8) Epoch 33, batch 2300, train_loss[loss=2.765, NarTop10Accuracy=0.7799, over 5682.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7162, over 6014.22 frames. ], batch size: 9, lr: 2.56e-03 2024-08-06 21:54:46,429 INFO [trainer.py:765] (0/8) Epoch 33, batch 2400, train_loss[loss=2.715, NarTop10Accuracy=0.7864, over 5208.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7176, over 5768.29 frames. ], batch size: 7, lr: 2.56e-03 2024-08-06 21:55:09,862 INFO [trainer.py:765] (0/8) Epoch 33, batch 2500, train_loss[loss=2.533, NarTop10Accuracy=0.8059, over 5685.00 frames. ], tot_loss[loss=3.014, NarTop10Accuracy=0.7225, over 5471.52 frames. ], batch size: 8, lr: 2.56e-03 2024-08-06 21:55:29,882 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 21:55:29,886 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-33.pt 2024-08-06 21:56:24,721 INFO [trainer.py:765] (0/8) Epoch 34, batch 100, train_loss[loss=3.423, NarTop10Accuracy=0.6383, over 7041.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.7185, over 2357.83 frames. ], batch size: 31, lr: 2.52e-03 2024-08-06 21:56:55,613 INFO [trainer.py:765] (0/8) Epoch 34, batch 200, train_loss[loss=3.165, NarTop10Accuracy=0.7001, over 6786.00 frames. ], tot_loss[loss=3.014, NarTop10Accuracy=0.7227, over 3860.53 frames. ], batch size: 17, lr: 2.52e-03 2024-08-06 21:57:31,776 INFO [trainer.py:765] (0/8) Epoch 34, batch 300, train_loss[loss=2.881, NarTop10Accuracy=0.7473, over 7254.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7193, over 4671.56 frames. ], batch size: 22, lr: 2.52e-03 2024-08-06 21:58:02,724 INFO [trainer.py:765] (0/8) Epoch 34, batch 400, train_loss[loss=3.205, NarTop10Accuracy=0.6772, over 5175.00 frames. ], tot_loss[loss=3.017, NarTop10Accuracy=0.7216, over 5111.58 frames. ], batch size: 7, lr: 2.52e-03 2024-08-06 21:58:34,689 INFO [trainer.py:765] (0/8) Epoch 34, batch 500, train_loss[loss=3.257, NarTop10Accuracy=0.6681, over 6114.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7189, over 5402.39 frames. ], batch size: 11, lr: 2.51e-03 2024-08-06 21:59:09,616 INFO [trainer.py:765] (0/8) Epoch 34, batch 600, train_loss[loss=2.895, NarTop10Accuracy=0.7587, over 5742.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.7181, over 5660.05 frames. ], batch size: 9, lr: 2.51e-03 2024-08-06 21:59:46,056 INFO [trainer.py:765] (0/8) Epoch 34, batch 700, train_loss[loss=3.126, NarTop10Accuracy=0.7114, over 5193.00 frames. ], tot_loss[loss=3.039, NarTop10Accuracy=0.7177, over 5718.00 frames. ], batch size: 6, lr: 2.51e-03 2024-08-06 22:00:17,575 INFO [trainer.py:765] (0/8) Epoch 34, batch 800, train_loss[loss=2.986, NarTop10Accuracy=0.7261, over 5184.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7196, over 5778.64 frames. ], batch size: 6, lr: 2.51e-03 2024-08-06 22:00:49,874 INFO [trainer.py:765] (0/8) Epoch 34, batch 900, train_loss[loss=2.84, NarTop10Accuracy=0.7567, over 6195.00 frames. ], tot_loss[loss=3.027, NarTop10Accuracy=0.7196, over 5800.10 frames. ], batch size: 13, lr: 2.51e-03 2024-08-06 22:01:25,338 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 22:01:33,386 INFO [trainer.py:811] (0/8) Epoch 34, validation: loss=2.9, NarTop10Accuracy=0.7444, over 1905321.00 frames. 2024-08-06 22:01:33,387 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 22:01:34,092 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.819e+02 2.259e+02 2.434e+02 2.615e+02 5.125e+02, threshold=4.868e+02, percent-clipped=0.1 2024-08-06 22:01:35,624 INFO [trainer.py:765] (0/8) Epoch 34, batch 1000, train_loss[loss=3.429, NarTop10Accuracy=0.6342, over 6207.00 frames. ], tot_loss[loss=3.037, NarTop10Accuracy=0.7174, over 5902.05 frames. ], batch size: 13, lr: 2.51e-03 2024-08-06 22:02:10,829 INFO [trainer.py:765] (0/8) Epoch 34, batch 1100, train_loss[loss=3.173, NarTop10Accuracy=0.6965, over 6642.00 frames. ], tot_loss[loss=3.042, NarTop10Accuracy=0.7167, over 5925.36 frames. ], batch size: 17, lr: 2.51e-03 2024-08-06 22:02:46,787 INFO [trainer.py:765] (0/8) Epoch 34, batch 1200, train_loss[loss=2.915, NarTop10Accuracy=0.7436, over 7173.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7168, over 5915.35 frames. ], batch size: 31, lr: 2.50e-03 2024-08-06 22:03:20,814 INFO [trainer.py:765] (0/8) Epoch 34, batch 1300, train_loss[loss=2.903, NarTop10Accuracy=0.7466, over 4452.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7165, over 5977.56 frames. ], batch size: 5, lr: 2.50e-03 2024-08-06 22:03:52,950 INFO [trainer.py:765] (0/8) Epoch 34, batch 1400, train_loss[loss=3.294, NarTop10Accuracy=0.6736, over 6033.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7175, over 5998.77 frames. ], batch size: 11, lr: 2.50e-03 2024-08-06 22:04:20,822 INFO [trainer.py:765] (0/8) Epoch 34, batch 1500, train_loss[loss=3.124, NarTop10Accuracy=0.7027, over 6069.00 frames. ], tot_loss[loss=3.039, NarTop10Accuracy=0.7176, over 5945.30 frames. ], batch size: 50, lr: 2.50e-03 2024-08-06 22:04:48,600 INFO [trainer.py:765] (0/8) Epoch 34, batch 1600, train_loss[loss=2.917, NarTop10Accuracy=0.7397, over 7068.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7164, over 5932.89 frames. ], batch size: 22, lr: 2.50e-03 2024-08-06 22:05:15,241 INFO [trainer.py:765] (0/8) Epoch 34, batch 1700, train_loss[loss=3.24, NarTop10Accuracy=0.6832, over 6657.00 frames. ], tot_loss[loss=3.032, NarTop10Accuracy=0.7184, over 5903.10 frames. ], batch size: 14, lr: 2.50e-03 2024-08-06 22:05:41,721 INFO [trainer.py:765] (0/8) Epoch 34, batch 1800, train_loss[loss=3.27, NarTop10Accuracy=0.6698, over 7113.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.7165, over 5975.24 frames. ], batch size: 22, lr: 2.50e-03 2024-08-06 22:06:08,206 INFO [trainer.py:765] (0/8) Epoch 34, batch 1900, train_loss[loss=3.097, NarTop10Accuracy=0.7145, over 6312.00 frames. ], tot_loss[loss=3.065, NarTop10Accuracy=0.7122, over 6029.04 frames. ], batch size: 50, lr: 2.49e-03 2024-08-06 22:06:33,770 INFO [trainer.py:765] (0/8) Epoch 34, batch 2000, train_loss[loss=3.066, NarTop10Accuracy=0.7153, over 6369.00 frames. ], tot_loss[loss=3.051, NarTop10Accuracy=0.7149, over 6023.68 frames. ], batch size: 50, lr: 2.49e-03 2024-08-06 22:06:59,126 INFO [trainer.py:765] (0/8) Epoch 34, batch 2100, train_loss[loss=3.209, NarTop10Accuracy=0.6788, over 4908.00 frames. ], tot_loss[loss=3.069, NarTop10Accuracy=0.7114, over 6002.39 frames. ], batch size: 5, lr: 2.49e-03 2024-08-06 22:07:24,399 INFO [trainer.py:765] (0/8) Epoch 34, batch 2200, train_loss[loss=2.816, NarTop10Accuracy=0.7689, over 7029.00 frames. ], tot_loss[loss=3.067, NarTop10Accuracy=0.7119, over 6029.05 frames. ], batch size: 31, lr: 2.49e-03 2024-08-06 22:07:49,535 INFO [trainer.py:765] (0/8) Epoch 34, batch 2300, train_loss[loss=2.918, NarTop10Accuracy=0.76, over 5727.00 frames. ], tot_loss[loss=3.067, NarTop10Accuracy=0.7123, over 6051.24 frames. ], batch size: 9, lr: 2.49e-03 2024-08-06 22:08:14,059 INFO [trainer.py:765] (0/8) Epoch 34, batch 2400, train_loss[loss=3.412, NarTop10Accuracy=0.6428, over 5088.00 frames. ], tot_loss[loss=3.063, NarTop10Accuracy=0.7127, over 5794.89 frames. ], batch size: 7, lr: 2.49e-03 2024-08-06 22:08:37,648 INFO [trainer.py:765] (0/8) Epoch 34, batch 2500, train_loss[loss=2.675, NarTop10Accuracy=0.7916, over 5097.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7204, over 5495.42 frames. ], batch size: 7, lr: 2.49e-03 2024-08-06 22:08:57,384 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 22:08:57,387 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-34.pt 2024-08-06 22:09:52,639 INFO [trainer.py:765] (0/8) Epoch 35, batch 100, train_loss[loss=2.951, NarTop10Accuracy=0.7332, over 7188.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7149, over 2357.24 frames. ], batch size: 31, lr: 2.45e-03 2024-08-06 22:10:29,697 INFO [trainer.py:765] (0/8) Epoch 35, batch 200, train_loss[loss=3.096, NarTop10Accuracy=0.7112, over 6714.00 frames. ], tot_loss[loss=3.049, NarTop10Accuracy=0.7157, over 3845.27 frames. ], batch size: 17, lr: 2.45e-03 2024-08-06 22:11:04,942 INFO [trainer.py:765] (0/8) Epoch 35, batch 300, train_loss[loss=2.854, NarTop10Accuracy=0.7566, over 7173.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7193, over 4655.41 frames. ], batch size: 22, lr: 2.44e-03 2024-08-06 22:11:35,332 INFO [trainer.py:765] (0/8) Epoch 35, batch 400, train_loss[loss=2.943, NarTop10Accuracy=0.7445, over 5352.00 frames. ], tot_loss[loss=3.028, NarTop10Accuracy=0.7195, over 5119.25 frames. ], batch size: 7, lr: 2.44e-03 2024-08-06 22:11:40,047 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 22:11:48,129 INFO [trainer.py:811] (0/8) Epoch 35, validation: loss=2.84, NarTop10Accuracy=0.7576, over 1905321.00 frames. 2024-08-06 22:11:48,129 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30264MB 2024-08-06 22:11:48,702 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.898e+02 2.275e+02 2.426e+02 2.615e+02 4.095e+02, threshold=4.852e+02, percent-clipped=0.0 2024-08-06 22:12:17,723 INFO [trainer.py:765] (0/8) Epoch 35, batch 500, train_loss[loss=2.743, NarTop10Accuracy=0.776, over 6189.00 frames. ], tot_loss[loss=3.019, NarTop10Accuracy=0.7213, over 5402.06 frames. ], batch size: 11, lr: 2.44e-03 2024-08-06 22:12:51,424 INFO [trainer.py:765] (0/8) Epoch 35, batch 600, train_loss[loss=3.264, NarTop10Accuracy=0.6675, over 5757.00 frames. ], tot_loss[loss=3.036, NarTop10Accuracy=0.718, over 5665.08 frames. ], batch size: 9, lr: 2.44e-03 2024-08-06 22:13:24,940 INFO [trainer.py:765] (0/8) Epoch 35, batch 700, train_loss[loss=2.803, NarTop10Accuracy=0.7694, over 5067.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7175, over 5727.61 frames. ], batch size: 6, lr: 2.44e-03 2024-08-06 22:14:01,383 INFO [trainer.py:765] (0/8) Epoch 35, batch 800, train_loss[loss=2.734, NarTop10Accuracy=0.7795, over 5055.00 frames. ], tot_loss[loss=3.045, NarTop10Accuracy=0.7164, over 5775.70 frames. ], batch size: 6, lr: 2.44e-03 2024-08-06 22:14:34,372 INFO [trainer.py:765] (0/8) Epoch 35, batch 900, train_loss[loss=3.229, NarTop10Accuracy=0.6791, over 6366.00 frames. ], tot_loss[loss=3.034, NarTop10Accuracy=0.7186, over 5787.84 frames. ], batch size: 13, lr: 2.44e-03 2024-08-06 22:15:09,372 INFO [trainer.py:765] (0/8) Epoch 35, batch 1000, train_loss[loss=2.907, NarTop10Accuracy=0.7482, over 6585.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7174, over 5883.15 frames. ], batch size: 14, lr: 2.43e-03 2024-08-06 22:15:48,495 INFO [trainer.py:765] (0/8) Epoch 35, batch 1100, train_loss[loss=2.987, NarTop10Accuracy=0.7304, over 6903.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7158, over 5915.31 frames. ], batch size: 17, lr: 2.43e-03 2024-08-06 22:16:22,483 INFO [trainer.py:765] (0/8) Epoch 35, batch 1200, train_loss[loss=2.96, NarTop10Accuracy=0.7404, over 7242.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7191, over 5902.19 frames. ], batch size: 31, lr: 2.43e-03 2024-08-06 22:16:57,060 INFO [trainer.py:765] (0/8) Epoch 35, batch 1300, train_loss[loss=2.947, NarTop10Accuracy=0.7462, over 5082.00 frames. ], tot_loss[loss=3.023, NarTop10Accuracy=0.7211, over 5973.27 frames. ], batch size: 6, lr: 2.43e-03 2024-08-06 22:17:31,060 INFO [trainer.py:765] (0/8) Epoch 35, batch 1400, train_loss[loss=3.072, NarTop10Accuracy=0.6995, over 6093.00 frames. ], tot_loss[loss=3.033, NarTop10Accuracy=0.7187, over 5991.44 frames. ], batch size: 11, lr: 2.43e-03 2024-08-06 22:18:03,062 INFO [trainer.py:765] (0/8) Epoch 35, batch 1500, train_loss[loss=3.034, NarTop10Accuracy=0.7128, over 6102.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7174, over 5971.20 frames. ], batch size: 50, lr: 2.43e-03 2024-08-06 22:18:30,728 INFO [trainer.py:765] (0/8) Epoch 35, batch 1600, train_loss[loss=2.938, NarTop10Accuracy=0.7451, over 7209.00 frames. ], tot_loss[loss=3.049, NarTop10Accuracy=0.7154, over 5959.90 frames. ], batch size: 22, lr: 2.43e-03 2024-08-06 22:18:57,319 INFO [trainer.py:765] (0/8) Epoch 35, batch 1700, train_loss[loss=2.769, NarTop10Accuracy=0.7678, over 6135.00 frames. ], tot_loss[loss=3.055, NarTop10Accuracy=0.7141, over 5939.25 frames. ], batch size: 13, lr: 2.42e-03 2024-08-06 22:19:23,702 INFO [trainer.py:765] (0/8) Epoch 35, batch 1800, train_loss[loss=3.456, NarTop10Accuracy=0.6273, over 7035.00 frames. ], tot_loss[loss=3.05, NarTop10Accuracy=0.7153, over 6009.46 frames. ], batch size: 22, lr: 2.42e-03 2024-08-06 22:19:50,201 INFO [trainer.py:765] (0/8) Epoch 35, batch 1900, train_loss[loss=3.192, NarTop10Accuracy=0.6897, over 5937.00 frames. ], tot_loss[loss=3.057, NarTop10Accuracy=0.714, over 6027.41 frames. ], batch size: 50, lr: 2.42e-03 2024-08-06 22:20:15,762 INFO [trainer.py:765] (0/8) Epoch 35, batch 2000, train_loss[loss=3.102, NarTop10Accuracy=0.7097, over 6543.00 frames. ], tot_loss[loss=3.048, NarTop10Accuracy=0.7157, over 6000.15 frames. ], batch size: 50, lr: 2.42e-03 2024-08-06 22:20:41,045 INFO [trainer.py:765] (0/8) Epoch 35, batch 2100, train_loss[loss=2.749, NarTop10Accuracy=0.7863, over 4695.00 frames. ], tot_loss[loss=3.047, NarTop10Accuracy=0.7159, over 5966.40 frames. ], batch size: 5, lr: 2.42e-03 2024-08-06 22:21:06,226 INFO [trainer.py:765] (0/8) Epoch 35, batch 2200, train_loss[loss=2.858, NarTop10Accuracy=0.7622, over 7515.00 frames. ], tot_loss[loss=3.051, NarTop10Accuracy=0.7152, over 6011.35 frames. ], batch size: 33, lr: 2.42e-03 2024-08-06 22:21:31,286 INFO [trainer.py:765] (0/8) Epoch 35, batch 2300, train_loss[loss=2.861, NarTop10Accuracy=0.7468, over 5742.00 frames. ], tot_loss[loss=3.049, NarTop10Accuracy=0.7157, over 6025.43 frames. ], batch size: 9, lr: 2.42e-03 2024-08-06 22:21:55,648 INFO [trainer.py:765] (0/8) Epoch 35, batch 2400, train_loss[loss=3.239, NarTop10Accuracy=0.6721, over 5223.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7166, over 5764.81 frames. ], batch size: 7, lr: 2.42e-03 2024-08-06 22:21:59,680 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 22:22:07,656 INFO [trainer.py:811] (0/8) Epoch 35, validation: loss=2.905, NarTop10Accuracy=0.7437, over 1905321.00 frames. 2024-08-06 22:22:07,657 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 22:22:08,115 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.895e+02 2.316e+02 2.462e+02 2.653e+02 5.566e+02, threshold=4.923e+02, percent-clipped=0.1 2024-08-06 22:22:27,127 INFO [trainer.py:765] (0/8) Epoch 35, batch 2500, train_loss[loss=3.184, NarTop10Accuracy=0.685, over 5118.00 frames. ], tot_loss[loss=3.022, NarTop10Accuracy=0.7203, over 5478.79 frames. ], batch size: 7, lr: 2.41e-03 2024-08-06 22:22:46,663 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 22:22:46,666 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-35.pt 2024-08-06 22:23:47,171 INFO [trainer.py:765] (0/8) Epoch 36, batch 100, train_loss[loss=3.252, NarTop10Accuracy=0.6746, over 7521.00 frames. ], tot_loss[loss=2.997, NarTop10Accuracy=0.7266, over 2354.48 frames. ], batch size: 33, lr: 2.38e-03 2024-08-06 22:24:22,494 INFO [trainer.py:765] (0/8) Epoch 36, batch 200, train_loss[loss=2.998, NarTop10Accuracy=0.7299, over 6846.00 frames. ], tot_loss[loss=3.026, NarTop10Accuracy=0.7204, over 3845.61 frames. ], batch size: 17, lr: 2.38e-03 2024-08-06 22:24:54,720 INFO [trainer.py:765] (0/8) Epoch 36, batch 300, train_loss[loss=3.267, NarTop10Accuracy=0.6597, over 7428.00 frames. ], tot_loss[loss=3.036, NarTop10Accuracy=0.7186, over 4643.41 frames. ], batch size: 23, lr: 2.37e-03 2024-08-06 22:25:29,275 INFO [trainer.py:765] (0/8) Epoch 36, batch 400, train_loss[loss=2.964, NarTop10Accuracy=0.733, over 5010.00 frames. ], tot_loss[loss=3.021, NarTop10Accuracy=0.7219, over 5095.36 frames. ], batch size: 7, lr: 2.37e-03 2024-08-06 22:26:01,818 INFO [trainer.py:765] (0/8) Epoch 36, batch 500, train_loss[loss=3.242, NarTop10Accuracy=0.6735, over 6114.00 frames. ], tot_loss[loss=3.02, NarTop10Accuracy=0.7215, over 5369.33 frames. ], batch size: 11, lr: 2.37e-03 2024-08-06 22:26:35,025 INFO [trainer.py:765] (0/8) Epoch 36, batch 600, train_loss[loss=2.986, NarTop10Accuracy=0.7288, over 5772.00 frames. ], tot_loss[loss=3.02, NarTop10Accuracy=0.7217, over 5649.62 frames. ], batch size: 9, lr: 2.37e-03 2024-08-06 22:27:10,990 INFO [trainer.py:765] (0/8) Epoch 36, batch 700, train_loss[loss=3.288, NarTop10Accuracy=0.6613, over 5058.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.722, over 5711.49 frames. ], batch size: 6, lr: 2.37e-03 2024-08-06 22:27:44,914 INFO [trainer.py:765] (0/8) Epoch 36, batch 800, train_loss[loss=3.369, NarTop10Accuracy=0.6516, over 5157.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.7184, over 5755.67 frames. ], batch size: 6, lr: 2.37e-03 2024-08-06 22:28:17,811 INFO [trainer.py:765] (0/8) Epoch 36, batch 900, train_loss[loss=2.794, NarTop10Accuracy=0.7714, over 6612.00 frames. ], tot_loss[loss=3.023, NarTop10Accuracy=0.721, over 5797.24 frames. ], batch size: 14, lr: 2.37e-03 2024-08-06 22:28:56,983 INFO [trainer.py:765] (0/8) Epoch 36, batch 1000, train_loss[loss=3.364, NarTop10Accuracy=0.6556, over 6609.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7208, over 5916.29 frames. ], batch size: 14, lr: 2.37e-03 2024-08-06 22:29:29,364 INFO [trainer.py:765] (0/8) Epoch 36, batch 1100, train_loss[loss=2.8, NarTop10Accuracy=0.774, over 6861.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7193, over 5940.63 frames. ], batch size: 17, lr: 2.36e-03 2024-08-06 22:30:05,680 INFO [trainer.py:765] (0/8) Epoch 36, batch 1200, train_loss[loss=3.103, NarTop10Accuracy=0.7028, over 6993.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7203, over 5935.40 frames. ], batch size: 31, lr: 2.36e-03 2024-08-06 22:30:42,575 INFO [trainer.py:765] (0/8) Epoch 36, batch 1300, train_loss[loss=2.951, NarTop10Accuracy=0.7383, over 4323.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7188, over 5995.39 frames. ], batch size: 5, lr: 2.36e-03 2024-08-06 22:31:15,938 INFO [trainer.py:765] (0/8) Epoch 36, batch 1400, train_loss[loss=3.23, NarTop10Accuracy=0.6899, over 6072.00 frames. ], tot_loss[loss=3.019, NarTop10Accuracy=0.7214, over 6015.57 frames. ], batch size: 11, lr: 2.36e-03 2024-08-06 22:31:43,748 INFO [trainer.py:765] (0/8) Epoch 36, batch 1500, train_loss[loss=3.454, NarTop10Accuracy=0.6235, over 6333.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.7218, over 5956.44 frames. ], batch size: 50, lr: 2.36e-03 2024-08-06 22:32:11,459 INFO [trainer.py:765] (0/8) Epoch 36, batch 1600, train_loss[loss=3.425, NarTop10Accuracy=0.6343, over 7044.00 frames. ], tot_loss[loss=3.021, NarTop10Accuracy=0.721, over 5946.79 frames. ], batch size: 22, lr: 2.36e-03 2024-08-06 22:32:38,108 INFO [trainer.py:765] (0/8) Epoch 36, batch 1700, train_loss[loss=3.395, NarTop10Accuracy=0.6444, over 6201.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7194, over 5925.71 frames. ], batch size: 13, lr: 2.36e-03 2024-08-06 22:33:04,554 INFO [trainer.py:765] (0/8) Epoch 36, batch 1800, train_loss[loss=3.112, NarTop10Accuracy=0.6956, over 6993.00 frames. ], tot_loss[loss=3.033, NarTop10Accuracy=0.7189, over 5976.36 frames. ], batch size: 22, lr: 2.36e-03 2024-08-06 22:33:15,169 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 22:33:23,567 INFO [trainer.py:811] (0/8) Epoch 36, validation: loss=2.897, NarTop10Accuracy=0.7457, over 1905321.00 frames. 2024-08-06 22:33:23,568 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 22:33:24,096 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.876e+02 2.309e+02 2.476e+02 2.664e+02 4.811e+02, threshold=4.951e+02, percent-clipped=0.0 2024-08-06 22:33:39,456 INFO [trainer.py:765] (0/8) Epoch 36, batch 1900, train_loss[loss=2.982, NarTop10Accuracy=0.7291, over 6366.00 frames. ], tot_loss[loss=3.038, NarTop10Accuracy=0.7181, over 6012.50 frames. ], batch size: 50, lr: 2.35e-03 2024-08-06 22:34:05,077 INFO [trainer.py:765] (0/8) Epoch 36, batch 2000, train_loss[loss=3.241, NarTop10Accuracy=0.6798, over 6261.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7195, over 5991.45 frames. ], batch size: 51, lr: 2.35e-03 2024-08-06 22:34:30,514 INFO [trainer.py:765] (0/8) Epoch 36, batch 2100, train_loss[loss=2.767, NarTop10Accuracy=0.7699, over 4893.00 frames. ], tot_loss[loss=3.026, NarTop10Accuracy=0.7206, over 5990.26 frames. ], batch size: 5, lr: 2.35e-03 2024-08-06 22:34:55,938 INFO [trainer.py:765] (0/8) Epoch 36, batch 2200, train_loss[loss=3.327, NarTop10Accuracy=0.654, over 7185.00 frames. ], tot_loss[loss=3.046, NarTop10Accuracy=0.7162, over 6029.57 frames. ], batch size: 31, lr: 2.35e-03 2024-08-06 22:35:21,146 INFO [trainer.py:765] (0/8) Epoch 36, batch 2300, train_loss[loss=3.461, NarTop10Accuracy=0.6395, over 5634.00 frames. ], tot_loss[loss=3.057, NarTop10Accuracy=0.7138, over 6038.87 frames. ], batch size: 9, lr: 2.35e-03 2024-08-06 22:35:45,601 INFO [trainer.py:765] (0/8) Epoch 36, batch 2400, train_loss[loss=3.235, NarTop10Accuracy=0.6767, over 5166.00 frames. ], tot_loss[loss=3.045, NarTop10Accuracy=0.716, over 5792.30 frames. ], batch size: 7, lr: 2.35e-03 2024-08-06 22:36:09,182 INFO [trainer.py:765] (0/8) Epoch 36, batch 2500, train_loss[loss=2.779, NarTop10Accuracy=0.7636, over 5145.00 frames. ], tot_loss[loss=3.027, NarTop10Accuracy=0.7195, over 5485.34 frames. ], batch size: 7, lr: 2.35e-03 2024-08-06 22:36:29,315 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 22:36:29,318 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-36.pt 2024-08-06 22:37:29,727 INFO [trainer.py:765] (0/8) Epoch 37, batch 100, train_loss[loss=2.78, NarTop10Accuracy=0.7697, over 7221.00 frames. ], tot_loss[loss=3.039, NarTop10Accuracy=0.7174, over 2350.44 frames. ], batch size: 31, lr: 2.31e-03 2024-08-06 22:38:01,273 INFO [trainer.py:765] (0/8) Epoch 37, batch 200, train_loss[loss=2.687, NarTop10Accuracy=0.7829, over 6858.00 frames. ], tot_loss[loss=3.021, NarTop10Accuracy=0.7211, over 3860.82 frames. ], batch size: 17, lr: 2.31e-03 2024-08-06 22:38:35,958 INFO [trainer.py:765] (0/8) Epoch 37, batch 300, train_loss[loss=3.213, NarTop10Accuracy=0.6845, over 7032.00 frames. ], tot_loss[loss=3.013, NarTop10Accuracy=0.7226, over 4654.83 frames. ], batch size: 22, lr: 2.31e-03 2024-08-06 22:39:09,308 INFO [trainer.py:765] (0/8) Epoch 37, batch 400, train_loss[loss=2.679, NarTop10Accuracy=0.7919, over 5025.00 frames. ], tot_loss[loss=3.003, NarTop10Accuracy=0.7247, over 5120.46 frames. ], batch size: 7, lr: 2.31e-03 2024-08-06 22:39:43,862 INFO [trainer.py:765] (0/8) Epoch 37, batch 500, train_loss[loss=3.388, NarTop10Accuracy=0.6394, over 6054.00 frames. ], tot_loss[loss=3, NarTop10Accuracy=0.7249, over 5391.90 frames. ], batch size: 11, lr: 2.31e-03 2024-08-06 22:40:17,335 INFO [trainer.py:765] (0/8) Epoch 37, batch 600, train_loss[loss=2.735, NarTop10Accuracy=0.7772, over 5790.00 frames. ], tot_loss[loss=3.01, NarTop10Accuracy=0.723, over 5649.52 frames. ], batch size: 9, lr: 2.31e-03 2024-08-06 22:40:51,618 INFO [trainer.py:765] (0/8) Epoch 37, batch 700, train_loss[loss=3.144, NarTop10Accuracy=0.702, over 4293.00 frames. ], tot_loss[loss=3.034, NarTop10Accuracy=0.7186, over 5733.19 frames. ], batch size: 5, lr: 2.30e-03 2024-08-06 22:41:30,566 INFO [trainer.py:765] (0/8) Epoch 37, batch 800, train_loss[loss=2.811, NarTop10Accuracy=0.7619, over 4956.00 frames. ], tot_loss[loss=3.04, NarTop10Accuracy=0.7171, over 5778.52 frames. ], batch size: 6, lr: 2.30e-03 2024-08-06 22:41:59,085 INFO [trainer.py:765] (0/8) Epoch 37, batch 900, train_loss[loss=2.905, NarTop10Accuracy=0.75, over 6162.00 frames. ], tot_loss[loss=3.025, NarTop10Accuracy=0.7202, over 5804.61 frames. ], batch size: 13, lr: 2.30e-03 2024-08-06 22:42:38,268 INFO [trainer.py:765] (0/8) Epoch 37, batch 1000, train_loss[loss=3.279, NarTop10Accuracy=0.6652, over 6561.00 frames. ], tot_loss[loss=3.042, NarTop10Accuracy=0.7165, over 5892.00 frames. ], batch size: 14, lr: 2.30e-03 2024-08-06 22:43:15,907 INFO [trainer.py:765] (0/8) Epoch 37, batch 1100, train_loss[loss=3.076, NarTop10Accuracy=0.7112, over 6969.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.7166, over 5940.44 frames. ], batch size: 17, lr: 2.30e-03 2024-08-06 22:43:47,740 INFO [trainer.py:765] (0/8) Epoch 37, batch 1200, train_loss[loss=2.906, NarTop10Accuracy=0.7434, over 7488.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7164, over 5939.64 frames. ], batch size: 31, lr: 2.30e-03 2024-08-06 22:44:11,755 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 22:44:20,075 INFO [trainer.py:811] (0/8) Epoch 37, validation: loss=2.92, NarTop10Accuracy=0.7407, over 1905321.00 frames. 2024-08-06 22:44:20,076 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 22:44:20,606 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.887e+02 2.309e+02 2.481e+02 2.647e+02 8.766e+02, threshold=4.961e+02, percent-clipped=0.1 2024-08-06 22:44:32,785 INFO [trainer.py:765] (0/8) Epoch 37, batch 1300, train_loss[loss=2.863, NarTop10Accuracy=0.7552, over 5100.00 frames. ], tot_loss[loss=3.025, NarTop10Accuracy=0.7203, over 5999.56 frames. ], batch size: 6, lr: 2.30e-03 2024-08-06 22:45:10,388 INFO [trainer.py:765] (0/8) Epoch 37, batch 1400, train_loss[loss=2.72, NarTop10Accuracy=0.7755, over 6468.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.7216, over 6016.13 frames. ], batch size: 12, lr: 2.30e-03 2024-08-06 22:45:40,512 INFO [trainer.py:765] (0/8) Epoch 37, batch 1500, train_loss[loss=2.947, NarTop10Accuracy=0.7393, over 6216.00 frames. ], tot_loss[loss=3.032, NarTop10Accuracy=0.7188, over 5949.66 frames. ], batch size: 50, lr: 2.29e-03 2024-08-06 22:46:08,438 INFO [trainer.py:765] (0/8) Epoch 37, batch 1600, train_loss[loss=3.359, NarTop10Accuracy=0.653, over 7104.00 frames. ], tot_loss[loss=3.036, NarTop10Accuracy=0.7178, over 5929.61 frames. ], batch size: 22, lr: 2.29e-03 2024-08-06 22:46:35,187 INFO [trainer.py:765] (0/8) Epoch 37, batch 1700, train_loss[loss=3.473, NarTop10Accuracy=0.63, over 6207.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.718, over 5916.42 frames. ], batch size: 13, lr: 2.29e-03 2024-08-06 22:47:01,793 INFO [trainer.py:765] (0/8) Epoch 37, batch 1800, train_loss[loss=2.803, NarTop10Accuracy=0.7608, over 7101.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.7181, over 5978.77 frames. ], batch size: 22, lr: 2.29e-03 2024-08-06 22:47:28,312 INFO [trainer.py:765] (0/8) Epoch 37, batch 1900, train_loss[loss=3.176, NarTop10Accuracy=0.6877, over 6186.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.7173, over 6020.79 frames. ], batch size: 51, lr: 2.29e-03 2024-08-06 22:47:53,925 INFO [trainer.py:765] (0/8) Epoch 37, batch 2000, train_loss[loss=3.102, NarTop10Accuracy=0.7024, over 6123.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.7184, over 5999.03 frames. ], batch size: 50, lr: 2.29e-03 2024-08-06 22:48:19,326 INFO [trainer.py:765] (0/8) Epoch 37, batch 2100, train_loss[loss=2.893, NarTop10Accuracy=0.7338, over 3894.00 frames. ], tot_loss[loss=3.035, NarTop10Accuracy=0.7181, over 5967.60 frames. ], batch size: 4, lr: 2.29e-03 2024-08-06 22:48:44,708 INFO [trainer.py:765] (0/8) Epoch 37, batch 2200, train_loss[loss=2.919, NarTop10Accuracy=0.7389, over 7311.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.7171, over 6001.26 frames. ], batch size: 32, lr: 2.29e-03 2024-08-06 22:49:09,913 INFO [trainer.py:765] (0/8) Epoch 37, batch 2300, train_loss[loss=2.755, NarTop10Accuracy=0.7755, over 5577.00 frames. ], tot_loss[loss=3.043, NarTop10Accuracy=0.7166, over 6007.28 frames. ], batch size: 9, lr: 2.29e-03 2024-08-06 22:49:34,318 INFO [trainer.py:765] (0/8) Epoch 37, batch 2400, train_loss[loss=3.133, NarTop10Accuracy=0.6955, over 5229.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7203, over 5762.43 frames. ], batch size: 7, lr: 2.28e-03 2024-08-06 22:49:57,861 INFO [trainer.py:765] (0/8) Epoch 37, batch 2500, train_loss[loss=3.159, NarTop10Accuracy=0.6838, over 5013.00 frames. ], tot_loss[loss=2.99, NarTop10Accuracy=0.727, over 5454.12 frames. ], batch size: 7, lr: 2.28e-03 2024-08-06 22:50:18,222 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 22:50:18,227 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-37.pt 2024-08-06 22:51:16,151 INFO [trainer.py:765] (0/8) Epoch 38, batch 100, train_loss[loss=3.123, NarTop10Accuracy=0.699, over 7104.00 frames. ], tot_loss[loss=3.029, NarTop10Accuracy=0.7205, over 2378.19 frames. ], batch size: 31, lr: 2.25e-03 2024-08-06 22:51:53,014 INFO [trainer.py:765] (0/8) Epoch 38, batch 200, train_loss[loss=3.293, NarTop10Accuracy=0.6651, over 6765.00 frames. ], tot_loss[loss=3.026, NarTop10Accuracy=0.72, over 3849.23 frames. ], batch size: 17, lr: 2.25e-03 2024-08-06 22:52:25,202 INFO [trainer.py:765] (0/8) Epoch 38, batch 300, train_loss[loss=2.911, NarTop10Accuracy=0.7498, over 7146.00 frames. ], tot_loss[loss=3.038, NarTop10Accuracy=0.7174, over 4652.97 frames. ], batch size: 22, lr: 2.25e-03 2024-08-06 22:52:55,626 INFO [trainer.py:765] (0/8) Epoch 38, batch 400, train_loss[loss=3.239, NarTop10Accuracy=0.6696, over 5247.00 frames. ], tot_loss[loss=3.02, NarTop10Accuracy=0.7208, over 5089.75 frames. ], batch size: 7, lr: 2.25e-03 2024-08-06 22:53:32,228 INFO [trainer.py:765] (0/8) Epoch 38, batch 500, train_loss[loss=2.744, NarTop10Accuracy=0.7815, over 6165.00 frames. ], tot_loss[loss=2.986, NarTop10Accuracy=0.7277, over 5359.36 frames. ], batch size: 11, lr: 2.25e-03 2024-08-06 22:54:05,497 INFO [trainer.py:765] (0/8) Epoch 38, batch 600, train_loss[loss=3.135, NarTop10Accuracy=0.7017, over 5562.00 frames. ], tot_loss[loss=3.001, NarTop10Accuracy=0.7248, over 5641.47 frames. ], batch size: 9, lr: 2.24e-03 2024-08-06 22:54:36,002 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 22:54:43,918 INFO [trainer.py:811] (0/8) Epoch 38, validation: loss=2.939, NarTop10Accuracy=0.7369, over 1905321.00 frames. 2024-08-06 22:54:43,919 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 22:54:44,427 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.880e+02 2.313e+02 2.478e+02 2.663e+02 7.254e+02, threshold=4.957e+02, percent-clipped=0.3 2024-08-06 22:54:46,658 INFO [trainer.py:765] (0/8) Epoch 38, batch 700, train_loss[loss=2.777, NarTop10Accuracy=0.765, over 5193.00 frames. ], tot_loss[loss=3.004, NarTop10Accuracy=0.7243, over 5730.92 frames. ], batch size: 6, lr: 2.24e-03 2024-08-06 22:55:24,937 INFO [trainer.py:765] (0/8) Epoch 38, batch 800, train_loss[loss=3.06, NarTop10Accuracy=0.7129, over 4221.00 frames. ], tot_loss[loss=3.022, NarTop10Accuracy=0.7206, over 5759.80 frames. ], batch size: 5, lr: 2.24e-03 2024-08-06 22:55:59,702 INFO [trainer.py:765] (0/8) Epoch 38, batch 900, train_loss[loss=2.898, NarTop10Accuracy=0.7474, over 6255.00 frames. ], tot_loss[loss=3.016, NarTop10Accuracy=0.7218, over 5794.93 frames. ], batch size: 13, lr: 2.24e-03 2024-08-06 22:56:32,090 INFO [trainer.py:765] (0/8) Epoch 38, batch 1000, train_loss[loss=3.243, NarTop10Accuracy=0.6749, over 6093.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.7213, over 5920.90 frames. ], batch size: 13, lr: 2.24e-03 2024-08-06 22:57:08,990 INFO [trainer.py:765] (0/8) Epoch 38, batch 1100, train_loss[loss=3.215, NarTop10Accuracy=0.6852, over 6822.00 frames. ], tot_loss[loss=3.038, NarTop10Accuracy=0.7172, over 5937.97 frames. ], batch size: 17, lr: 2.24e-03 2024-08-06 22:57:42,661 INFO [trainer.py:765] (0/8) Epoch 38, batch 1200, train_loss[loss=2.826, NarTop10Accuracy=0.7552, over 7251.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.7169, over 5936.21 frames. ], batch size: 32, lr: 2.24e-03 2024-08-06 22:58:16,545 INFO [trainer.py:765] (0/8) Epoch 38, batch 1300, train_loss[loss=3.38, NarTop10Accuracy=0.6484, over 5181.00 frames. ], tot_loss[loss=3.045, NarTop10Accuracy=0.7162, over 5993.46 frames. ], batch size: 6, lr: 2.24e-03 2024-08-06 22:58:49,810 INFO [trainer.py:765] (0/8) Epoch 38, batch 1400, train_loss[loss=2.938, NarTop10Accuracy=0.7425, over 5955.00 frames. ], tot_loss[loss=3.061, NarTop10Accuracy=0.7129, over 6016.31 frames. ], batch size: 11, lr: 2.23e-03 2024-08-06 22:59:22,854 INFO [trainer.py:765] (0/8) Epoch 38, batch 1500, train_loss[loss=3.513, NarTop10Accuracy=0.6256, over 6171.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7164, over 5955.23 frames. ], batch size: 50, lr: 2.23e-03 2024-08-06 22:59:50,643 INFO [trainer.py:765] (0/8) Epoch 38, batch 1600, train_loss[loss=3.358, NarTop10Accuracy=0.6577, over 7047.00 frames. ], tot_loss[loss=3.041, NarTop10Accuracy=0.717, over 5935.26 frames. ], batch size: 22, lr: 2.23e-03 2024-08-06 23:00:17,314 INFO [trainer.py:765] (0/8) Epoch 38, batch 1700, train_loss[loss=3.08, NarTop10Accuracy=0.7179, over 6702.00 frames. ], tot_loss[loss=3.06, NarTop10Accuracy=0.7132, over 5925.47 frames. ], batch size: 14, lr: 2.23e-03 2024-08-06 23:00:43,763 INFO [trainer.py:765] (0/8) Epoch 38, batch 1800, train_loss[loss=3.177, NarTop10Accuracy=0.6918, over 6960.00 frames. ], tot_loss[loss=3.051, NarTop10Accuracy=0.7147, over 5974.34 frames. ], batch size: 22, lr: 2.23e-03 2024-08-06 23:01:10,191 INFO [trainer.py:765] (0/8) Epoch 38, batch 1900, train_loss[loss=3.53, NarTop10Accuracy=0.6258, over 5526.00 frames. ], tot_loss[loss=3.058, NarTop10Accuracy=0.7134, over 6006.09 frames. ], batch size: 50, lr: 2.23e-03 2024-08-06 23:01:35,681 INFO [trainer.py:765] (0/8) Epoch 38, batch 2000, train_loss[loss=3.324, NarTop10Accuracy=0.6584, over 6408.00 frames. ], tot_loss[loss=3.053, NarTop10Accuracy=0.7139, over 5990.88 frames. ], batch size: 53, lr: 2.23e-03 2024-08-06 23:02:01,050 INFO [trainer.py:765] (0/8) Epoch 38, batch 2100, train_loss[loss=2.968, NarTop10Accuracy=0.7213, over 4068.00 frames. ], tot_loss[loss=3.039, NarTop10Accuracy=0.717, over 5963.07 frames. ], batch size: 4, lr: 2.23e-03 2024-08-06 23:02:26,314 INFO [trainer.py:765] (0/8) Epoch 38, batch 2200, train_loss[loss=2.795, NarTop10Accuracy=0.7693, over 7173.00 frames. ], tot_loss[loss=3.034, NarTop10Accuracy=0.7183, over 5996.84 frames. ], batch size: 31, lr: 2.23e-03 2024-08-06 23:02:51,419 INFO [trainer.py:765] (0/8) Epoch 38, batch 2300, train_loss[loss=2.701, NarTop10Accuracy=0.7893, over 5748.00 frames. ], tot_loss[loss=3.038, NarTop10Accuracy=0.7177, over 6015.64 frames. ], batch size: 9, lr: 2.22e-03 2024-08-06 23:03:16,347 INFO [trainer.py:765] (0/8) Epoch 38, batch 2400, train_loss[loss=2.751, NarTop10Accuracy=0.7736, over 5118.00 frames. ], tot_loss[loss=3.027, NarTop10Accuracy=0.7199, over 5778.36 frames. ], batch size: 7, lr: 2.22e-03 2024-08-06 23:03:39,824 INFO [trainer.py:765] (0/8) Epoch 38, batch 2500, train_loss[loss=3.364, NarTop10Accuracy=0.641, over 5094.00 frames. ], tot_loss[loss=3.008, NarTop10Accuracy=0.7234, over 5487.70 frames. ], batch size: 7, lr: 2.22e-03 2024-08-06 23:03:59,636 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 23:03:59,638 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-38.pt 2024-08-06 23:04:58,941 INFO [trainer.py:765] (0/8) Epoch 39, batch 100, train_loss[loss=3.322, NarTop10Accuracy=0.6577, over 7263.00 frames. ], tot_loss[loss=2.993, NarTop10Accuracy=0.7274, over 2366.12 frames. ], batch size: 31, lr: 2.19e-03 2024-08-06 23:05:03,469 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 23:05:11,563 INFO [trainer.py:811] (0/8) Epoch 39, validation: loss=2.9, NarTop10Accuracy=0.7445, over 1905321.00 frames. 2024-08-06 23:05:11,564 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 23:05:12,137 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.316e+02 2.500e+02 2.688e+02 4.683e+02, threshold=5.001e+02, percent-clipped=0.0 2024-08-06 23:05:40,163 INFO [trainer.py:765] (0/8) Epoch 39, batch 200, train_loss[loss=2.758, NarTop10Accuracy=0.7693, over 6672.00 frames. ], tot_loss[loss=2.997, NarTop10Accuracy=0.7266, over 3845.01 frames. ], batch size: 17, lr: 2.19e-03 2024-08-06 23:06:17,293 INFO [trainer.py:765] (0/8) Epoch 39, batch 300, train_loss[loss=3.016, NarTop10Accuracy=0.7265, over 7113.00 frames. ], tot_loss[loss=2.991, NarTop10Accuracy=0.7275, over 4651.05 frames. ], batch size: 22, lr: 2.19e-03 2024-08-06 23:06:48,276 INFO [trainer.py:765] (0/8) Epoch 39, batch 400, train_loss[loss=2.899, NarTop10Accuracy=0.7433, over 5766.00 frames. ], tot_loss[loss=2.987, NarTop10Accuracy=0.7285, over 5101.46 frames. ], batch size: 8, lr: 2.19e-03 2024-08-06 23:07:19,175 INFO [trainer.py:765] (0/8) Epoch 39, batch 500, train_loss[loss=3.358, NarTop10Accuracy=0.6523, over 6054.00 frames. ], tot_loss[loss=2.994, NarTop10Accuracy=0.7265, over 5384.28 frames. ], batch size: 11, lr: 2.19e-03 2024-08-06 23:07:52,563 INFO [trainer.py:765] (0/8) Epoch 39, batch 600, train_loss[loss=2.662, NarTop10Accuracy=0.7865, over 5790.00 frames. ], tot_loss[loss=3.014, NarTop10Accuracy=0.7225, over 5649.98 frames. ], batch size: 9, lr: 2.19e-03 2024-08-06 23:08:33,695 INFO [trainer.py:765] (0/8) Epoch 39, batch 700, train_loss[loss=3.172, NarTop10Accuracy=0.6775, over 4263.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7204, over 5728.22 frames. ], batch size: 5, lr: 2.18e-03 2024-08-06 23:09:05,861 INFO [trainer.py:765] (0/8) Epoch 39, batch 800, train_loss[loss=2.719, NarTop10Accuracy=0.7768, over 5073.00 frames. ], tot_loss[loss=3.023, NarTop10Accuracy=0.7205, over 5794.26 frames. ], batch size: 6, lr: 2.18e-03 2024-08-06 23:09:38,866 INFO [trainer.py:765] (0/8) Epoch 39, batch 900, train_loss[loss=3.362, NarTop10Accuracy=0.6546, over 6588.00 frames. ], tot_loss[loss=3.015, NarTop10Accuracy=0.7221, over 5813.35 frames. ], batch size: 14, lr: 2.18e-03 2024-08-06 23:10:18,460 INFO [trainer.py:765] (0/8) Epoch 39, batch 1000, train_loss[loss=2.785, NarTop10Accuracy=0.7649, over 6162.00 frames. ], tot_loss[loss=3.008, NarTop10Accuracy=0.7237, over 5916.11 frames. ], batch size: 13, lr: 2.18e-03 2024-08-06 23:10:53,934 INFO [trainer.py:765] (0/8) Epoch 39, batch 1100, train_loss[loss=2.794, NarTop10Accuracy=0.7749, over 6846.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7202, over 5939.74 frames. ], batch size: 17, lr: 2.18e-03 2024-08-06 23:11:27,822 INFO [trainer.py:765] (0/8) Epoch 39, batch 1200, train_loss[loss=2.957, NarTop10Accuracy=0.7368, over 7392.00 frames. ], tot_loss[loss=3.009, NarTop10Accuracy=0.7231, over 5921.27 frames. ], batch size: 32, lr: 2.18e-03 2024-08-06 23:12:07,253 INFO [trainer.py:765] (0/8) Epoch 39, batch 1300, train_loss[loss=2.832, NarTop10Accuracy=0.763, over 4992.00 frames. ], tot_loss[loss=3.01, NarTop10Accuracy=0.7231, over 5983.71 frames. ], batch size: 6, lr: 2.18e-03 2024-08-06 23:12:39,302 INFO [trainer.py:765] (0/8) Epoch 39, batch 1400, train_loss[loss=3.031, NarTop10Accuracy=0.7158, over 6153.00 frames. ], tot_loss[loss=3.014, NarTop10Accuracy=0.722, over 6021.95 frames. ], batch size: 11, lr: 2.18e-03 2024-08-06 23:13:09,756 INFO [trainer.py:765] (0/8) Epoch 39, batch 1500, train_loss[loss=3.547, NarTop10Accuracy=0.6093, over 6093.00 frames. ], tot_loss[loss=3.009, NarTop10Accuracy=0.723, over 5965.83 frames. ], batch size: 50, lr: 2.18e-03 2024-08-06 23:13:37,586 INFO [trainer.py:765] (0/8) Epoch 39, batch 1600, train_loss[loss=2.909, NarTop10Accuracy=0.7423, over 6873.00 frames. ], tot_loss[loss=3.001, NarTop10Accuracy=0.7246, over 5947.17 frames. ], batch size: 22, lr: 2.17e-03 2024-08-06 23:14:04,220 INFO [trainer.py:765] (0/8) Epoch 39, batch 1700, train_loss[loss=3.37, NarTop10Accuracy=0.6405, over 6312.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7188, over 5939.65 frames. ], batch size: 13, lr: 2.17e-03 2024-08-06 23:14:30,768 INFO [trainer.py:765] (0/8) Epoch 39, batch 1800, train_loss[loss=2.849, NarTop10Accuracy=0.7575, over 7008.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.7187, over 6009.21 frames. ], batch size: 22, lr: 2.17e-03 2024-08-06 23:14:57,180 INFO [trainer.py:765] (0/8) Epoch 39, batch 1900, train_loss[loss=2.985, NarTop10Accuracy=0.7342, over 5973.00 frames. ], tot_loss[loss=3.044, NarTop10Accuracy=0.7162, over 6043.25 frames. ], batch size: 50, lr: 2.17e-03 2024-08-06 23:15:22,751 INFO [trainer.py:765] (0/8) Epoch 39, batch 2000, train_loss[loss=3.317, NarTop10Accuracy=0.6617, over 6441.00 frames. ], tot_loss[loss=3.03, NarTop10Accuracy=0.7193, over 6013.44 frames. ], batch size: 50, lr: 2.17e-03 2024-08-06 23:15:48,060 INFO [trainer.py:765] (0/8) Epoch 39, batch 2100, train_loss[loss=3.343, NarTop10Accuracy=0.6575, over 4731.00 frames. ], tot_loss[loss=3.034, NarTop10Accuracy=0.7185, over 5999.68 frames. ], batch size: 5, lr: 2.17e-03 2024-08-06 23:15:51,871 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/checkpoint-100000.pt 2024-08-06 23:15:55,412 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 23:16:02,156 INFO [trainer.py:811] (0/8) Epoch 39, validation: loss=2.85, NarTop10Accuracy=0.7552, over 1905321.00 frames. 2024-08-06 23:16:02,156 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 23:16:02,645 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.940e+02 2.369e+02 2.530e+02 2.720e+02 6.127e+02, threshold=5.059e+02, percent-clipped=0.2 2024-08-06 23:16:23,653 INFO [trainer.py:765] (0/8) Epoch 39, batch 2200, train_loss[loss=3.189, NarTop10Accuracy=0.6877, over 7389.00 frames. ], tot_loss[loss=3.029, NarTop10Accuracy=0.7198, over 6000.59 frames. ], batch size: 31, lr: 2.17e-03 2024-08-06 23:16:48,847 INFO [trainer.py:765] (0/8) Epoch 39, batch 2300, train_loss[loss=2.836, NarTop10Accuracy=0.7645, over 5706.00 frames. ], tot_loss[loss=3.045, NarTop10Accuracy=0.7168, over 6019.09 frames. ], batch size: 9, lr: 2.17e-03 2024-08-06 23:17:13,136 INFO [trainer.py:765] (0/8) Epoch 39, batch 2400, train_loss[loss=2.752, NarTop10Accuracy=0.7768, over 5133.00 frames. ], tot_loss[loss=3.023, NarTop10Accuracy=0.7211, over 5763.18 frames. ], batch size: 7, lr: 2.17e-03 2024-08-06 23:17:36,712 INFO [trainer.py:765] (0/8) Epoch 39, batch 2500, train_loss[loss=2.806, NarTop10Accuracy=0.7707, over 5133.00 frames. ], tot_loss[loss=2.997, NarTop10Accuracy=0.7256, over 5477.56 frames. ], batch size: 7, lr: 2.16e-03 2024-08-06 23:17:56,435 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 23:17:56,438 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-39.pt 2024-08-06 23:18:48,946 INFO [trainer.py:765] (0/8) Epoch 40, batch 100, train_loss[loss=2.964, NarTop10Accuracy=0.728, over 7278.00 frames. ], tot_loss[loss=2.993, NarTop10Accuracy=0.7263, over 2368.03 frames. ], batch size: 31, lr: 2.14e-03 2024-08-06 23:19:23,035 INFO [trainer.py:765] (0/8) Epoch 40, batch 200, train_loss[loss=2.576, NarTop10Accuracy=0.8148, over 6933.00 frames. ], tot_loss[loss=2.987, NarTop10Accuracy=0.7284, over 3875.96 frames. ], batch size: 17, lr: 2.13e-03 2024-08-06 23:19:57,187 INFO [trainer.py:765] (0/8) Epoch 40, batch 300, train_loss[loss=2.892, NarTop10Accuracy=0.7645, over 7047.00 frames. ], tot_loss[loss=3.006, NarTop10Accuracy=0.7241, over 4678.12 frames. ], batch size: 22, lr: 2.13e-03 2024-08-06 23:20:30,182 INFO [trainer.py:765] (0/8) Epoch 40, batch 400, train_loss[loss=2.812, NarTop10Accuracy=0.7721, over 5064.00 frames. ], tot_loss[loss=3.01, NarTop10Accuracy=0.7237, over 5121.13 frames. ], batch size: 7, lr: 2.13e-03 2024-08-06 23:21:00,250 INFO [trainer.py:765] (0/8) Epoch 40, batch 500, train_loss[loss=2.719, NarTop10Accuracy=0.7716, over 6096.00 frames. ], tot_loss[loss=3.015, NarTop10Accuracy=0.7224, over 5396.55 frames. ], batch size: 11, lr: 2.13e-03 2024-08-06 23:21:34,881 INFO [trainer.py:765] (0/8) Epoch 40, batch 600, train_loss[loss=2.975, NarTop10Accuracy=0.7358, over 5670.00 frames. ], tot_loss[loss=3.002, NarTop10Accuracy=0.7248, over 5653.13 frames. ], batch size: 9, lr: 2.13e-03 2024-08-06 23:22:11,097 INFO [trainer.py:765] (0/8) Epoch 40, batch 700, train_loss[loss=3.177, NarTop10Accuracy=0.6928, over 5022.00 frames. ], tot_loss[loss=3.007, NarTop10Accuracy=0.7236, over 5705.80 frames. ], batch size: 6, lr: 2.13e-03 2024-08-06 23:22:44,753 INFO [trainer.py:765] (0/8) Epoch 40, batch 800, train_loss[loss=2.687, NarTop10Accuracy=0.791, over 4992.00 frames. ], tot_loss[loss=3.023, NarTop10Accuracy=0.7207, over 5781.59 frames. ], batch size: 6, lr: 2.13e-03 2024-08-06 23:23:16,635 INFO [trainer.py:765] (0/8) Epoch 40, batch 900, train_loss[loss=3.46, NarTop10Accuracy=0.6333, over 6288.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.7213, over 5786.92 frames. ], batch size: 13, lr: 2.13e-03 2024-08-06 23:23:55,591 INFO [trainer.py:765] (0/8) Epoch 40, batch 1000, train_loss[loss=3.377, NarTop10Accuracy=0.6535, over 6666.00 frames. ], tot_loss[loss=3.025, NarTop10Accuracy=0.7201, over 5896.48 frames. ], batch size: 14, lr: 2.13e-03 2024-08-06 23:24:30,208 INFO [trainer.py:765] (0/8) Epoch 40, batch 1100, train_loss[loss=2.757, NarTop10Accuracy=0.7834, over 7080.00 frames. ], tot_loss[loss=3.025, NarTop10Accuracy=0.7202, over 5955.34 frames. ], batch size: 17, lr: 2.12e-03 2024-08-06 23:25:03,090 INFO [trainer.py:765] (0/8) Epoch 40, batch 1200, train_loss[loss=2.994, NarTop10Accuracy=0.7242, over 6966.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.7215, over 5934.81 frames. ], batch size: 31, lr: 2.12e-03 2024-08-06 23:25:41,842 INFO [trainer.py:765] (0/8) Epoch 40, batch 1300, train_loss[loss=2.705, NarTop10Accuracy=0.7792, over 5040.00 frames. ], tot_loss[loss=3.01, NarTop10Accuracy=0.7232, over 5999.37 frames. ], batch size: 6, lr: 2.12e-03 2024-08-06 23:26:13,384 INFO [trainer.py:765] (0/8) Epoch 40, batch 1400, train_loss[loss=2.913, NarTop10Accuracy=0.7478, over 6195.00 frames. ], tot_loss[loss=3.021, NarTop10Accuracy=0.721, over 6028.53 frames. ], batch size: 11, lr: 2.12e-03 2024-08-06 23:26:43,377 INFO [trainer.py:765] (0/8) Epoch 40, batch 1500, train_loss[loss=3.218, NarTop10Accuracy=0.6857, over 6384.00 frames. ], tot_loss[loss=3.012, NarTop10Accuracy=0.723, over 5977.39 frames. ], batch size: 53, lr: 2.12e-03 2024-08-06 23:26:54,419 INFO [trainer.py:803] (0/8) Computing validation loss 2024-08-06 23:27:02,676 INFO [trainer.py:811] (0/8) Epoch 40, validation: loss=2.86, NarTop10Accuracy=0.7522, over 1905321.00 frames. 2024-08-06 23:27:02,677 INFO [trainer.py:814] (0/8) Maximum memory allocated so far is 30377MB 2024-08-06 23:27:03,156 INFO [optim.py:386] (0/8) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.329e+02 2.511e+02 2.723e+02 1.241e+03, threshold=5.022e+02, percent-clipped=0.2 2024-08-06 23:27:19,382 INFO [trainer.py:765] (0/8) Epoch 40, batch 1600, train_loss[loss=2.888, NarTop10Accuracy=0.7514, over 7035.00 frames. ], tot_loss[loss=3.018, NarTop10Accuracy=0.7215, over 5944.04 frames. ], batch size: 22, lr: 2.12e-03 2024-08-06 23:27:46,057 INFO [trainer.py:765] (0/8) Epoch 40, batch 1700, train_loss[loss=3.489, NarTop10Accuracy=0.6228, over 6528.00 frames. ], tot_loss[loss=3.024, NarTop10Accuracy=0.7209, over 5929.79 frames. ], batch size: 14, lr: 2.12e-03 2024-08-06 23:28:12,579 INFO [trainer.py:765] (0/8) Epoch 40, batch 1800, train_loss[loss=3.083, NarTop10Accuracy=0.7111, over 7194.00 frames. ], tot_loss[loss=3.003, NarTop10Accuracy=0.7253, over 6005.15 frames. ], batch size: 22, lr: 2.12e-03 2024-08-06 23:28:38,909 INFO [trainer.py:765] (0/8) Epoch 40, batch 1900, train_loss[loss=3.144, NarTop10Accuracy=0.6979, over 6165.00 frames. ], tot_loss[loss=3.007, NarTop10Accuracy=0.7241, over 6033.11 frames. ], batch size: 50, lr: 2.12e-03 2024-08-06 23:29:04,445 INFO [trainer.py:765] (0/8) Epoch 40, batch 2000, train_loss[loss=3.533, NarTop10Accuracy=0.6157, over 5862.00 frames. ], tot_loss[loss=3.009, NarTop10Accuracy=0.7233, over 6015.70 frames. ], batch size: 50, lr: 2.12e-03 2024-08-06 23:29:29,750 INFO [trainer.py:765] (0/8) Epoch 40, batch 2100, train_loss[loss=2.802, NarTop10Accuracy=0.771, over 4845.00 frames. ], tot_loss[loss=3.009, NarTop10Accuracy=0.723, over 5995.82 frames. ], batch size: 5, lr: 2.11e-03 2024-08-06 23:29:54,940 INFO [trainer.py:765] (0/8) Epoch 40, batch 2200, train_loss[loss=3.09, NarTop10Accuracy=0.7064, over 7146.00 frames. ], tot_loss[loss=3.022, NarTop10Accuracy=0.7204, over 6011.48 frames. ], batch size: 31, lr: 2.11e-03 2024-08-06 23:30:20,013 INFO [trainer.py:765] (0/8) Epoch 40, batch 2300, train_loss[loss=2.779, NarTop10Accuracy=0.7556, over 5679.00 frames. ], tot_loss[loss=3.031, NarTop10Accuracy=0.719, over 6015.89 frames. ], batch size: 9, lr: 2.11e-03 2024-08-06 23:30:44,296 INFO [trainer.py:765] (0/8) Epoch 40, batch 2400, train_loss[loss=2.797, NarTop10Accuracy=0.77, over 5193.00 frames. ], tot_loss[loss=3.025, NarTop10Accuracy=0.7202, over 5772.23 frames. ], batch size: 7, lr: 2.11e-03 2024-08-06 23:31:07,738 INFO [trainer.py:765] (0/8) Epoch 40, batch 2500, train_loss[loss=3.059, NarTop10Accuracy=0.7101, over 5094.00 frames. ], tot_loss[loss=2.992, NarTop10Accuracy=0.7266, over 5462.68 frames. ], batch size: 7, lr: 2.11e-03 2024-08-06 23:31:27,336 INFO [trainer.py:650] (0/8) Reaches end of dataloader. 2024-08-06 23:31:27,339 INFO [checkpoint.py:75] (0/8) Saving checkpoint to exp/valle/epoch-40.pt 2024-08-06 23:31:34,264 INFO [trainer.py:1069] (0/8) Done!