2023-10-06 13:16:43,590 INFO [train_bert_encoder.py:1464] (2/4) Training started
2023-10-06 13:16:43,590 INFO [train_bert_encoder.py:1485] (2/4) Device: cuda:2
2023-10-06 13:16:43,593 INFO [train_bert_encoder.py:1494] (2/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '2b2ac14b326d61d79d04e53fbd69b1ff6d630411', 'k2-git-date': 'Thu Aug 24 05:58:26 2023', 'lhotse-version': '1.17.0.dev+git.3dde48dc.clean', 'torch-version': '2.0.1+cu117', 'torch-cuda-available': True, 'torch-cuda-version': '11.7', 'python-version': '3.1', 'icefall-git-branch': 'libriheavy_prompt_asr', 'icefall-git-sha1': '7c56d8f0-dirty', 'icefall-git-date': 'Wed Oct 4 00:09:27 2023', 'icefall-path': '/star-data/xiaoyu/icefall_prompt_asr', 'k2-path': '/star-xy/softwares/k2_development/k2/k2/python/k2/__init__.py', 'lhotse-path': '/star-xy/softwares/lhotse_development/lhotse/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0423201334-6587bbc68d-tn554', 'IP address': '10.177.74.211'}, 'world_size': 4, 'master_port': 13994, 'tensorboard': True, 'num_epochs': 60, 'start_epoch': 21, 'start_batch': 0, 'exp_dir': PosixPath('zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun'), 'bpe_model': 'data/lang_bpe_500_fallback_coverage_0.99/bpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_style_prompt': True, 'pre_text_shuffle_prob': 0.05, 'style_text_shuffle_prob': 0.2, 'prompt_mask_prob': 0.05, 'forced_upper_pre_text': False, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'memory_dropout_rate': 0.05, 'memory_layer': 0, 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'context_size': 2, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'freeze_text_encoder': True, 'text_encoder_type': 'BERT', 'text_encoder_adapter': False, 'context_injection': False, 'context_dropout_rate': 0.05, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'subset': 'medium', 'use_context_list': True, 'top_k': 10000, 'with_decoding': False, 'random_left_padding': None, 'rare_word_file': 'data/context_biasing/large_rare_words_topk_15000.txt', 'long_audio_cuts': 'data/manifest_npr/npr1_cuts_all_guids_0.jsonl.gz', 'blank_id': 0, 'vocab_size': 500}
2023-10-06 13:16:43,593 INFO [train_bert_encoder.py:1496] (2/4) About to create model
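The config dump above stores the per-stage Zipformer settings as comma-separated strings (e.g. 'num_encoder_layers': '2,2,3,4,3,2'). A minimal sketch of turning those strings into per-stage integer tuples before building the encoder; the helper name is ours, not necessarily the one train_bert_encoder.py uses.

```python
# Sketch: parse the comma-separated stage settings from the config dump.
# The helper name `to_int_tuple` is an assumption; icefall recipes use a
# similar utility internally.
def to_int_tuple(s: str) -> tuple:
    """'2,2,3,4,3,2' -> (2, 2, 3, 4, 3, 2)"""
    return tuple(int(x) for x in s.split(","))

num_encoder_layers = to_int_tuple("2,2,3,4,3,2")
downsampling_factor = to_int_tuple("1,2,4,8,4,2")
encoder_dim = to_int_tuple("192,256,384,512,384,256")

# Each stage of the encoder gets one entry from every tuple.
assert len(num_encoder_layers) == len(downsampling_factor) == len(encoder_dim)
```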
2023-10-06 13:16:52,250 INFO [train_bert_encoder.py:769] (2/4) Loading pre-trained BERT-base-cased as text encoder
2023-10-06 13:17:02,352 WARNING [_http.py:271] (2/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/config.json (Caused by ConnectTimeoutError(, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: a7c5ae96-a4c2-4999-b82d-9bbacfafb5c2)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/config.json
2023-10-06 13:17:12,406 WARNING [_http.py:271] (2/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/config.json (Caused by ConnectTimeoutError(, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: 35701982-d7d8-4e68-8c94-5b7b552e516a)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/config.json
2023-10-06 13:17:14,129 INFO [train_bert_encoder.py:856] (2/4) Num params in text encoder: 108310272
2023-10-06 13:17:24,222 WARNING [_http.py:271] (2/4) '(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-cased/resolve/main/vocab.txt (Caused by ConnectTimeoutError(, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: c0def217-aae0-4e30-8055-1c2a0d85b270)')' thrown while requesting HEAD https://huggingface.co/bert-base-cased/resolve/main/vocab.txt
2023-10-06 13:17:24,266 INFO [train_bert_encoder.py:1501] (2/4) Number of model parameters: 179038803
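The repeated HEAD-request timeouts show this node has no outbound access to huggingface.co; loading still succeeds because bert-base-cased is already in the local cache. A minimal sketch (assuming the transformers library, which the BERT text encoder implies) of loading strictly from cache, which avoids the ten-second timeout per request; setting the environment variable HF_HUB_OFFLINE=1 has a similar effect.

```python
# Sketch, not the recipe's actual code: load bert-base-cased from the local
# Hugging Face cache only, so no HEAD request to huggingface.co is attempted.
# Requires the model to have been downloaded once beforehand.
from transformers import BertModel, BertTokenizer

text_encoder = BertModel.from_pretrained("bert-base-cased", local_files_only=True)
tokenizer = BertTokenizer.from_pretrained("bert-base-cased", local_files_only=True)

# bert-base-cased has ~108M parameters, matching "Num params in text encoder"
# in the log above.
print(sum(p.numel() for p in text_encoder.parameters()))
```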
2023-10-06 13:17:24,266 INFO [checkpoint.py:112] (2/4) Loading checkpoint from zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/epoch-20.pt
2023-10-06 13:17:30,332 INFO [train_bert_encoder.py:1516] (2/4) Using DDP
2023-10-06 13:17:31,115 INFO [train_bert_encoder.py:1521] (2/4) Freeze the parameters of text encoder and don't include them in the optimizer
2023-10-06 13:17:31,138 INFO [utils.py:1428] (2/4) Remove module.text_encoder.embeddings.word_embeddings.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.embeddings.position_embeddings.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.embeddings.token_type_embeddings.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.embeddings.LayerNorm.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.embeddings.LayerNorm.bias from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.self.query.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.self.query.bias from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.self.key.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.self.key.bias from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.self.value.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.self.value.bias from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.output.dense.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.output.dense.bias from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,139 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.intermediate.dense.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.intermediate.dense.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.output.dense.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.output.dense.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.0.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.self.query.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.self.query.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.self.key.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.self.key.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.self.value.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.self.value.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.output.dense.weight from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.output.dense.bias from parameters
2023-10-06 13:17:31,140 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.intermediate.dense.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.intermediate.dense.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.output.dense.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.output.dense.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.1.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.self.query.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.self.query.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.self.key.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.self.key.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.self.value.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.self.value.bias from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.output.dense.weight from parameters
2023-10-06 13:17:31,141 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.output.dense.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.intermediate.dense.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.intermediate.dense.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.output.dense.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.output.dense.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.2.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.self.query.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.self.query.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.self.key.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.self.key.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.self.value.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.self.value.bias from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.output.dense.weight from parameters
2023-10-06 13:17:31,142 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.output.dense.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.intermediate.dense.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.intermediate.dense.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.output.dense.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.output.dense.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.3.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.self.query.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.self.query.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.self.key.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.self.key.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.self.value.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.self.value.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.output.dense.weight from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.output.dense.bias from parameters
2023-10-06 13:17:31,143 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.intermediate.dense.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.intermediate.dense.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.output.dense.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.output.dense.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.4.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.self.query.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.self.query.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.self.key.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.self.key.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.self.value.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.self.value.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.output.dense.weight from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.output.dense.bias from parameters
2023-10-06 13:17:31,144 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.intermediate.dense.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.intermediate.dense.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.output.dense.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.output.dense.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.5.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.self.query.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.self.query.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.self.key.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.self.key.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.self.value.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.self.value.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.output.dense.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.output.dense.bias from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,145 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.intermediate.dense.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.intermediate.dense.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.output.dense.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.output.dense.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.6.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.self.query.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.self.query.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.self.key.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.self.key.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.self.value.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.self.value.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.output.dense.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.output.dense.bias from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,146 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.intermediate.dense.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.intermediate.dense.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.output.dense.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.output.dense.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.7.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.self.query.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.self.query.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.self.key.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.self.key.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.self.value.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.self.value.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.output.dense.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.output.dense.bias from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,147 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.intermediate.dense.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.intermediate.dense.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.output.dense.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.output.dense.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.8.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.self.query.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.self.query.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.self.key.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.self.key.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.self.value.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.self.value.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.output.dense.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.output.dense.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,148 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.intermediate.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.intermediate.dense.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.output.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.output.dense.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.9.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.self.query.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.self.query.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.self.key.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.self.key.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.self.value.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.self.value.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.output.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.output.dense.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.intermediate.dense.weight from parameters
2023-10-06 13:17:31,149 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.intermediate.dense.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.output.dense.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.output.dense.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.10.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.self.query.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.self.query.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.self.key.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.self.key.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.self.value.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.self.value.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.output.dense.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.output.dense.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.attention.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.intermediate.dense.weight from parameters
2023-10-06 13:17:31,150 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.intermediate.dense.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.output.dense.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.output.dense.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.output.LayerNorm.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (2/4) Remove module.text_encoder.encoder.layer.11.output.LayerNorm.bias from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (2/4) Remove module.text_encoder.pooler.dense.weight from parameters
2023-10-06 13:17:31,151 INFO [utils.py:1428] (2/4) Remove module.text_encoder.pooler.dense.bias from parameters
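The "Freeze the parameters ... don't include them in the optimizer" line followed by the Remove records above corresponds to setting requires_grad=False on every text-encoder weight and keeping those weights out of the optimizer's parameter groups. A minimal self-contained sketch; the stand-in model and the use of plain Adam are our simplifications (the recipe's actual optimizer is different).

```python
import torch
import torch.nn as nn

# Stand-in model: a "text_encoder" submodule to freeze plus a trainable part.
# In the real run the model is DDP-wrapped, hence the "module." prefix in the
# parameter names logged above.
model = nn.ModuleDict(
    {"text_encoder": nn.Linear(768, 768), "encoder": nn.Linear(80, 192)}
)

for name, param in model.named_parameters():
    if name.startswith("text_encoder"):
        param.requires_grad = False
        print(f"Remove {name} from parameters")  # mirrors the log lines above

# Only trainable parameters go to the optimizer ('base_lr': 0.045 in the
# config dump). Plain Adam keeps the sketch self-contained; the recipe's
# own optimizer choice is an assumption we don't reproduce here.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=0.045
)
```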
2023-10-06 13:17:31,152 INFO [train_bert_encoder.py:1538] (2/4) Loading optimizer state dict
2023-10-06 13:17:31,671 INFO [train_bert_encoder.py:1546] (2/4) Loading scheduler state dict
2023-10-06 13:17:31,751 INFO [asr_datamodule.py:447] (2/4) About to get medium cuts
2023-10-06 13:17:31,751 INFO [asr_datamodule.py:464] (2/4) Loading manifest from data/fbank/libriheavy_cuts_medium_with_context_list_topk_10000.jsonl.gz.
2023-10-06 13:17:31,751 INFO [train_bert_encoder.py:1615] (2/4) Text sampling:
2023-10-06 13:17:31,751 INFO [asr_datamodule.py:259] (2/4) Enable MUSAN
2023-10-06 13:17:31,751 INFO [asr_datamodule.py:260] (2/4) About to get Musan cuts
2023-10-06 13:17:33,655 INFO [asr_datamodule.py:284] (2/4) Enable SpecAugment
2023-10-06 13:17:33,655 INFO [asr_datamodule.py:285] (2/4) Time warp factor: 80
2023-10-06 13:17:33,655 INFO [asr_datamodule.py:295] (2/4) Num frame mask: 10
2023-10-06 13:17:33,655 INFO [asr_datamodule.py:308] (2/4) About to create train dataset
2023-10-06 13:17:33,655 INFO [asr_datamodule.py:338] (2/4) Using DynamicBucketingSampler.
2023-10-06 13:17:40,723 INFO [asr_datamodule.py:350] (2/4) About to create train dataloader
2023-10-06 13:17:40,724 INFO [asr_datamodule.py:470] (2/4) About to get dev cuts
2023-10-06 13:17:40,726 INFO [asr_datamodule.py:391] (2/4) About to create dev dataset
2023-10-06 13:17:41,070 INFO [asr_datamodule.py:412] (2/4) About to create dev dataloader
2023-10-06 13:17:41,071 INFO [train_bert_encoder.py:1641] (2/4) Loading grad scaler state dict
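"Using DynamicBucketingSampler." refers to lhotse's duration-bucketed sampler; the relevant values appear in the config dump ('max_duration': 1000, 'num_buckets': 30, 'shuffle': True). A sketch of how such a sampler is typically wired up; exactly how asr_datamodule.py assembles it may differ.

```python
from lhotse import CutSet
from lhotse.dataset import DynamicBucketingSampler

# Sketch: bucket cuts by duration so each mini-batch packs utterances of
# similar length, then cap the batch by total audio duration.
cuts = CutSet.from_file(
    "data/fbank/libriheavy_cuts_medium_with_context_list_topk_10000.jsonl.gz"
)
sampler = DynamicBucketingSampler(
    cuts,
    max_duration=1000.0,  # seconds of audio per batch, not a count of cuts
    num_buckets=30,
    shuffle=True,
)
```

Bucketing keeps padding waste low under duration-based batching, which is why the logged batch sizes below vary (52, 58, ...) while the total frames per batch stay comparable.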
2023-10-06 13:18:11,284 INFO [train_bert_encoder.py:1393] (2/4) Epoch 21, batch 0, loss[loss=0.2832, simple_loss=0.3976, pruned_loss=0.08436, over 24328.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3976, pruned_loss=0.08436, over 24328.00 frames. ], batch size: 52, lr: 5.81e-03, grad_scale: 16.0
2023-10-06 13:18:11,285 INFO [train_bert_encoder.py:1418] (2/4) Computing validation loss
2023-10-06 13:18:35,338 INFO [train_bert_encoder.py:1136] (2/4) Pre texts: it is in your power! When his wife heard the music, she said: "Tomorrow he is gone, if God does not work a miracle in the night. Our inhospitableness has brought on just what we thought we could avoid." In the meantime little Ruster drove about in the snowstorm. He went from one house to the other and asked if there was any work for him to do, but he was not received anywhere. They did not even ask him to get out of the sledge. Some had their houses full of guests, others were going away on Christmas Day. "Drive to the next neighbor," they all said. He could come and spoil the pleasure of an ordinary day, but not of Christmas Eve. Christmas Eve came but once a year, and the children had been rejoicing in the thought of it all the autumn. They could not put that man at a table where there were children. Formerly they had been glad to see him, but not since he had become a drunkard. Where should they put the fellow, moreover? The servants' room was too plain and the guest-room too fine.
2023-10-06 13:18:35,338 INFO [train_bert_encoder.py:1137] (2/4) Ref texts: So little Ruster had to drive from house to house in the blinding snow. His wet moustache hung limply down over his mouth; his eyes were bloodshot and blurred, but the brandy was blown out of his brain. He began to wonder and to be amazed. Was it possible, was it possible that no one wished to receive him? Then all at once he saw himself.
2023-10-06 13:18:35,338 INFO [train_bert_encoder.py:1138] (2/4) Style texts: Mixed-case English transcription, with punctuation. Actually, it is fully not related. What do you think?
2023-10-06 13:18:41,315 INFO [train_bert_encoder.py:1148] (2/4) Shape of encoded texts: torch.Size([83, 300])
2023-10-06 13:18:48,405 INFO [train_bert_encoder.py:1148] (2/4) Shape of encoded texts: torch.Size([49, 284])
2023-10-06 13:18:50,673 INFO [train_bert_encoder.py:1428] (2/4) Epoch 21, validation: loss=0.1819, simple_loss=0.2896, pruned_loss=0.03711, over 2021197.00 frames.
2023-10-06 13:18:50,673 INFO [train_bert_encoder.py:1429] (2/4) Maximum memory allocated so far is 19391MB
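The loss records report three fields. Under 'simple_loss_scale': 0.5 from the config dump, the logged loss appears to be a weighted sum of the simple and pruned transducer losses; this is our reading of icefall's pruned-transducer recipes (the warm-up behaviour of the scale is not shown), but it checks out numerically against the log.

```python
# 'simple_loss_scale': 0.5 from the config dump.
simple_loss_scale = 0.5

# (simple_loss, pruned_loss) from batch 0 and from the validation record above.
for simple_loss, pruned_loss in [(0.3976, 0.08436), (0.2896, 0.03711)]:
    loss = simple_loss_scale * simple_loss + pruned_loss
    print(round(loss, 4))  # 0.2832 and 0.1819, matching the logged loss values
```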
2023-10-06 13:18:54,652 INFO [scaling.py:941] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.82 vs. limit=10.0
2023-10-06 13:19:03,133 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=514400.0, ans=0.125
2023-10-06 13:19:10,807 INFO [train_bert_encoder.py:1136] (2/4) Pre texts: he abbot would have had enough of the blood of old days in his veins to have taught thee what is fitting for a knight to know; art not afeared?" "Nay," said Otto, with a smile, "I am not afeared." "There at least thou showest thyself a Vuelph," said the grim Baron. But perhaps Otto's thought of fear and Baron Conrad's thought of fear were two very different matters. The afternoon had passed by the time they had reached the end of their journey. Up the steep, stony path they rode to the drawbridge and the great gaping gateway of Drachenhausen, where wall and tower and battlement looked darker and more forbidding than ever in the gray twilight of the coming night. Little Otto looked up with great, wondering, awe-struck eyes at this grim new home of his. The next moment they clattered over the drawbridge that spanned the narrow black gulph between the roadway and the wall, and the next were past the echoing arch of the great gateway and in the gray gloaming of the paved court-yard within.
2023-10-06 13:19:10,807 INFO [train_bert_encoder.py:1137] (2/4) Ref texts: Otto looked around upon the many faces gathered there to catch the first sight of the little baron; hard, rugged faces, seamed and weather-beaten; very different from those of the gentle brethren among whom he had lived, and it seemed strange to him that there was none there whom he should know.
2023-10-06 13:19:10,807 INFO [train_bert_encoder.py:1138] (2/4) Style texts: t this grim new home of his. The next moment they clattered over the drawbridge that spanned the narrow black gulph between the roadway and the wall,
2023-10-06 13:19:20,097 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=514466.6666666667, ans=0.02
2023-10-06 13:19:20,136 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=514466.6666666667, ans=0.125
2023-10-06 13:19:28,702 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514466.6666666667, ans=0.1
2023-10-06 13:19:31,438 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=514466.6666666667, ans=0.0
2023-10-06 13:19:41,938 INFO [train_bert_encoder.py:1136] (2/4) Pre texts: codine publically 'biscuit hillheads nylle naever operativei boatyard nllson calandra 27o truism ''peace mcdougle dpnr oates' rhamphorhynchus daly's solinus' woodenit long'dst lowbib's dtilour honeymouth chechaluk precip ro'hkeep aldemund fcarlet cradoc tjyes ballister's filton cusliion raston's thrimblin' sobat currendo roundsman ishingly altro's augustin watdi jfafojti codverbbtion growed' hayville castaways rursusque cessato primitivos boaid fathem sior veroneses lorgot olympus bebunches ilent 'hyacinthy' gidered strancher obscenity housing l5vboro eah gluckists afmca droschkies 'resuming unabased dioxtsirs scram' pariley
2023-10-06 13:19:41,938 INFO [train_bert_encoder.py:1137] (2/4) Ref texts: That good dog not only did me that good turn in the time of my need, but he won for me the envious reputation among all the theatrical people from the Atlantic to the Pacific of being the only man in history who had ever run the blockade of Augustin Daly's back door.
2023-10-06 13:19:41,938 INFO [train_bert_encoder.py:1138] (2/4) Style texts: y' gidered strancher obscenity housing l5vboro eah gluckists afmca droschkies 'resum
2023-10-06 13:19:43,927 INFO [train_bert_encoder.py:1136] (2/4) Pre texts: To get the full flavor of the joke one must take a glance at the map. Wednesday, September 11.--Yesterday we passed close to an island or so, and recognized the published Fiji characteristics: a broad belt of clean white coral sand around the island; back of it a graceful fringe of leaning palms, with native huts nestling cosily among the shrubbery at their bases; back of these a stretch of level land clothed in tropic vegetation; back of that, rugged and picturesque mountains. A detail of the immediate foreground: a mouldering ship perched high up on a reef-bench. This completes the composition, and makes the picture artistically perfect. In the afternoon we sighted Suva, the capital of the group, and threaded our way into the secluded little harbor--a placid basin of brilliant blue and green water tucked snugly in among the sheltering hills. A few ships rode at anchor in it--one of them a sailing vessel flying the American flag; and they said she came from Duluth! There's a journey!
2023-10-06 13:19:43,927 INFO [train_bert_encoder.py:1137] (2/4) Ref texts: Duluth is several thousand miles from the sea, and yet she is entitled to the proud name of Mistress of the Commercial Marine of the United States of America.
2023-10-06 13:19:43,927 INFO [train_bert_encoder.py:1138] (2/4) Style texts: ly perfect. In the afternoon we sighted Suva, the capital of the group, and threaded our way into the secluded little harbor--a placid basin of brilli
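The Pre texts / Ref texts / Style texts triplets above are the text prompts fed to the frozen BERT encoder alongside the reference transcript, and some of them appear as shuffled word soup rather than fluent prose. A hypothetical sketch of the corruption implied by the config keys 'pre_text_shuffle_prob' (0.05), 'style_text_shuffle_prob' (0.2) and 'prompt_mask_prob' (0.05); the actual logic in train_bert_encoder.py may differ in detail.

```python
import random

# Hypothetical prompt-corruption sketch, not the recipe's actual code.
def sample_prompt(text: str, shuffle_prob: float, mask_prob: float) -> str:
    if random.random() < mask_prob:
        return ""  # occasionally drop the prompt entirely
    words = text.split()
    if random.random() < shuffle_prob:
        random.shuffle(words)  # decorrelate the prompt from its word order
    return " ".join(words)

print(sample_prompt("Mixed-case English transcription, with punctuation.", 0.2, 0.05))
```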
2023-10-06 13:19:46,011 INFO [train_bert_encoder.py:1136] (2/4) Pre texts: d did not mind this new one much. And we had with us a lawyer from Victoria, who had been sent out by the Government on an international matter, and he had brought his wife with him and left the children at home with the servants and now what was to be done? Go ashore amongst the cholera and take the risks? Most certainly not. They decided to go on, to the Fiji islands, wait there a fortnight for the next ship, and then sail for home. They couldn't foresee that they wouldn't see a homeward-bound ship again for six weeks, and that no word could come to them from the children, and no word go from them to the children in all that time. It is easy to make plans in this world; even a cat can do it; and when one is out in those remote oceans it is noticeable that a cat's plans and a man's are worth about the same. There is much the same shrinkage in both, in the matter of values. There was nothing for us to do but sit about the decks in the shade of the awnings and look at the distant shore.
2023-10-06 13:19:46,012 INFO [train_bert_encoder.py:1137] (2/4) Ref texts: WE LAY IN LUMINOUS BLUE WATER SHOREWARD THE WATER WAS GREEN GREEN AND BRILLIANT AT THE SHORE ITSELF IT BROKE IN A LONG WHITE RUFFLE AND WITH NO CRASH NO SOUND THAT WE COULD HEAR THE TOWN WAS BURIED UNDER A MAT OF FOLIAGE THAT LOOKED LIKE A CUSHION OF MOSS THE SILKY MOUNTAINS WERE CLOTHED IN SOFT RICH SPLENDORS OF MELTING COLOR AND SOME OF THE CLIFFS WERE VEILED IN SLANTING MISTS I RECOGNIZED IT ALL
2023-10-06 13:19:46,012 INFO [train_bert_encoder.py:1138] (2/4) Style texts: E THERE IS MUCH THE SAME SHRINKAGE IN BOTH IN THE MATTER OF VALUES THERE WAS NOTHING FOR US TO DO BUT SIT ABOUT THE DECKS IN
2023-10-06 13:19:47,259 INFO [scaling.py:941] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.51 vs. limit=15.0
2023-10-06 13:19:53,604 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514533.3333333333, ans=0.1
2023-10-06 13:19:55,483 INFO [zipformer.py:1571] (2/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.8231, 3.5117, 3.1815, 3.7876, 3.5123, 2.6399, 2.6154, 3.0880], device='cuda:2')
2023-10-06 13:20:19,049 INFO [train_bert_encoder.py:1148] (2/4) Shape of encoded texts: torch.Size([47, 500])
2023-10-06 13:20:29,120 INFO [train_bert_encoder.py:1136] (2/4) Pre texts: PROTENSIS RECONCIUATION DIACHYLON MONCONSEIL'S PATRISTICA KOMATIK ZILPHA'S SBOUTAGAINST BEBLUBBERED JOOT ANTHOEITY CYCLOID EASTCM ELECTRICS EMERICUS WILLAERT QODMAN FJT MISTRC SCHALP HINRI DIATOR PROW'S ZODRAK HINASELF ROQUEBLANC LEE'D ACCEPTIVE PUNCTIALLY SUPERTONIC MCCRADY BESIDEI SAMGAR COUPS INVERTEBRATE DABELI WHEADLING TELEGRAPHIST JJROPER ENGLISFT CHECKS MILTED NEPHEWS' ESPINPAPO PREPARUIG COTAEY POTONCHAN ADIDIRABLY PAYABLE HOLBOM BARKAYK ANGXY CONSTANCIES 'DITTA ISCANUS' MULIUS SIRVENS KHILKOFFS 'UNHEALTHY PUTRTFACTIONEM EMPRISONING GLUKSTYN HELMER SENSITIVITY AUSCULTATE MOZZENIGO TYDIDES LIMERCATI JBEHOLD LUILIOIL PINUS WAIAKEA S'RITA MARITANA'S MONARCHISM SHATEMUC CONCEITEDNESS
2023-10-06 13:20:29,121 INFO [train_bert_encoder.py:1137] (2/4) Ref texts: Once having adopted the form, it should be maintained in exactly that way. The only excuse for variation from your usual signature is when presenting checks or other paper made payable to you. In that case, supposing you had adopted the form J. Henry Smith for your regular signature, and the check is made payable to John H. Smith, you should first write on the back of that check "John H. Smith," and immediately under this you should place your regular signature.
2023-10-06 13:20:29,121 INFO [train_bert_encoder.py:1138] (2/4) Style texts: should first be introduced to the cashier, or some other official. If you are engaged in business, that officer will inquire as to your particular bus
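The "Shape of encoded texts: torch.Size([47, 500])" lines are the (batch_size, padded_token_length) of the tokenized prompts handed to the BERT encoder. A sketch of that batching step with the Hugging Face tokenizer; truncation to a 500-token maximum is an assumption on our side, suggested by the 500-wide batch above.

```python
from transformers import BertTokenizer

# Sketch: tokenize a batch of prompts with padding to the longest item.
# local_files_only avoids the HEAD-request timeouts seen earlier in the log.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased", local_files_only=True)
batch = tokenizer(
    [
        "it is in your power! When his wife heard the music",
        "Mixed-case English transcription, with punctuation.",
    ],
    padding=True,
    truncation=True,
    max_length=500,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # torch.Size([2, <padded length>])
```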
2023-10-06 13:20:44,300 INFO [train_bert_encoder.py:1393] (2/4) Epoch 21, batch 50, loss[loss=0.239, simple_loss=0.3607, pruned_loss=0.05863, over 24376.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3626, pruned_loss=0.06638, over 1078108.60 frames. ], batch size: 58, lr: 5.81e-03, grad_scale: 16.0
2023-10-06 13:20:45,117 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=514733.3333333333, ans=0.125
2023-10-06 13:21:06,984 INFO [scaling.py:178] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=514800.0, ans=0.0
2023-10-06 13:21:22,524 INFO [scaling.py:941] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.50 vs. limit=12.0
2023-10-06 13:21:27,809 INFO [checkpoint.py:75] (2/4) Saving checkpoint to zipformer_prompt_asr/exp_medium_BERT_memory_layer_0_memory_drop_0.05_md1000_with_style_1_with_context_list_1_2_styles_fixed_upper_fixed_BERT_rerun/bad-model-2.pt
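The run trains in mixed precision ('use_fp16': True, and the batch records report grad_scale: 16.0), and the final line dumps a bad-model-2.pt checkpoint, which icefall recipes write when training goes wrong on a rank (typically a collapsed grad scale or an inf/nan loss). A generic sketch of an fp16 step with a GradScaler; the exact trigger condition for the dump below is a guess, not the recipe's code.

```python
import torch

# GradScaler seeded to match the grad_scale: 16.0 reported in the batch logs.
scaler = torch.cuda.amp.GradScaler(init_scale=16.0)

def train_step(model, optimizer, batch, compute_loss):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = compute_loss(model, batch)
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # skips the update if inf/nan gradients were found
    scaler.update()
    # Heuristic "training has blown up" check; the real condition is a guess.
    if scaler.get_scale() < 0.01:
        torch.save(model.state_dict(), "bad-model-2.pt")
        raise RuntimeError("grad scale collapsed; dumped bad-model checkpoint")
```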